diskover is a file system crawler that indexes your file metadata in Elasticsearch and visualizes your disk usage in Kibana. It crawls and indexes files on a local computer or on a remote server mounted over NFS or CIFS.
File metadata is bulk added and streamed into Elasticsearch, allowing you to search and visualize your files in Kibana without having to wait until the crawl is finished. diskover is written in Python and runs on Linux, OS X/macOS and Windows.
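As an illustration of the streaming idea described above, here is a minimal sketch, not diskover's actual code. The function names, the document field names (`filename`, `path_parent`, `filesize`, `last_modified`) and the default URL are assumptions made for this example. It relies on `helpers.streaming_bulk` from the Python elasticsearch client, which sends documents in chunks while the generator is still walking the tree.

```python
import os


def iter_file_actions(root, index_name):
    """Yield one bulk-index action per non-directory entry under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            yield {
                "_index": index_name,
                # Note: Elasticsearch 5.x also expects a "_type" key here.
                "_source": {
                    "filename": name,          # hypothetical field names
                    "path_parent": dirpath,
                    "filesize": st.st_size,
                    "last_modified": st.st_mtime,
                },
            }


def index_tree(root, index_name, es_url="http://localhost:9200"):
    """Stream actions to Elasticsearch; returns the number of indexed docs."""
    # Imported here so iter_file_actions works without the client installed.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch([es_url])
    indexed = 0
    actions = iter_file_actions(root, index_name)
    # Chunks are sent while the walk is still in progress, so documents
    # become searchable before the crawl has finished.
    for ok, _result in helpers.streaming_bulk(es, actions, chunk_size=500):
        if ok:
            indexed += 1
    return indexed
```

Because the actions come from a generator, memory use stays roughly constant regardless of how many files are crawled.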
diskover aims to help you manage your storage by identifying old and unused files and by giving better insight into file duplication and wasted space.
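To make the duplication idea concrete, the sketch below groups files by size and then by content hash. It is a generic technique shown for illustration, not necessarily how diskover detects duplicates; `file_sha256` and `find_duplicates` are hypothetical helper names.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never read into memory whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[(size, file_sha256(path))].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Comparing sizes first avoids hashing files that cannot possibly match, which matters on large trees.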
- Kibana dashboards, saved searches and visualizations (included in the diskover download)
- diskover-web (diskover's web file manager)
- Gource visualization support (see videos below)
- Linux, OS X/macOS or Windows (tested on OS X 10.11.6, Ubuntu 16.04 and Windows 7)
- Python 2.7 or Python 3.5 (tested on Python 2.7.10, 2.7.12, 3.5.3)
- Python elasticsearch client module: elasticsearch (tested on 5.3.0, 5.4.0)
- Python requests module: requests
- Python scandir module: scandir (included in Python 3.5)
- Elasticsearch (local or AWS ES Service, tested on Elasticsearch 5.3.0, 5.4.2)
- Kibana (tested on Kibana 5.3.0, 5.4.2)
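A quick way to check the Python module requirements before running a crawl is sketched below. This is an illustrative helper, not part of diskover; `missing_modules` and `required_for_this_python` are hypothetical names, and the module list is taken from the requirements above.

```python
import importlib
import sys

REQUIRED_MODULES = ["elasticsearch", "requests", "scandir"]


def required_for_this_python():
    """scandir ships with Python 3.5 as os.scandir, so skip it there."""
    if sys.version_info >= (3, 5):
        return [m for m in REQUIRED_MODULES if m != "scandir"]
    return list(REQUIRED_MODULES)


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(required_for_this_python())
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All required Python modules are importable.")
```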
- diskover-web (diskover's web panel for searching/tagging files)
- X-Pack (for graphs, reports, monitoring and http auth)
- Gource (for Gource visualizations of diskover Elasticsearch data)
    $ git clone https://github.com/shirosaidev/diskover.git
    $ cd diskover

You need at least Python 2.7 or Python 3.5, with the required Python dependencies installed using pip.

    $ sudo pip install -r requirements.txt

Start diskover as the root user with:

    $ cd /path/you/want/to/crawl
    $ sudo python /path/to/diskover.py

For Windows, run a Cygwin terminal as administrator and then run diskover.
By default, a crawl with no flags indexes only files that are 5 MB or larger and were last modified 30 or more days ago. Use -h to see the CLI options.
A successful crawl should look like this:
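The default thresholds amount to a simple predicate on file size and modification time. The sketch below shows one way to express it; it is not diskover's implementation. The name `meets_defaults` is hypothetical, and treating 1 MB as 2**20 bytes is an assumption.

```python
import time

MIN_SIZE_BYTES = 5 * 1024 * 1024  # assumes "MB" means 2**20 bytes
MIN_AGE_DAYS = 30
SECONDS_PER_DAY = 86400.0


def meets_defaults(size_bytes, mtime, now=None):
    """True if a file is at least 5 MB and was modified 30+ days ago.

    mtime and now are Unix timestamps in seconds.
    """
    if now is None:
        now = time.time()
    age_days = (now - mtime) / SECONDS_PER_DAY
    return size_bytes >= MIN_SIZE_BYTES and age_days >= MIN_AGE_DAYS
```

Passing `now` explicitly keeps the check deterministic, which is convenient when testing or when one timestamp should apply to an entire crawl.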
       ___       ___       ___       ___       ___       ___       ___       ___
      /\  \     /\  \     /\  \     /\__\     /\  \     /\__\     /\  \     /\  \
     /::\  \   _\:\  \   /::\  \   /:/ _/_   /::\  \   /:/ _/_   /::\  \   /::\  \
    /:/\:\__\ /\/::\__\ /\:\:\__\ /::-"\__\ /:/\:\__\ |::L/\__\ /::\:\__\ /::\:\__\
    \:\/:/  / \::/\/__/ \:\:\/__/ \;:;-",-" \:\/:/  / |::::/  / \:\:\/  / \;:::/  /
     \::/  /   \:\__\    \::/  /   |:|  |    \::/  /   L;/__/    \:\/  /   |:\/__/
      \/__/     \/__/     \/__/     \|__|     \/__/     v1.0.12   \/__/     \|__|
                                          https://github.com/shirosaidev/diskover

    2017-05-17 21:17:09,254 [INFO][diskover] Connecting to Elasticsearch
    2017-05-17 21:17:09,260 [INFO][diskover] Checking for ES index: diskover-2017.04.22
    2017-05-17 21:17:09,262 [WARNING][diskover] ES index exists, deleting
    2017-05-17 21:17:09,340 [INFO][diskover] Creating ES index
    Crawling: [100%] |########################################| 8570/8570
    2017-05-17 21:17:16,972 [INFO][diskover] Finished crawling
    2017-05-17 21:17:16,973 [INFO][diskover] Directories Crawled: 8570
    2017-05-17 21:17:16,973 [INFO][diskover] Files Indexed: 322
    2017-05-17 21:17:16,973 [INFO][diskover] Elapsed time: 7.72081303596

Read the wiki for more documentation on how to use diskover.
For discussions or questions about diskover, please ask on the Google Group.
To report bugs in diskover, please use the issues page.
See the license file.
