diskover - File system crawler and disk space analyzer using Elasticsearch and Kibana

diskover

diskover is a file system crawler that indexes your file metadata in Elasticsearch and visualizes your disk usage in Kibana. It crawls and indexes files on a local computer or on a remote server mounted over NFS or CIFS.

File metadata is bulk added and streamed into Elasticsearch, allowing you to search and visualize your files in Kibana without having to wait until the crawl is finished. diskover is written in Python and runs on Linux, OS X/macOS and Windows.
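The streaming behaviour described above can be sketched roughly as follows. This is an illustration only, not diskover's actual code: the helper names `iter_file_docs` and `batched`, the document fields, and the index and type names are all assumptions made for the example. The idea is that metadata is produced lazily during the walk and grouped into small batches, so each batch can be sent as soon as it is full.

```python
import os
from itertools import islice


def iter_file_docs(top, index="diskover-example"):
    """Yield one bulk-style action dict per non-directory entry under `top`."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            yield {
                "_index": index,   # hypothetical index name
                "_type": "file",   # hypothetical type name (Elasticsearch 5.x expects a type)
                "_source": {       # hypothetical field names
                    "path": path,
                    "filesize": st.st_size,
                    "last_modified": st.st_mtime,
                },
            }


def batched(actions, size=500):
    """Group an iterator of actions into lists of at most `size` items."""
    actions = iter(actions)
    while True:
        batch = list(islice(actions, size))
        if not batch:
            return
        yield batch
```

Each batch could then be passed to `elasticsearch.helpers.bulk(es, batch)`, where `es` is an `Elasticsearch` client instance, which is what would let documents show up in Kibana before the walk has finished.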

diskover aims to help you manage your storage by identifying old and unused files and by giving better insight into file duplication and wasted space.

Screenshots

  • Kibana dashboards / saved searches and visualizations (included in diskover download)
  • diskover-web (diskover's web file manager)
  • Gource visualization support (see videos below)

diskover Gource videos

Installation Guide

Requirements

  • Linux, OS X/macOS or Windows (tested on OS X 10.11.6, Ubuntu 16.04 and Windows 7)
  • Python 2.7 or Python 3.5 (tested on Python 2.7.10, 2.7.12, 3.5.3)
  • Python elasticsearch client module, elasticsearch (tested on 5.3.0, 5.4.0)
  • Python requests module, requests
  • Python scandir module, scandir (included in Python 3.5 as os.scandir)
  • Elasticsearch (local or AWS ES Service, tested on Elasticsearch 5.3.0, 5.4.2)
  • Kibana (tested on Kibana 5.3.0, 5.4.2)
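Because scandir ships with Python 3.5 but is a separate module on Python 2.7, code that targets both versions commonly uses an import fallback. The snippet below is a minimal sketch of that pattern and is not taken from diskover's source; `list_files` is a hypothetical helper added only to show the call.

```python
try:
    from os import scandir        # standard library on Python 3.5+
except ImportError:
    from scandir import scandir   # separately installed module on Python 2.7


def list_files(path):
    """Return sorted (name, size) pairs for regular files directly inside `path`."""
    results = []
    for entry in scandir(path):
        # follow_symlinks=False so symlinks are not reported as regular files
        if entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            results.append((entry.name, size))
    return sorted(results)
```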

Windows Additional Requirements

Optional Installs

  • diskover-web (diskover's web panel for searching/tagging files)
  • X-Pack (for graphs, reports, monitoring and http auth)
  • Gource (for Gource visualizations of diskover Elasticsearch data)

Download

$ git clone https://github.com/shirosaidev/diskover.git
$ cd diskover

You need at least Python 2.7 or Python 3.5. Install the required Python dependencies using pip:

$ sudo pip install -r requirements.txt

Getting Started

Start diskover as root user with:

$ cd /path/you/want/to/crawl
$ sudo python /path/to/diskover.py

On Windows, run the Cygwin terminal as administrator and then run diskover.

With no flags, a crawl indexes only files that are 5 MB or larger and were last modified 30 or more days ago. Use -h to see the CLI options.
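To make the default thresholds concrete, here is a minimal sketch of such a size-and-age check. It is not diskover's implementation: the function and constant names are hypothetical, and it assumes that 1 MB means 1024 * 1024 bytes and that both thresholds are inclusive.

```python
import os
import time

MIN_SIZE_BYTES = 5 * 1024 * 1024   # "5+ MB", assuming 1 MB = 1024 * 1024 bytes
MIN_AGE_DAYS = 30                  # "30+ days" since last modification


def passes_default_filter(size_bytes, mtime, now=None):
    """Return True if a file is both large enough and old enough to index."""
    if now is None:
        now = time.time()
    age_days = (now - mtime) / 86400.0   # 86400 seconds per day
    return size_bytes >= MIN_SIZE_BYTES and age_days >= MIN_AGE_DAYS


def path_passes_default_filter(path):
    """Apply the same check to a file on disk, using its stat() data."""
    st = os.stat(path)
    return passes_default_filter(st.st_size, st.st_mtime)
```

Under these assumptions a 6 MB file modified 45 days ago would be indexed, while a 6 MB file modified yesterday or a 1 MB file of any age would be skipped.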

A successful crawl should look like this:

   ___       ___       ___       ___       ___       ___       ___       ___
  /\  \     /\  \     /\  \     /\__\     /\  \     /\__\     /\  \     /\  \
 /::\  \   _\:\  \   /::\  \   /:/ _/_   /::\  \   /:/ _/_   /::\  \   /::\  \
/:/\:\__\ /\/::\__\ /\:\:\__\ /::-"\__\ /:/\:\__\ |::L/\__\ /::\:\__\ /::\:\__\
\:\/:/  / \::/\/__/ \:\:\/__/ \;:;-",-" \:\/:/  / |::::/  / \:\:\/  / \;:::/  /
 \::/  /   \:\__\    \::/  /   |:|  |    \::/  /   L;/__/    \:\/  /   |:\/__/
  \/__/     \/__/     \/__/     \|__|     \/__/     v1.0.12   \/__/     \|__|
                                        https://github.com/shirosaidev/diskover

2017-05-17 21:17:09,254 [INFO][diskover] Connecting to Elasticsearch
2017-05-17 21:17:09,260 [INFO][diskover] Checking for ES index: diskover-2017.04.22
2017-05-17 21:17:09,262 [WARNING][diskover] ES index exists, deleting
2017-05-17 21:17:09,340 [INFO][diskover] Creating ES index
Crawling: [100%] |########################################| 8570/8570
2017-05-17 21:17:16,972 [INFO][diskover] Finished crawling
2017-05-17 21:17:16,973 [INFO][diskover] Directories Crawled: 8570
2017-05-17 21:17:16,973 [INFO][diskover] Files Indexed: 322
2017-05-17 21:17:16,973 [INFO][diskover] Elapsed time: 7.72081303596

User Guide

Read the wiki for more documentation on how to use diskover.

Discussions/Questions

For discussions or questions about diskover, please ask on the Google Group.

Bugs

To report bugs in diskover, please use the issues page.

License

See the license file.
