data_hacking

Welcome to the Click Security Data Hacking Project

"Hacking in the sense of deconstructing an idea, hardware, anything and getting it to do something it wasn’t intended or to better understand how something works." (BSides CFP)

So hacking here means we want to quickly deconstruct data, understand what we've got and how to best utilize it for the problem at hand.

The primary motivation for these exercises is to explore the nexus of IPython, pandas and scikit-learn on security data of various kinds. The exercises will often intentionally show common missteps, warts in the data, paths that didn't work out that well and results that could definitely be improved upon. In general we're trying to capture what worked and what didn't; not only is that more realistic, it is often much more informative to the reader. :)

Python Modules Used:
Exercises:
  • Detecting Algorithmically Generated Domains
    • GitHub Project
    • Notebook Viewer
  • Hierarchical Clustering of Syslogs
    • GitHub Project
    • Notebook Viewer
  • Exploration of data from Malware Domain List
    • GitHub Project
    • Notebook Viewer

Setup:

  • Required packages:

    • Brew
      • graphviz, freetype, zmq
    • Python
      • ipython, pandas, matplotlib, pyzmq, jinja2
  • Some of the exercises use packages from the data_hacking repository. To install those packages into your Python site-packages:

     %> sudo python setup.py install
  
  • To uninstall:
     %> sudo pip uninstall data_hacking
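
A quick way to confirm that the Python packages listed above are available is to try importing each one. The sketch below is only an illustration, not part of the data_hacking package: `REQUIRED_MODULES` and `find_missing` are hypothetical names, and the list uses import names rather than package names (pyzmq is imported as `zmq`, ipython as `IPython`). The Brew packages are not checked.

```python
import importlib

# Import names for the Python packages listed under "Required packages".
# Assumption: pyzmq is imported as "zmq" and ipython as "IPython".
REQUIRED_MODULES = ["IPython", "pandas", "matplotlib", "zmq", "jinja2"]


def find_missing(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed, or one of its own imports failed.
            missing.append(name)
    return missing
```

Calling `find_missing(REQUIRED_MODULES)` returns an empty list when everything is importable, otherwise the names that still need to be installed.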
  

Running the Notebooks:

Most of the notebooks use relative paths to resources such as data files or images. The easiest way we found to run IPython on a notebook is to change into its project directory and run ipython with this alias (put it in your .bashrc or whatever):

alias ipython='ipython notebook --FileNotebookManager.notebook_dir=`pwd`'
$ cd data_hacking/fun_with_syslog
$ ipython  # as aliased above
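
If you would rather not shadow the `ipython` command with an alias, the same launch can be sketched as a small Python helper. This is a hedged illustration: `notebook_command` and `launch_notebook` are hypothetical names, and the `--FileNotebookManager.notebook_dir` option is copied from the alias above, so it assumes the IPython version these notebooks were written for (other releases may not accept it).

```python
import os
import subprocess


def notebook_command(project_dir):
    """Build the command used by the alias above for one project directory."""
    notebook_dir = os.path.abspath(project_dir)
    return ["ipython", "notebook",
            "--FileNotebookManager.notebook_dir=" + notebook_dir]


def launch_notebook(project_dir):
    """Start the notebook server with project_dir as the working directory."""
    # Running from inside the project directory keeps the notebooks'
    # relative paths to data files and images valid.
    return subprocess.call(notebook_command(project_dir),
                           cwd=os.path.abspath(project_dir))
```

For example, `launch_notebook("data_hacking/fun_with_syslog")` mirrors the `cd` and `ipython` steps shown above.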
