General Python Resources

Important Tools and Libraries

  • IPython: A REPL for easy interactive Python development. Extremely useful for testing ideas one line of code at a time.
  • matplotlib: A plotting library capable of generating production-quality visualizations programmatically. Its MATLAB-like syntax makes plotting easy.
  • NumPy: The fundamental package for scientific computing with Python.
  • SciPy: An open source library for mathematics, science, and engineering.
  • scikit-learn: A robust machine learning library built on top of NumPy, SciPy, and matplotlib. Includes a wide variety of modeling techniques.
  • pandas (Python Data Analysis Library): Data structures and tools for common data analysis tasks, including an efficient data frame implementation (similar to data frames in R).
  • BeautifulSoup: A parsing library particularly useful for parsing HTML and XML.
  • NLTK: Natural Language Toolkit for Python, including tools for text preprocessing, tokenization, and vectorization (you may also be interested in an online book that shows how NLTK is used).
  • NetworkX: A Python library for the creation, manipulation, and analysis of graphs and networks.
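
As a rough illustration of how two of the libraries above fit together, the following minimal sketch uses NumPy for array arithmetic and pandas for a labeled data frame. It assumes both packages are installed; the sample values and the names heights_cm, weights_kg, and bmi are invented for this example and do not come from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])
weights_kg = np.array([50.0, 58.0, 69.0, 80.0])

# NumPy: vectorized arithmetic over whole arrays, with no explicit loop.
bmi = weights_kg / (heights_cm / 100.0) ** 2

# pandas: build a labeled, tabular data frame from the arrays.
df = pd.DataFrame(
    {"height_cm": heights_cm, "weight_kg": weights_kg, "bmi": bmi}
)

# Column-wise summary and label-based row selection.
mean_bmi = df["bmi"].mean()
tallest = df.loc[df["height_cm"].idxmax()]

print(df)
print("mean BMI:", round(mean_bmi, 2))
print("weight of tallest entry:", tallest["weight_kg"])
```

Typing these lines one at a time into IPython is a convenient way to inspect each intermediate value.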

References for Data Analysis in Python

Installation of Python and Scientific Libraries