Python

General Python Resources

Python Programming Language, the official Python website

The Python Tutorial

The Python Standard Library

How to Think Like a Computer Scientist – The Python Version

Think Python, an online book by Allen Downey

Learn Python the Hard Way, another online book on Python

Learning with Python – Interactive Edition, an interactive textbook on Python programming

Important Tools And Libraries

IPython: A REPL for easy interactive Python development. Extremely useful for testing ideas one line of code at a time.

matplotlib: A plotting library capable of generating production-level visualizations programmatically. Its MATLAB-like syntax makes plotting very easy.

NumPy: The fundamental package for scientific computing with Python.

SciPy: An open source library for mathematics, science, and engineering.

scikit-learn: A robust machine learning library built on top of NumPy, SciPy, and matplotlib. Includes a wide variety of modeling techniques.

Pandas (Python Data Analysis Library): Data structures and tools for common data analysis tasks, including an efficient data frame implementation (similar to the one in R).

BeautifulSoup: A general parsing library particularly useful for parsing HTML and XML.

NLTK: Natural Language Toolkit for Python, including tools for text preprocessing, tokenization, and vectorization (you may also be interested in an online book that shows how NLTK is used).

NetworkX: A Python library for the creation, manipulation, and analysis of graphs and networks.

References For Data Analysis In Python

Python Scientific Lecture Notes: Including detailed notes on NumPy, Matplotlib, and SciPy

NumPy and SciPy Documentation: including NumPy User Guide and Cookbook

Tentative NumPy Tutorial: from SciPy.org

Getting Started with Python for Data Science: from Kaggle.com – includes information about installing Python and relevant libraries.

Introduction to NumPy and Matplotlib – YouTube video

Matplotlib User’s Guide

Matplotlib Pyplot Tutorial

IPython Documentation

Natural Language Processing with Python: Online book on text processing and analysis using NLTK.

Installation Of Python And Scientific Libraries

WinPython – (Windows) Preferred Python distribution for this class (includes scientific and data analysis libraries such as NumPy, Pandas, and scikit-learn, as well as IPython).

Anaconda – (Mac, Windows, Linux) Python distribution for large-scale data processing and scientific computing (includes scientific and data analysis libraries such as NumPy, Pandas, and scikit-learn, as well as IPython).

Notepad++: Excellent Python-friendly text editor

Installing NumPy and SciPy

Installing scikit-learn

Installing Pandas

Standalone Python Distributions
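After installing a distribution, it can help to confirm that the libraries listed above are actually available. The following is a minimal sketch, not an official check: it uses only the standard library, and the import names in the table are assumptions based on common usage (for example, scikit-learn is usually imported as "sklearn" and BeautifulSoup 4 as "bs4"). The helper name check_installed is hypothetical.

```python
import importlib.util

# Project name -> import name. The import names are assumptions; several
# differ from the project names used in the list above.
LIBRARIES = {
    "IPython": "IPython",
    "matplotlib": "matplotlib",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "scikit-learn": "sklearn",
    "Pandas": "pandas",
    "BeautifulSoup": "bs4",
    "NLTK": "nltk",
    "NetworkX": "networkx",
}


def check_installed(libraries):
    """Return {project name: bool} by locating each module without importing it."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in libraries.items()
    }


if __name__ == "__main__":
    for name, found in check_installed(LIBRARIES).items():
        print(f"{name:<14} {'installed' if found else 'MISSING'}")
```

Run the script with the Python interpreter that came with the distribution; any library reported as MISSING can then be installed by following the corresponding installation link above.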
