Hands-on Introduction to Python Analytic Stack
Thurgood Marshall East
Python is in high demand. Beyond general software development, it is one of the top skills for data scientists because it offers a full analytics stack: you can access data with it (or crawl the web to gather data), slice and dice it, load it into a database, visualize it, and perform machine learning on it.
This course will cover some of the tools that data scientists are using to analyze data. Specifically, we will introduce the IPython Notebook (Jupyter), the pandas toolkit, and the plotting facilities in matplotlib.
This course is for developers or admins who know Python or another language and want to learn about the analytic stack, specifically the IPython Notebook, pandas, and matplotlib.
Attendees will return to work with a basic understanding of the Python tools for data analysis.
- Anaconda Distribution
- IPython Notebook
- Navigation in Notebook
- Executing code in Notebook
- pandas Introduction
- Getting data
- Cleaning data
- Examining data
- Filtering, joining and updating data
- Working with aggregates
- Creating pivot tables
- Plotting Introduction
- matplotlib architecture
- Line plots
- Histograms
- Box plots
- Tweaking axes, labels, and legends
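
To give a flavor of the pandas topics in the outline above, here is a minimal sketch that cleans, filters, joins, aggregates, and pivots a small table. The data, the column names, and the variable names (`sales`, `managers`, and so on) are invented for illustration and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data, made up for this sketch.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q2"],
    "units": [10, 15, 7, None, 12],
})
managers = pd.DataFrame({
    "region": ["East", "West"],
    "manager": ["Ana", "Raj"],
})

# Cleaning: drop rows where the unit count is missing.
clean = sales.dropna(subset=["units"])

# Filtering: keep only rows with at least 10 units.
large = clean[clean["units"] >= 10]

# Joining: attach each region's manager (left join keeps every sales row).
joined = clean.merge(managers, on="region", how="left")

# Aggregates: total units per region.
totals = clean.groupby("region")["units"].sum()

# Pivot table: regions as rows, quarters as columns, summed units as values.
pivot = clean.pivot_table(
    index="region", columns="quarter", values="units", aggfunc="sum"
)

print(large)
print(joined)
print(totals)
print(pivot)
```

In a Notebook, evaluating `pivot` as the last expression in a cell renders it as a formatted table, which is one reason the Notebook and pandas are commonly used together.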
Attendees should have the (free) Anaconda stack installed on their machines. This is a large download, so please complete it before the class. Downloads for Windows, Mac, and Linux can be found at http://continuum.io/downloads