Python for Data Science (Seminar Course at UC Berkeley; AY 250)

profjsb/python-seminar

Python Computing for Data Science

A Graduate Seminar Course at UC Berkeley (AY 250)

Campbell Hall: Mondays, 4:10 - 7:00 PM, Spring 2022

Synopsis

Python has become the de facto superglue language for modern scientific computing. In this course we will learn Pythonic interactions with databases, image processing, advanced statistical and numerical packages, web frameworks, machine learning, and parallelism. Each week will involve lectures and coding projects. In the final capstone project, students will build a working codebase useful for their own research domain.

This class is for any student who works in a quantitative discipline and is familiar with Python. Those who completed the Python Bootcamp or equivalent will be eligible. You should follow the steps to install the Anaconda 3-2021-* distribution as well as git.
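After installing, a quick sanity check can confirm that the Python interpreter and git are both visible from your environment. The following is a minimal sketch, not an official course script: the function name `check_environment` and the minimum Python version of 3.8 are assumptions made for illustration, so adjust them to match whatever the course instructions specify.

```python
import shutil
import sys


def check_environment(min_python=(3, 8)):
    """Summarize the interpreter version and whether git is on the PATH.

    The (3, 8) default is an assumed minimum, not a stated course requirement.
    """
    return {
        # Full version string of the interpreter running this script
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # True when the major/minor version meets the assumed minimum
        "python_ok": sys.version_info[:2] >= min_python,
        # Absolute path to the git executable, or None if it is not found
        "git_path": shutil.which("git"),
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If `git_path` prints as `None`, git is either not installed or not on your PATH.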

Course Schedule

| Date | Content | Reading | Leader |
| --- | --- | --- | --- |
| Jan 24 (online only) | Numpy, Scipy, & Pandas | scipy §§ 1.3, 1.5, 2.2; numpy; skim chap 4/5 of McKinney | Josh |
| Jan 31 | Data visualization (Matplotlib, Bokeh, Altair) | Skim Tufte's Visualization book; colormap talk (Scipy 2015) | Josh |
| Feb 7 | Application building and Testing | None | Josh |
| Feb 14 | Parallelism (asyncio, dask, ray, jax) | None | Josh |
| Feb 21 | Holiday (no class) | | |
| Feb 28 | Database interaction (sqlite, postgres, SQLAlchemy), Large datasets (xarray, HDF5) | None | Josh |
| Mar 7 | Machine Learning I (sklearn: regression, classification; dask-learn, auto-ml) | None | Josh |
| Mar 1428 | Machine Learning II (keras [tensorflow]) | Deep Learning with Keras | Josh |
| Mar 21 | Spring Break | | |
| Mar 28 | Interacting with the world (requests, email, IoT/pyserial) | None | Josh |
| Apr 1 (Friday 10-1pm) | Web frameworks & RESTful APIs, Flask | None | Josh |
| Apr 4 | No lecture | | |
| Apr 11 | Bayesian programming & Symbolic math | Probabilistic Programming eBook; install: pip install pymc3 | Josh |
| Apr 18 | Image processing (OpenCV, skimage) | None | Stefan van der Walt |
| Apr 25 | Speeding it up (Numba, Cython, wrapping legacy code) | None | Josh |
| Onward | final project work | | |

Useful Books

Sidebar Concepts

Throughout these lectures we will be peppering in sidebar knowledge concepts:

  • Jupyter & JupyterLab
  • using git & GitHub
  • Docker
  • Data science workflows
  • reproducible research
  • application building
  • debugging
  • testing

Workflow

Each Monday we will introduce a reasonably self-contained topic in two back-to-back lectures, with a short (~20 minute) breakout coding session in between. Homework assignments will require you to write a large (several hundred line) codebase.

Help sessions will be conducted interactively on the Piazza site for the course. There is also an in-person help session every TBD. Email Josh with any questions.

Contact

Email us at ucbpythonclass@gmail.com or contact the professor directly (joshbloom@berkeley.edu). You can also contact the GSI, Ellianna Abrahams, at ellianna@berkeley.edu. Auditing is not permitted by the University, but those wishing to sit in on a class or two should contact the professor before attending.
