Data Science Tools for Finance
By Jeremy Bejarano
This set of notes is designed to accompany FINM 32900: Data Science Tools for Finance.
Course Description
“Data Science Tools for Finance” is a hands-on course centered on key data science tools in quantitative finance. Because the field is broad, the course focuses on the skill set common to its various subfields: the stages of the analytical pipeline, from data extraction and cleaning, through exploratory analysis, visualization, and modeling, to publication and deployment. The aim is to teach the tools and principles behind reproducible and scalable workflows, including build automation, dependency management, unit testing, the command-line environment, shell scripting, Git for version control, and GitHub for team collaboration. These skills are taught through case studies, each of which also gives students practical experience with key financial data sets and sources: CRSP and Compustat for pricing and financials, macroeconomic data from FRED and the BEA, bond transactions from FINRA TRACE, Treasury auction data from TreasuryDirect, textual data from EDGAR, and high-frequency trade and quote data from NYSE. Intermediate-level experience with Python and the PyData stack is assumed.
- Course Syllabus: FINM 32900, Winter 2024
- Homework 0: Setting up your computing environment
- Week 1: Introduction
- Homework 1
- Week 2: Build Systems and Task Runners
- Homework 2
- Week 3: Env Files, Secrets, Automating Queries, and the Basics of SQL
- Homework 3
- Week 4: Generating Reports, featuring Jupyter Notebooks and LaTeX
- Week 5: Unit Tests and Documentation with Sphinx
- Week 6: Bloomberg and Social Coding with GitHub
- Week 7: Reviewing Sphinx and Unit Tests
- Week 8: GitHub Actions and Publishing a Live Dashboard
- Homework 4
- Final Project
- Appendix