Project Structure: “Chartbook” Template
I would like to defer much of this discussion to the one presented by Cookiecutter Data Science.
The key takeaways from its discussion of project structure are these:
Other people will thank you
You will thank you
Data is immutable (pull fresh when you can)
Analysis is a DAG (build system)
“Build from the environment up” (Use a virtual environment and start sparse)
Keep secrets out of version control. Use .env files.
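Two of these principles, "analysis is a DAG" and "keep secrets in .env files", lend themselves to a short sketch. The example below is illustrative only and is not taken from the course template: the task names and the helper functions parse_env and build_order are hypothetical, and real projects typically rely on a build system and a dedicated .env loader rather than hand-rolled code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parse_env(text):
    """Parse simple KEY=VALUE lines from the contents of a .env file.

    Blank lines, comment lines, and lines without "=" are skipped.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical analysis tasks: each task maps to the set of tasks it depends on.
TASKS = {
    "pull_data": set(),
    "clean_data": {"pull_data"},
    "make_charts": {"clean_data"},
    "build_report": {"clean_data", "make_charts"},
}


def build_order(tasks):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(tasks).static_order())


if __name__ == "__main__":
    print(build_order(TASKS))
    print(parse_env('# local secrets, never committed\nAPI_KEY="abc123"\n'))
```

Because every task declares what it depends on, the run order can be derived rather than remembered, which is the job a build system does for you. Keeping API_KEY in a .env file that is excluded from version control means the code can be shared without sharing the secret.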
What is Cruft?
Cruft extends Cookiecutter by adding ongoing template synchronization. It is fully compatible with existing Cookiecutter templates but adds the ability to:
Check if your project is up-to-date with the template (cruft check)
Update your project when the template changes (cruft update)
View differences between your project and the template (cruft diff)
This means when the course template is updated with improvements or bug fixes, you can pull those changes into your existing project rather than starting over or manually copying changes.
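As a rough sketch of how these commands might be scripted, for example in a pre-submission check, the snippet below wraps cruft check in a small Python helper. The function names cruft_command and template_is_current are hypothetical, and the sketch assumes that cruft check reports an out-of-date project through a non-zero exit status. Confirm that behavior against the Cruft documentation before relying on it.

```python
import subprocess


def cruft_command(subcommand):
    """Build the argument list for one of the cruft subcommands named above."""
    allowed = {"check", "update", "diff"}
    if subcommand not in allowed:
        raise ValueError(f"unsupported cruft subcommand: {subcommand}")
    return ["cruft", subcommand]


def template_is_current(project_dir=".", run=subprocess.run):
    """Return True when `cruft check` exits with status 0 in project_dir.

    Assumption: a non-zero exit status means the project has fallen
    behind the template. The `run` parameter exists so the call can be
    replaced with a stand-in when cruft is not installed.
    """
    result = run(cruft_command("check"), cwd=project_dir)
    return result.returncode == 0


if __name__ == "__main__":
    if template_is_current():
        print("Project matches the template.")
    else:
        print("Template has changed; consider running `cruft diff` and `cruft update`.")
```

Running the script from the project root requires cruft to be installed in the active environment.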
Course Template
For this course, use the following Cookiecutter template with Cruft:
Template: backofficedev/cookiecutter_chartbook
To create a new project:
pip install cruft
cruft create https://github.com/backofficedev/cookiecutter_chartbook
See the template’s README for detailed usage instructions and the structure it provides.
Historical Reference
For historical reference, older versions of project templates I’ve used are available here:
These are now deprecated in favor of the Cookiecutter template above, which provides better maintainability through Cruft’s template synchronization features.