In Module 2, we wrote a simple, centralized data_prep.py to get started.
In real projects, this often evolves into modular per-table scripts to better manage complexity and focus.
We'll see an example of that kind of evolution.
In Module 2, we had one file: scripts/data_prep.py.
In Module 3, we now use one file per data table:
scripts/data_prep/prepare_customers.py
scripts/data_prep/prepare_products.py
scripts/data_prep/prepare_sales.py
Why?
As data projects grow, keeping one file per table makes it easier to:
Focus on one dataset at a time
Avoid breaking other code when the cleaning logic for one table changes
Test and debug more easily
Let different team members work on different files
We moved the old data_prep.py into an archive/ folder so you can compare and reuse it as needed.
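To make the per-table idea concrete, here is a minimal sketch of what a script like scripts/data_prep/prepare_customers.py might contain. It is not the project's actual code: the column names (CustomerID, Name), the function names, and the specific cleaning steps are assumptions chosen for illustration, and it uses only the standard library so it runs without extra packages.

```python
"""Hypothetical sketch of a per-table prep script (e.g. prepare_customers.py)."""
import csv
import io


def clean_rows(rows: list[dict[str, str]], key: str) -> list[dict[str, str]]:
    """Trim whitespace, drop rows with an empty key, and drop repeated keys."""
    seen: set[str] = set()
    cleaned: list[dict[str, str]] = []
    for row in rows:
        # Trim stray whitespace from every field.
        tidy = {name: (value or "").strip() for name, value in row.items()}
        # Skip rows with no key value or a key we have already kept.
        if not tidy.get(key) or tidy[key] in seen:
            continue
        seen.add(tidy[key])
        cleaned.append(tidy)
    return cleaned


def prepare_customers(raw_csv: str) -> list[dict[str, str]]:
    """Parse raw customer CSV text and clean it (CustomerID is an assumed column)."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return clean_rows(list(reader), key="CustomerID")


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    sample = "CustomerID,Name\n1, Ada \n1,Ada\n,Missing\n2,Grace\n"
    for row in prepare_customers(sample):
        print(row)
```

Because the customer logic lives in its own file, a change to it cannot accidentally affect how products or sales are cleaned, and it can be tested on its own.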
Module 3: Continuing Project Work
We don't need to create our .venv as we should already have it.
If not, go back to Modules 1 and 2 and make sure those steps are completed.
Now, we just follow our regular workflow. If we find we need additional external packages, we can re-run the command that installs from requirements.txt as needed. In general, we:
Pull any recent changes from GitHub.
Activate the .venv.
Run each script in scripts/data_prep/.
Module 3: Mac/Linux Commands
Open your smart sales repository in VS Code.
Open a terminal in the root project folder.
Activate your .venv and run each file.
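If running the three files one at a time becomes tedious, a small runner can do it in one step. The sketch below is a hypothetical helper, not part of the original project: the names PREP_SCRIPTS, build_commands, and run_all are assumptions, and it assumes the scripts can run in the order listed. It uses sys.executable so the scripts run with the same interpreter (and so the same activated .venv) as the runner itself.

```python
"""Hypothetical runner for the per-table prep scripts (not part of the project)."""
import subprocess
import sys
from pathlib import Path

# Script paths taken from the project layout described above.
PREP_SCRIPTS = [
    "scripts/data_prep/prepare_customers.py",
    "scripts/data_prep/prepare_products.py",
    "scripts/data_prep/prepare_sales.py",
]


def build_commands(root: Path) -> list[list[str]]:
    """Build one 'python <script>' command per prep script."""
    return [[sys.executable, str(root / script)] for script in PREP_SCRIPTS]


def run_all(root: Path) -> None:
    """Run each prep script in order, stopping at the first failure."""
    for command in build_commands(root):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a script exits with an error.
        subprocess.run(command, check=True, cwd=root)
```

With the .venv active and the terminal in the root project folder, calling run_all(Path.cwd()) would run all three scripts in sequence.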