Quickstart
Setup and data
This repository uses uv for package and project management. Start by syncing dependencies:
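The sync step can be sketched as follows. `uv sync` is uv's standard command for installing a project's locked dependencies; the preflight check around it and the `sync_state` variable are illustrative additions, and the sketch assumes you run it from the repository root, where `pyproject.toml` lives.

```shell
# Sketch: assumes uv is installed and the current directory is the repository root.
sync_state="skipped"
if command -v uv >/dev/null 2>&1 && [ -f pyproject.toml ]; then
    # Create or update the project's virtual environment from pyproject.toml / uv.lock.
    if uv sync; then sync_state="done"; else sync_state="failed"; fi
else
    echo "uv or pyproject.toml not found; install uv and cd to the repository root." >&2
fi
echo "dependency sync: $sync_state"
```

If the check passes, plain `uv sync` is all that is needed on later runs.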
Then preprocess the data to create dataset splits and pair datasets under data/:
Train and serve
Train a model with the default config:
If you want to run on CPU or tweak a parameter, override the config on the CLI:
Run the embedding API with Uvicorn serving FastAPI:
MODEL_PATH="models/all-MiniLM-L6-v2-mnrl-100k-balanced" uv run uvicorn src.mlops_project.api:app \
--host 0.0.0.0 --port 8000
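Once the server is up, one way to confirm that it responds is to fetch the OpenAPI schema that FastAPI generates by default at `/openapi.json`. This is a sketch under assumptions: the app has not disabled that default route, and you are querying from the same machine. The schema lists the routes the app exposes, so it is also a way to find the embedding endpoint, whose path is not shown here.

```shell
# Assumes the server started by the command above is running on this machine, port 8000.
api_url="http://localhost:8000"
# /openapi.json is FastAPI's default schema route (an assumption that the app keeps it enabled).
curl --fail --silent --show-error --max-time 5 "$api_url/openapi.json" \
    || echo "API not reachable at $api_url" >&2
```

With the same assumption, the interactive documentation is available in a browser at `http://localhost:8000/docs`.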
Validate and iterate
Run the test suite with pytest:
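A minimal sketch of that step, assuming pytest is declared as a project dependency and that you are in the repository root. `uv run` executes a command inside the project's environment; the `tests_state` variable is only illustrative.

```shell
# Sketch: assumes pytest is a project dependency and this runs from the repository root.
if command -v uv >/dev/null 2>&1 && [ -f pyproject.toml ]; then
    # Run the whole suite inside the uv-managed environment.
    if uv run pytest; then tests_state="passed"; else tests_state="failed"; fi
else
    tests_state="skipped"
    echo "uv or pyproject.toml not found; nothing was run." >&2
fi
echo "tests: $tests_state"
```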
Serve docs locally with MkDocs:
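A sketch of the docs command, assuming MkDocs is a project dependency and `mkdocs.yml` sits in the repository root (the actual config location may differ). By default `mkdocs serve` binds to `127.0.0.1:8000`, the same port as the API command above, so this example picks port 8001 via `--dev-addr` to let both run side by side.

```shell
# Sketch: assumes mkdocs is a project dependency and mkdocs.yml is in the repository root.
docs_addr="127.0.0.1:8001"   # chosen to avoid port 8000, which the API command above uses
if command -v uv >/dev/null 2>&1 && [ -f mkdocs.yml ]; then
    # Serve the docs with live reload; stop with Ctrl+C.
    uv run mkdocs serve --dev-addr "$docs_addr"
else
    echo "uv or mkdocs.yml not found; adjust the path or install uv first." >&2
fi
```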