DCNT · Course

Data contracts for data platforms

Lessons: 8 modules
Total: 80 min full study
Quick: 7 min trailer
Projects: 8 Docker labs

hello-data-contract · ODCS v3 lint + diff in 5 minutes

Author a minimal ODCS v3.1.0 contract YAML for a batch orders table, validate it with datacontract-cli's lint command, then diff it against a deliberately breaking second version (v2) of the same contract to see exactly which field changes are flagged as breaking.
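
As a starting point, the contract can be sketched as a YAML document written to disk from Python. The key names below (apiVersion, kind, schema, properties, logicalType, quality with type: sql) follow one reading of the ODCS v3 layout and should be checked against the published spec. The identifier, the team name, the file name orders.odcs.yaml, and the choice to record ownership as a custom property are all illustrative assumptions.

```python
from pathlib import Path

# Assumed ODCS v3.1.0 layout; verify every key name against the published spec.
CONTRACT_YAML = """\
apiVersion: v3.1.0
kind: DataContract
id: orders-batch-contract        # hypothetical identifier
name: orders
version: 1.0.0                   # contract version, not the spec version
status: active
customProperties:
  - property: owner              # assumption: ownership kept as a custom property
    value: orders-platform-team  # hypothetical team name
schema:
  - name: orders
    logicalType: object
    physicalType: table
    properties:
      - name: order_id
        logicalType: string
        required: true
      - name: customer_id
        logicalType: string
        required: true
      - name: order_total
        logicalType: number
        required: true
      - name: order_date
        logicalType: date
        required: true
      - name: status
        logicalType: string
        required: false
    quality:
      - type: sql
        description: Every row must carry an order_id.
        query: SELECT COUNT(*) FROM orders WHERE order_id IS NULL
        mustBe: 0
"""


def write_contract(path: str = "orders.odcs.yaml") -> Path:
    """Write the sketch contract to disk and return its path."""
    target = Path(path)
    target.write_text(CONTRACT_YAML, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"wrote {write_contract()}")
```

Once the file exists, it is the input for the lint step; expect to adjust fields until the linter accepts it.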

snap/data-contracts:hello
Stack
datacontract-cli@0.10.x · python@3.12 · odcs@3.1.0
Real-world use

Every data platform team that onboards a new dataset faces the same cold-start problem: no canonical contract template, no lint baseline, and no automated way to catch breaking changes before a PR merges. A new producer team hand-writes a schema YAML, a consumer team writes a different one, and within weeks the two diverge silently. By establishing a lint-passing ODCS v3.1.0 contract on day one — with a diff command that flags breaking field removals or type changes — the team gets a single authoritative artifact they can fork for every subsequent dataset, and a repeatable gate that catches breaking changes before they reach review.
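
To make "breaking" concrete, here is a minimal sketch of how a field removal or type change could be detected between two versions of a column list. It is an illustration only, not the rule set datacontract-cli applies; the Column class and the breaking_changes function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical, simplified view of one schema property."""
    logical_type: str
    required: bool = False


def breaking_changes(old: dict[str, Column], new: dict[str, Column]) -> list[str]:
    """List changes that would break a consumer reading the old schema."""
    findings = []
    for name, before in old.items():
        after = new.get(name)
        if after is None:
            # A consumer selecting this column would now fail.
            findings.append(f"{name}: column removed")
        elif after.logical_type != before.logical_type:
            findings.append(
                f"{name}: type changed {before.logical_type} -> {after.logical_type}"
            )
    return findings


baseline = {
    "order_id": Column("string", required=True),
    "customer_id": Column("string", required=True),
    "order_total": Column("number", required=True),
    "order_date": Column("date", required=True),
    "status": Column("string"),
}
# v2 drops customer_id, retypes order_total, and renames status to order_status.
candidate = {
    "order_id": Column("string", required=True),
    "order_total": Column("string", required=True),
    "order_date": Column("date", required=True),
    "order_status": Column("string"),
}

if __name__ == "__main__":
    for finding in breaking_changes(baseline, candidate):
        print(finding)
```

Note that the rename shows up as a removal of status: from a consumer's point of view the old column is gone, and the newly added order_status is not flagged because additions do not break existing readers in this sketch.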

Portfolio value

Hiring managers reviewing a data engineering portfolio rarely see candidates who understand schema governance at the contract level rather than the migration-script level. A project that produces a lint-passing ODCS v3.1.0 YAML — with a documented diff output showing exactly which field change triggers a breaking flag — signals that the candidate understands the Open Data Contract Standard as a Linux Foundation specification, knows how to operate datacontract-cli as a governance tool, and has internalized the producer-consumer contract model that underpins dbt Mesh, Schema Registry, and every downstream project in this course. It is the skeleton every other portfolio artifact in this course builds on.

Builds on lessons
Lesson 1
Build plan
  1. Author an ODCS v3.1.0 contract YAML for a batch orders table, populating the fundamentals block (id, version, status, owner), the schema block with at least five typed columns (order_id, customer_id, order_total, order_date, status), and a minimal quality block with one SQL-based completeness rule.
  2. Run datacontract lint (datacontract is the executable that the datacontract-cli package installs) against the v3.1.0 contract and iterate on any validation errors until the command exits zero, documenting each error message and its fix in the project README.
  3. Create a second contract file representing a breaking v2 change: remove one required column and rename another. Keep the same ODCS spec version (the apiVersion field) in both files so the comparison isolates the schema changes rather than a spec-version upgrade.
  4. Run datacontract breaking between the v3.1.0 baseline and the breaking v2 file, capture the stdout showing which field changes are flagged, and annotate that output in the README to explain why each change is classified as breaking for consumers of the contract.
  5. Add a datacontract diff invocation that produces a human-readable change summary between the two contract versions, then write a one-page README section mapping each ODCS block (fundamentals, schema, quality) to its real-world governance purpose so the project functions as a reference template for downstream projects.
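
The lint, breaking, and diff steps above can be scripted so the outputs are easy to capture for the README. This is a rough sketch under a few assumptions: the executable is named datacontract, the subcommand names are the ones used in the build plan (confirm them with datacontract --help for your installed version), and the two file names are hypothetical.

```python
import shutil
import subprocess

CLI = "datacontract"  # assumed executable name from the datacontract-cli package


def plan_commands(baseline: str, candidate: str) -> list[list[str]]:
    """Build the three invocations from the build plan without running them."""
    return [
        [CLI, "lint", baseline],
        [CLI, "breaking", baseline, candidate],
        [CLI, "diff", baseline, candidate],
    ]


def run_plan(commands: list[list[str]]) -> dict[str, int]:
    """Run each command, echo its output, and return exit codes by subcommand."""
    if shutil.which(CLI) is None:
        print(f"{CLI} not found on PATH; install datacontract-cli first")
        return {}
    results = {}
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        print(f"$ {' '.join(cmd)}")
        print(proc.stdout + proc.stderr)
        results[cmd[1]] = proc.returncode
    return results


if __name__ == "__main__":
    # Hypothetical file names for the baseline and the breaking v2 contract.
    codes = run_plan(plan_commands("orders.odcs.yaml", "orders.v2.odcs.yaml"))
    print(codes)
```

Keeping command construction separate from execution makes it simple to paste the exact invocations into the README next to their captured output.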