|
9 | 9 | Community-built skills that turn Claude into a senior data architect — for modeling, platforms, cloud, AI, and modernization. |
10 | 10 |
|
11 | 11 | [](https://github.com/wjlgatech/data-architecture/actions/workflows/ci.yml) |
12 | | -[](https://github.com/wjlgatech/data-architecture/tree/main/skills) |
| 12 | +[](https://github.com/wjlgatech/data-architecture/tree/main/skills) |
13 | 13 | [](https://github.com/wjlgatech/data-architecture/graphs/contributors) |
14 | 14 | [](CONTRIBUTING.md) |
15 | 15 | [](LICENSE) |
@@ -70,41 +70,51 @@ Once installed, Claude responds to built-in commands: |
70 | 70 |
|
71 | 71 | ## 🗺️ Roadmap |
72 | 72 |
|
73 | | -| Day | Module | Skills | Status | |
74 | | -|-----|--------|--------|--------| |
| 73 | +| Day | Module | Commands | Status | |
| 74 | +|-----|--------|----------|--------| |
| 75 | +| 0️⃣ | **Skill Orchestrator** | `discover-client`, `assess-maturity`, `orchestrate-engagement`, `translate-for-stakeholder`, `estimate-effort` | ✅ **Active** | |
75 | 76 | | 1️⃣ | **Intro to Data Architecture & Modeling** | `design-model`, `choose-architecture`, `kpi-catalog`, `audit-vault`, `dimension-map` | ✅ **Active** | |
76 | | -| 2️⃣ | **New Data Management** | `data-quality`, `master-data`, `governance`, `metadata-catalog` | 🔨 Building | |
77 | | -| 3️⃣ | **Cloud Data & Technology** | `platform-selector`, `snowflake-patterns`, `databricks-patterns`, `lakehouse-design` | 📋 Planned | |
78 | | -| 4️⃣ | **Data Intelligence & AI** | `ml-feature-store`, `rag-architecture`, `ai-governance`, `model-ops` | 📋 Planned | |
79 | | -| 5️⃣ | **Data Modernization** | `migration-planner`, `legacy-assessment`, `modernization-roadmap` | 📋 Planned | |
| 77 | +| 2️⃣ | **Data Management** | `design-mdm`, `check-data-quality`, `governance-check`, `lifecycle-plan`, `security-review` | ✅ **Active** | |
| 78 | +| 3️⃣ | **Cloud Data & Technology** | `design-cloud-platform`, `design-data-platform`, `design-ingestion-pipeline`, `design-api-layer`, `multi-region-plan` | ✅ **Active** | |
| 79 | +| 4️⃣ | **Data Intelligence, Analytics & AI** | `analyze-big-data`, `design-nlp-pipeline`, `build-mlops-pipeline`, `design-realtime-intelligence`, `responsible-ai-review` | ✅ **Active** | |
| 80 | +| 5️⃣ | **Data Strategy & GenAI** | `design-genai-architecture`, `data-strategy-alignment`, `build-data-product`, `modernization-roadmap`, `operating-model-design` | ✅ **Active** | |
80 | 81 |
|
81 | | -**We build one module per day. PRs are merged daily. Join and ship something.** |
| 82 | +**6 skills · 30 commands · full 5-day curriculum complete. PRs welcome to extend any module.** |
82 | 83 |
|
83 | 84 | --- |
84 | 85 |
|
85 | 86 | ## 📁 Repository Structure |
86 | 87 |
|
87 | 88 | ``` |
88 | 89 | data-architecture/ |
89 | | -├── skills/ # 🧠 Claude skills (one folder = one skill module) |
90 | | -│ ├── day1-modeling/ # Data modeling: Vault, Star, 3NF, AUDM |
91 | | -│ │ ├── SKILL.md # Main Claude instructions (paste into system prompt) |
92 | | -│ │ ├── metadata.json # Skill metadata, version, tags |
93 | | -│ │ ├── commands/ # Slash command definitions |
94 | | -│ │ └── references/ # Deep reference material |
95 | | -│ ├── day2-data-management/ # Placeholder — contribute here! |
96 | | -│ ├── day3-cloud/ # Placeholder — contribute here! |
97 | | -│ ├── day4-ai-analytics/ # Placeholder — contribute here! |
98 | | -│ └── day5-modernization/ # Placeholder — contribute here! |
| 90 | +├── skills/ # 🧠 Claude skills (one folder = one skill module) |
| 91 | +│ ├── skill-orchestrator/ # Meta-skill: client intake, maturity, engagement orchestration |
| 92 | +│ ├── day1-modeling/ # Data modeling: Vault, Star, 3NF, AUDM |
| 93 | +│ │ ├── SKILL.md # Main Claude instructions (paste into system prompt) |
| 94 | +│ │ ├── metadata.json # Skill metadata, version, tags |
| 95 | +│ │ ├── commands/ # Slash command definitions |
| 96 | +│ │ └── references/ # Deep reference material |
| 97 | +│ ├── day2-data-management/ # MDM, Data Quality, Governance, Lifecycle, Security |
| 98 | +│ ├── day3-cloud-data/ # Cloud platforms, Lakehouse, FHIR, multi-region |
| 99 | +│ ├── day4-analytics/ # Big data, clinical NLP, MLOps, real-time, responsible AI |
| 100 | +│ ├── day5-strategy/ # GenAI/RAG, data products, modernization, operating model |
| 101 | +│ └── index.json # Machine-readable skill registry |
99 | 102 | │ |
100 | | -├── schemas/ # 🔒 JSON schemas for CI validation |
101 | | -├── templates/ # 🧩 Copy-paste starters for new skills |
102 | | -├── examples/ # 📖 Real case studies |
103 | | -│ └── newlife-pharmacy/ # Pharma supply chain (Day 1 case study) |
104 | | -├── docs/ # 📚 Architecture decisions, specs |
105 | | -├── tests/ # ✅ Validation scripts (run by CI) |
106 | | -├── scripts/ # 🛠️ CLI tooling |
107 | | -└── .github/ # ⚙️ Workflows, issue/PR templates |
| 103 | +├── knowledge-base/ # 📚 Cross-skill shared domain knowledge |
| 104 | +│ ├── healthcare-standards.md # HL7 FHIR, ICD-10, LOINC, SNOMED |
| 105 | +│ ├── cloud-platform-patterns.md |
| 106 | +│ ├── analytics-patterns.md |
| 107 | +│ └── genai-data-patterns.md |
| 108 | +│ |
| 109 | +├── schemas/ # 🔒 JSON schemas for CI validation |
| 110 | +├── templates/ # 🧩 Copy-paste starters for new skills |
| 111 | +├── examples/ # 📖 Real case studies (interactive HTML) |
| 112 | +│ ├── newlife-pharmacy/ # Pharma supply chain — Day 1 |
| 113 | +│ └── newlife-hospital/ # Healthcare HIS — Days 2–5 |
| 114 | +├── docs/ # 📄 Architecture decisions, specs |
| 115 | +├── tests/ # ✅ Validation scripts (run by CI) |
| 116 | +├── scripts/ # 🛠️ CLI tooling |
| 117 | +└── .github/ # ⚙️ Workflows, issue/PR templates |
108 | 118 | ``` |
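
The layout above can also be walked programmatically. The following is a minimal sketch, assuming only what the tree shows: each skill module is a folder under `skills/` that holds a `SKILL.md` and, optionally, a `commands/` directory. The function name `list_skill_modules` is hypothetical and is not part of the repository's `scripts/` tooling. The format of `index.json` is not shown here, so the sketch deliberately does not read it.

```python
from pathlib import Path


def list_skill_modules(skills_dir: str) -> dict[str, list[str]]:
    """Map each skill folder that has a SKILL.md to its command file stems.

    Assumption: command definitions are files directly inside commands/.
    The README does not state their extension, so any file is accepted.
    """
    modules: dict[str, list[str]] = {}
    for folder in sorted(Path(skills_dir).iterdir()):
        # Skip plain files such as index.json and folders without SKILL.md.
        if not folder.is_dir() or not (folder / "SKILL.md").is_file():
            continue
        commands_dir = folder / "commands"
        if commands_dir.is_dir():
            commands = sorted(p.stem for p in commands_dir.iterdir() if p.is_file())
        else:
            commands = []
        modules[folder.name] = commands
    return modules
```

Calling `list_skill_modules("skills")` from the repository root would return one entry per module folder; folders that lack a `SKILL.md` are ignored rather than reported as errors.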
109 | 119 |
|
110 | 120 | --- |
@@ -197,22 +207,75 @@ npm run validate |
197 | 207 |
|
198 | 208 | --- |
199 | 209 |
|
200 | | -### Coming Soon |
| 211 | + |
| 212 | +### Day 3 · Cloud Data Platform — Azure Medallion Lakehouse |
| 213 | + |
| 214 | +> *NewLife Hospital — Multi-region healthcare data platform, FHIR R4 API, Medallion Lakehouse, 90+ countries* |
| 215 | +
|
| 216 | +**One-line verdict:** Azure Medallion Lakehouse (Bronze/Silver/Gold) on Delta Lake, chosen because a single coherent architecture covers FHIR R4 streaming ingestion, multi-jurisdictional data residency, and clinical AI feature serving. |
| 217 | + |
| 218 | +| Dimension | Decision | |
| 219 | +|---|---| |
| 220 | +| Platform | **Azure** — ADF, Event Hub, Databricks, Delta Lake, Synapse, ADLS Gen2 | |
| 221 | +| Architecture | Medallion Lakehouse — Bronze (raw FHIR) → Silver (cleaned) → Gold (marts) | |
| 222 | +| APIs | FHIR R4 with SMART on FHIR OAuth 2.0, geo-load balancing, 99.9% SLA | |
| 223 | +| Multi-Region | Hub-and-spoke — 5 regional nodes, data residency enforcement per GDPR/PIPL/PDPA | |
| 224 | +| Security | Zero Trust, Private Endpoints, Azure Purview RBAC, field-level encryption | |
| 225 | +| Clinical AI | Predictive sepsis, NLP discharge summaries, imaging triage — all within Medallion Gold | |
| 226 | + |
| 227 | +**[▶ Open Interactive Solution →](https://htmlpreview.github.io/?https://github.com/wjlgatech/data-architecture/blob/main/examples/newlife-hospital/newlife-hospital-day3-solution.html)** |
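
To make the Bronze → Silver → Gold flow concrete, here is a minimal sketch in plain Python. It is not the Databricks/Delta Lake implementation from the case study: the records are hypothetical, heavily simplified stand-ins for FHIR R4 Observation resources, and the function names `to_silver` and `to_gold` are illustrative only.

```python
from collections import defaultdict
from statistics import mean

# Bronze: raw records kept exactly as ingested (hypothetical, simplified
# Observation-like payloads; real FHIR R4 resources carry many more fields).
bronze = [
    {"resourceType": "Observation", "subject": "Patient/1", "code": "heart-rate", "value": "88"},
    {"resourceType": "Observation", "subject": "Patient/1", "code": "heart-rate", "value": "92"},
    {"resourceType": "Observation", "subject": "Patient/2", "code": "heart-rate", "value": None},
    {"resourceType": "Observation", "subject": "Patient/2", "code": "heart-rate", "value": "70"},
]


def to_silver(raw: list[dict]) -> list[dict]:
    """Silver: drop unusable rows and cast values to proper types."""
    cleaned = []
    for rec in raw:
        if rec.get("resourceType") != "Observation" or rec.get("value") is None:
            continue
        cleaned.append({
            "patient_id": rec["subject"].split("/")[-1],
            "code": rec["code"],
            "value": float(rec["value"]),
        })
    return cleaned


def to_gold(silver_rows: list[dict]) -> dict[tuple[str, str], float]:
    """Gold: an analytics-ready mart, here the mean value per patient and code."""
    groups = defaultdict(list)
    for row in silver_rows:
        groups[(row["patient_id"], row["code"])].append(row["value"])
    return {key: mean(values) for key, values in groups.items()}


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {('1', 'heart-rate'): 90.0, ('2', 'heart-rate'): 70.0}
```

The point of the layering is that Bronze is never mutated: the record with the missing value stays available for audit and replay, while Silver and Gold can be rebuilt from it at any time.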
| 228 | + |
| 229 | +--- |
| 230 | + |
| 231 | +### Day 4 · Data Intelligence, Analytics & AI |
| 232 | + |
| 233 | +> *NewLife Hospital — Clinical NLP, Medical Imaging AI, Real-time Sepsis Alerting, MLOps, $2M→$6M Year-1 ROI* |
| 234 | +
|
| 235 | +**One-line verdict:** Lambda architecture for batch + streaming analytics, with a unified MLOps platform (MLflow + Databricks) that governs clinical models from FDA SaMD Class II compliance to bedside alerting in under 60 seconds. |
| 236 | + |
| 237 | +| Dimension | Decision | |
| 238 | +|---|---| |
| 239 | +| Big Data | Lambda architecture — Spark batch (Databricks) + Kafka/Event Hub streaming | |
| 240 | +| Clinical NLP | spaCy + Med7 + BERT-clinical pipeline: 92%+ F1 on entity extraction | |
| 241 | +| Imaging AI | CNN + ViT ensemble, 3-stage review workflow, FDA SaMD Class II governance | |
| 242 | +| Real-time | NEWS2 early-warning score for sepsis alerting: Kafka → Feature Store → model inference → alert in <60s | |
| 243 | +| MLOps | MLflow + AzureML: Experiment → Train → Validate → Deploy → Monitor → Retrain | |
| 244 | +| Responsible AI | Bias audit, GDPR Art. 22 human-in-loop, FDA SaMD classification, explainability | |
| 245 | +| ROI | Year 1: $2M invest → $6M return · Year 2: $4M → $16M · Year 3: $8M → $40M | |
| 246 | + |
| 247 | +**[▶ Open Interactive Solution →](https://htmlpreview.github.io/?https://github.com/wjlgatech/data-architecture/blob/main/examples/newlife-hospital/newlife-hospital-day4-solution.html)** |
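
The ROI row can be read as a return multiple per year. The short calculation below uses only the figures from that row; the helper names are hypothetical, and the net-ROI formula (gain divided by investment) is a common convention rather than something the case study specifies.

```python
# (invested, returned) in millions of USD, copied from the ROI row above.
roi_by_year = {1: (2, 6), 2: (4, 16), 3: (8, 40)}


def return_multiple(invested: float, returned: float) -> float:
    """Gross return per dollar invested."""
    return returned / invested


def net_roi_percent(invested: float, returned: float) -> float:
    """Net gain as a percentage of the investment (assumed convention)."""
    return (returned - invested) / invested * 100


for year, (invested, returned) in roi_by_year.items():
    multiple = return_multiple(invested, returned)
    net = net_roi_percent(invested, returned)
    print(f"Year {year}: {multiple:.1f}x return, {net:.0f}% net ROI")
# Year 1: 3.0x return, 200% net ROI
# Year 2: 4.0x return, 300% net ROI
# Year 3: 5.0x return, 400% net ROI
```

Read this way, the stated figures imply a return multiple that grows by one each year, from 3x in Year 1 to 5x in Year 3.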
| 248 | + |
| 249 | +--- |
| 250 | + |
| 251 | +### Day 5 · Data Strategy, GenAI & Final Blueprint |
| 252 | + |
| 253 | +> *NewLife Hospital — RAG pipeline, Data Products, $127M NPV business case, 5-year operating model* |
| 254 | +
|
| 255 | +**One-line verdict:** A GenAI Clinical Intelligence Platform built on Retrieval-Augmented Generation, with a PHI de-identification gate, a vector store serving 200M+ patient records, and a federated data product marketplace, all governed by a CDO-led operating model with a projected $127M NPV over 5 years. |
| 256 | + |
| 257 | +| Dimension | Decision | |
| 258 | +|---|---| |
| 259 | +| GenAI Architecture | RAG pipeline — PHI De-ID → Chunking → Embedding → Vector Store → LLM → Audit | |
| 260 | +| Vector Store | Azure AI Search (hybrid dense + sparse) — HIPAA-compliant, 200M+ patient records | |
| 261 | +| Data Products | Federated marketplace — 12 certified products across Clinical, Ops, Finance, Research | |
| 262 | +| Modernization | Legacy EHR → Cloud: Assess (3I) → Lift-and-Shift → Re-platform → Re-architect | |
| 263 | +| Operating Model | CDO → Data Domains → Product Owners → Engineers · Hub-and-Spoke federated | |
| 264 | +| Business Case | $127M NPV, 287% ROI, 18-month payback — board-ready financial model | |
201 | 265 |
|
202 | | -| Day | Module | Status | |
203 | | -|---|---|---| |
204 | | -| Day 3 | Cloud Data & Technology (Snowflake · Databricks · Azure Synapse · Lakehouse) | 🔜 Building | |
205 | | -| Day 4 | Data Intelligence, Analytics & AI (ML Feature Store · RAG · AI Governance) | 📋 Planned | |
206 | | -| Day 5 | Data Modernization (Legacy Assessment · Migration Playbooks · Modernization Roadmap) | 📋 Planned | |
| 266 | +**[▶ Open Interactive Solution →](https://htmlpreview.github.io/?https://github.com/wjlgatech/data-architecture/blob/main/examples/newlife-hospital/newlife-hospital-day5-solution.html)** |
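
The RAG stages named in the table (PHI De-ID → Chunking → Embedding → Vector Store → LLM → Audit) can be sketched end to end with the standard library alone. Everything here is a toy stand-in under stated assumptions: the regex gate is nowhere near real PHI de-identification, the "embedding" is a bag-of-words count rather than a learned model, `VectorStore` is an in-memory placeholder for a managed index such as Azure AI Search, and the LLM call is replaced by returning the prompt. All names are hypothetical.

```python
import math
import re
from collections import Counter


def deidentify(text: str) -> str:
    """Toy PHI gate: mask MRN-like IDs and ISO dates (real de-ID needs far more)."""
    text = re.sub(r"\bMRN\s*\d+\b", "[MRN]", text)
    return re.sub(r"\b\d{4}-\d{2}-\d{2}\b", "[DATE]", text)


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory store of (chunk, vector) pairs."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


audit_log: list[dict] = []


def answer(store: VectorStore, question: str) -> str:
    """Retrieve context, build the prompt an LLM would receive, and audit the call."""
    context = store.search(question, k=1)
    prompt = "Context: " + "\n".join(context) + f"\nQuestion: {question}"
    audit_log.append({"question": question, "retrieved": context})
    return prompt  # a real pipeline would send this prompt to an LLM


note = (
    "Patient MRN 12345 admitted 2024-01-05 with fever. "
    "Blood cultures drawn and antibiotics started within one hour."
)
store = VectorStore()
for piece in chunk(deidentify(note)):
    store.add(piece)

print(answer(store, "When were antibiotics started?"))
```

The de-identification step runs before chunking so that no identifier ever reaches the store, and the audit entry records what was retrieved for each question, which mirrors the order of stages in the table.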
207 | 267 |
|
208 | 268 | --- |
209 | 269 |
|
210 | 270 | ## 📖 Case Studies |
211 | 271 |
|
212 | | -| Case Study | Domain | Skills Used | Link | |
| 272 | +| Case Study | Domain | Days | Link | |
213 | 273 | |---|---|---|---| |
214 | | -| NewLife Pharmacy Supply Chain | Pharmaceutical D2P | Data Vault 2.0, KPI Catalog, 30+ KPIs | [View →](examples/newlife-pharmacy/) | |
215 | | -| NewLife Hospital Unified HIS | Healthcare MDM + Governance | Federated MDM, GDPR/HIPAA, Zero Trust | [View →](examples/newlife-hospital/) | |
| 274 | +| NewLife Pharmacy Supply Chain | Pharmaceutical D2P | Day 1 | [View →](examples/newlife-pharmacy/) | |
| 275 | +| NewLife Hospital — Data Management | Healthcare MDM + Governance | Day 2 | [View →](examples/newlife-hospital/) | |
| 276 | +| NewLife Hospital — Cloud Platform | Healthcare Lakehouse + FHIR | Day 3 | [View →](examples/newlife-hospital/) | |
| 277 | +| NewLife Hospital — Analytics & AI | Clinical NLP, MLOps, Sepsis AI | Day 4 | [View →](examples/newlife-hospital/) | |
| 278 | +| NewLife Hospital — Strategy & GenAI | RAG, Data Products, $127M NPV | Day 5 | [View →](examples/newlife-hospital/) | |
216 | 279 |
|
217 | 280 | --- |
218 | 281 |
|
|