Production-Grade Data Platform for 211 Social Services

Task

Replace fragile per-tenant pipelines with a single multi-tenant data platform where every normalized artifact is a first-class, auditable obligation.

Technologies

Dagster, DBT, Snowflake, OpenSearch, MongoDB, Postgres, HSDS-style normalization, and source connectors for iCarol, WellSky, and RTM.

Result

A production-grade social services data platform where failures are isolated per tenant, lineage is preserved end-to-end, and freshness is guaranteed by the asset graph rather than by human vigilance.

Executive Summary

Built a multi-tenant data platform serving 211 social service organizations across multiple U.S. states
Ingested heterogeneous source systems (iCarol, WellSky, RTM) with independent schemas and update cadences
Replaced long-running pipelines with asset-based orchestration in Dagster
Treated every normalized dataset as a durable obligation rather than a transient by-product
Isolated failures per tenant via partitioning along tenant and state boundaries
Embedded data quality checks directly in the asset graph so bad data cannot publish silently

1. What is a 211 Social Services Data Platform?

A 211 social services data platform is the technical layer behind 211 helplines and search portals that connect people in need with local resources: food banks, shelters, mental health providers, utility assistance, and similar services. The platform ingests resource data from independent organizations operating in different states, each on its own case management system (iCarol, WellSky, RTM, or similar), and exposes a unified, searchable interface to the public and to partner agencies.

Connect211 is a production deployment of this model across multiple U.S. states. The platform must answer a deceptively simple question: when someone searches for help right now, is the result correct, current, and trustworthy?

Why a data platform, not a website

The hard part is not the front-end search experience. It is keeping resource data accurate as independent publishers update at different cadences, in different schemas, with different quality guarantees. The platform is, at its core, a multi-tenant data orchestration problem with strict reliability obligations.

2. Initial State: Working Pipelines, Eroding Trust

Before the rewrite, the platform ran on pragmatic, pipeline-shaped data flows. Source data was fetched, normalized, and pushed into publishing stores through a sequence of tasks. It worked, organizations adopted it, and impact grew.

Key limitations that emerged with scale

Data sources evolved independently: schemas drifted, update cycles diverged, quality varied across contributors
Pipelines grew longer over time, and reprocessing became broader than necessary
Validation moved from design to manual oversight, making correctness depend on human vigilance
Availability began to depend on institutional memory rather than architecture
When updates went wrong, the cause was opaque, and trust shifted toward individual heroics and away from the platform itself

For a system intended to support people under real-world pressure, this state was unsustainable. The problem was not a lack of data or compute. It was a lack of structural guarantees about what the platform was producing.

3. Project Objectives

The goal was not a new dashboard. It was to rebuild the platform so that reliability, freshness, and lineage were properties of the architecture, not of the team.

Core objectives

Treat every normalized artifact as a durable obligation with explicit freshness, lineage, and reproducibility guarantees
Isolate failures per tenant so that one organization's data issue cannot block publishing for another
Embed data quality checks in the orchestration layer so bad data cannot reach the publishing surface
Preserve the narrative of the data: every artifact carries provenance, every update has a reason
Make the system operable under uneven, event-driven demand without manual coordination becoming the bottleneck

4. Target Architecture: Writers, Readers, and the Asset Graph

We redesigned the platform around a clear separation between ingestion (writers) and publishing (readers), with a shared system of record underneath and asset-based orchestration coordinating everything.

Writers: per-tenant ingestion and transformation

Each tenant runs its own writer project that:

Fetches raw data from the source system through a connector adapter (iCarol, WellSky, RTM, or other)
Persists raw data verbatim into a Snowflake source schema
Normalizes the data using DBT ELT projects into standardized intermediate models
Applies enhancements such as geocoding or translations in DBT STAGE projects

Readers: publishing to search and discovery

Reader processes react to completed writer runs, either bulk or incremental, and publish curated artifacts to:

OpenSearch for full-text and faceted search
MongoDB for flexible document storage
Postgres for relational views used by partner integrations
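
The reader side can be pictured as a small planning step that turns a completed writer run into a per-store publish plan. The sketch below is only an illustration under stated assumptions: `WriterRun`, `plan_publish`, and the interpretation that a bulk run republishes every record while an incremental run republishes only changed records are hypothetical and not taken from the platform's code.

```python
from dataclasses import dataclass, field

# Publishing stores named in the text; the lowercase keys are illustrative.
SINKS = ("opensearch", "mongodb", "postgres")


@dataclass
class WriterRun:
    """Hypothetical summary of a completed writer run for one tenant."""

    tenant: str
    mode: str  # "bulk" or "incremental" (assumed meaning, see lead-in)
    changed_ids: list[str] = field(default_factory=list)


def plan_publish(run: WriterRun, all_ids: list[str]) -> dict[str, list[str]]:
    """Decide which record IDs each store should receive for this run."""
    if run.mode not in ("bulk", "incremental"):
        raise ValueError(f"unknown run mode: {run.mode}")
    ids = all_ids if run.mode == "bulk" else run.changed_ids
    # Every store gets its own copy of the same ID list.
    return {sink: list(ids) for sink in SINKS}


tenant_ids = ["svc-1", "svc-2", "svc-3"]
bulk_plan = plan_publish(WriterRun("tenant_a", "bulk"), tenant_ids)
incremental_plan = plan_publish(
    WriterRun("tenant_a", "incremental", changed_ids=["svc-2"]), tenant_ids
)
```

Keeping the plan a pure function of the writer run makes it easy to inspect what a reader intends to publish before any store is touched.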

Snowflake as the system of record

Snowflake holds the intermediate and normalized datasets. Raw source data lives in source schemas. Normalized data lives in standardized models. Both are first-class assets with lineage, not transient by-products of a pipeline.

Dagster as the orchestrator

Dagster coordinates the materialization of assets rather than the execution of tasks. DBT is used explicitly for set-based transformations and is not used as an orchestrator.

DBT transforms data. Dagster decides what data must exist, when, and why.

5. Source Heterogeneity as the Core Constraint

The hardest constraint was not throughput or storage. It was heterogeneity. Each data provider differed along several axes simultaneously, and those differences dominated the design.

Where sources differed

Schema shape: even where sources nominally modeled the same entities, fields varied in name, type, and presence
Semantics: identical field names often meant subtly different things across publishers
Update cadence: some sources updated continuously, others weekly or ad hoc
Quality guarantees: missing fields, stale records, and partial exports were common

Treating ingestion as a uniform, pipeline-shaped process led to brittle assumptions and cross-tenant coupling. The system only became manageable once heterogeneity was treated as fundamental rather than incidental. Every writer is a separate project, on its own schedule, with its own connector adapter.
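
One way to picture the per-tenant adapter boundary is a small shared interface with one implementation per source system. The sketch below is illustrative only: `SourceConnector`, `StaticConnector`, `land_raw`, and the field names in the sample records are hypothetical, and are not taken from the platform or from the real iCarol or WellSky schemas.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class SourceConnector(Protocol):
    """Hypothetical interface that every source adapter would implement."""

    source_name: str

    def fetch_raw(self) -> Iterable[dict]:
        """Return raw records exactly as the source system provides them."""
        ...


@dataclass
class StaticConnector:
    """Stand-in adapter that serves canned records instead of calling an API."""

    source_name: str
    records: list[dict]

    def fetch_raw(self) -> Iterable[dict]:
        return list(self.records)


def land_raw(tenant: str, connector: SourceConnector) -> list[dict]:
    """Attach provenance to each raw record; the payload itself is untouched."""
    return [
        {"tenant": tenant, "source": connector.source_name, "payload": record}
        for record in connector.fetch_raw()
    ]


# Two tenants on different systems with different (made-up) field names.
tenant_a = StaticConnector("icarol", [{"AgencyName": "Food Bank A"}])
tenant_b = StaticConnector("wellsky", [{"org_name": "Shelter B", "zip": "12345"}])

landed = land_raw("tenant_a", tenant_a) + land_raw("tenant_b", tenant_b)
```

The adapters differ only in how they fetch; nothing in `land_raw` interprets the payload, which mirrors the rule that raw data is persisted verbatim before any normalization.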

6. HSDS Normalization as an Architectural Contract

Normalization into an HSDS-style model was not implemented as a downstream convenience. It became an architectural contract that all downstream consumers rely on, whether they know it or not.

What the contract guarantees

Stable fields with predictable types
Predictable relationships between entities
Documented semantics for every attribute

Because every reader depends on this contract, normalization cannot be best-effort or deferred to the end of a pipeline. The implementation enforces it through a strict layering:

Raw source data is written verbatim into Snowflake source schemas (no interpretation)
DBT ELT projects transform raw data into standardized intermediate models (the shared shape)
DBT STAGE projects apply tenant-specific adaptations while preserving the contract (the shared semantics)

This separation makes it explicit where interpretation happens. When a field is wrong in the normalized model, the question becomes which contract was violated, not what broke in the pipeline.
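
To make the idea of a checkable contract concrete, the following sketch validates a record against a declared set of fields and types. It is a simplified assumption, not the platform's implementation and not the actual HSDS schema: `ORGANIZATION_CONTRACT`, its three fields, and `contract_violations` are hypothetical.

```python
# Hypothetical slice of an HSDS-style contract: field name -> (type, required).
ORGANIZATION_CONTRACT = {
    "id": (str, True),
    "name": (str, True),
    "email": (str, False),
}


def contract_violations(record: dict, contract: dict) -> list[str]:
    """Return readable violations; an empty list means the record conforms."""
    problems = []
    for field_name, (expected_type, required) in contract.items():
        value = record.get(field_name)
        if value is None:
            if required:
                problems.append(f"missing required field: {field_name}")
        elif not isinstance(value, expected_type):
            problems.append(
                f"{field_name}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


good = {"id": "org-1", "name": "Food Bank A"}
bad = {"id": 17, "email": "info@example.org"}

good_problems = contract_violations(good, ORGANIZATION_CONTRACT)
bad_problems = contract_violations(bad, ORGANIZATION_CONTRACT)
```

Here `good_problems` is empty, while `bad_problems` names both the wrongly typed `id` and the missing `name`, so a failure points at the violated part of the contract rather than at a pipeline step.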

7. Transition to Asset-Based Orchestration

The shift to asset-based orchestration was driven less by tooling preference and more by a change in mental model. The question stopped being what jobs should run and started being what data must exist.

The shift in questions

What data artifacts must exist for this platform to be healthy?
What do those artifacts depend on?
How fresh do they need to be?
What constitutes success or failure for this specific artifact?

Dagster assets provided a way to encode those questions directly. A simplified example from a writer project shows how DBT models are treated as assets rather than opaque steps:

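# writer-xyz/assets.py (excerpt)
from dagster_dbt import load_assets_from_dbt_project

# Each model in the DBT ELT project becomes a Dagster asset, with
# dependencies taken from the DBT project itself.
# Note: newer dagster-dbt releases favor the @dbt_assets decorator for this.
dbt_elt_assets = load_assets_from_dbt_project(
    project_dir="dbt_elt",
    profiles_dir="dbt_elt",
)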

This does not describe how DBT runs. It declares that the resulting tables are first-class assets with lineage and state. Once assets replaced jobs as the primary abstraction, freshness, lineage, and partial recomputation became explicit rather than implicit.
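
The questions about existence, dependencies, and freshness can be illustrated without any orchestrator at all. The sketch below is plain Python rather than Dagster's API, and everything in it is an assumption made for illustration: `AssetSpec`, `stale_assets`, the three asset names, the timestamps, and the freshness windows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AssetSpec:
    """Hypothetical declaration: what must exist, its inputs, how fresh."""

    name: str
    deps: tuple[str, ...]
    max_age: timedelta


# Listed upstream-first so staleness can propagate in a single pass.
SPECS = [
    AssetSpec("raw_source", (), timedelta(hours=24)),
    AssetSpec("normalized", ("raw_source",), timedelta(hours=24)),
    AssetSpec("search_index", ("normalized",), timedelta(hours=6)),
]


def stale_assets(specs, last_materialized, now):
    """Return assets that are missing, too old, or behind their upstream."""
    stale = []
    for spec in specs:
        built_at = last_materialized.get(spec.name)
        too_old = built_at is None or now - built_at > spec.max_age
        upstream_changed = built_at is not None and any(
            dep in stale or last_materialized.get(dep, now) > built_at
            for dep in spec.deps
        )
        if too_old or upstream_changed:
            stale.append(spec.name)
    return stale


now = datetime(2024, 1, 2, 12, 0)
last_materialized = {
    "raw_source": datetime(2024, 1, 2, 9, 0),
    "normalized": datetime(2024, 1, 2, 8, 0),    # built before the newest raw data
    "search_index": datetime(2024, 1, 2, 8, 30),
}
stale = stale_assets(SPECS, last_materialized, now)
```

In this example `raw_source` is current, `normalized` is stale because its input was refreshed after it was built, and `search_index` is stale because its upstream is. The point is that the work to do is derived from declared state, not from a fixed job order.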

8. Partitioning for Failure Isolation

Partitioning was critical for isolating failures. We partitioned primarily along tenant and state boundaries rather than time, because operational experience showed that data issues almost always affected a single organization or region.

How partitioning works in practice

Separate writer projects per tenant
Independent schedules and sensors per tenant
Asset materializations scoped to each tenant's data domain

A failure in one writer no longer blocks publishing for others. Remediation is targeted and auditable.

This had a side effect that mattered just as much: it made the cost of adding a new tenant predictable. New organizations could be onboarded without increasing shared operational risk.
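
The isolation property can be sketched as a loop in which each tenant's materialization succeeds or fails on its own. This is a simplified stand-in for separate writer projects and schedules, not the platform's code: `materialize_per_tenant`, `load_tenant`, the tenant names, and `SOURCE_ROWS` are hypothetical.

```python
def materialize_per_tenant(tenants, materialize):
    """Run one materialization per tenant; a failure stays in its partition."""
    results = {}
    for tenant in tenants:
        try:
            results[tenant] = ("ok", materialize(tenant))
        except Exception as exc:  # recorded for remediation, not re-raised
            results[tenant] = ("failed", str(exc))
    return results


# Made-up source state: tenant_b's export is unavailable.
SOURCE_ROWS = {
    "tenant_a": [{"id": "1"}],
    "tenant_b": None,
    "tenant_c": [{"id": "7"}, {"id": "8"}],
}


def load_tenant(tenant):
    """Return the number of rows loaded, or raise if the export is missing."""
    rows = SOURCE_ROWS[tenant]
    if rows is None:
        raise ValueError("source export unavailable")
    return len(rows)


results = materialize_per_tenant(SOURCE_ROWS, load_tenant)

# A targeted backfill only needs the partitions that failed.
needs_backfill = [t for t, (state, _) in results.items() if state == "failed"]
```

Only `tenant_b` ends up in `needs_backfill`; the other two tenants publish as usual, which is the behavior the per-tenant partitioning is meant to guarantee.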

9. Data Quality Embedded in the Asset Graph

Data validation moved into the asset graph itself. Instead of post-hoc checks, validations became explicit dependencies. If a validation asset failed, downstream assets simply did not materialize.

An example pattern used across writers:

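from dagster import asset

@asset
def validate_staging_tables(staging_tables):
    # A failed assertion fails this materialization, so downstream assets do not run.
    assert staging_tables.count_missing_ids() == 0, "staging rows with missing IDs"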

The check itself is intentionally simple. The key point is not the assertion but that failure is structural. The system records that an expected artifact does not exist, rather than silently publishing bad data. This shifted failure detection earlier and reduced the blast radius of errors.

What this changes in practice

Bad data is detected at the moment the contract is violated, not at the moment a user notices it
Publishing surfaces only see assets that have passed their dependencies
Every quality check is itself an asset with lineage, so its history is auditable
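
The gating behavior described above can be shown with a toy graph runner in plain Python. It is not Dagster's execution model, only a sketch of the same idea under stated assumptions: `run_graph`, the step names other than `validate_staging_tables`, and the sample rows are hypothetical.

```python
def run_graph(steps, deps):
    """Run steps in insertion order; skip any step whose upstream did not succeed."""
    status = {}
    for name, fn in steps.items():
        blocked = [d for d in deps.get(name, []) if status.get(d) != "materialized"]
        if blocked:
            status[name] = "skipped"
            continue
        try:
            fn()
            status[name] = "materialized"
        except AssertionError:
            status[name] = "failed"
    return status


rows = [{"id": "svc-1"}, {"id": None}]  # one row violates the ID rule
published = []


def staging_tables():
    pass  # stands in for loading the staging tables


def validate_staging_tables():
    assert all(row["id"] is not None for row in rows), "rows with missing IDs"


def publish_search_index():
    published.extend(rows)


status = run_graph(
    {
        "staging_tables": staging_tables,
        "validate_staging_tables": validate_staging_tables,
        "publish_search_index": publish_search_index,
    },
    deps={
        "validate_staging_tables": ["staging_tables"],
        "publish_search_index": ["validate_staging_tables"],
    },
)
```

Because the validation step fails, the publish step is recorded as skipped and `published` stays empty. The absence of the downstream artifact is itself the signal, rather than a separate alert that someone has to notice.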

10. Operational Outcomes

Day-to-day operations changed in several concrete ways once the asset graph was in place.

On-call work shifted from rerunning pipelines to inspecting asset lineage
Partial backfills became routine rather than exceptional, scoped to a single tenant's partitions
Publishing delays were easier to attribute to specific upstream causes because the graph shows what depended on what
New tenants could be added without increasing shared operational risk
Auditability improved because every artifact carries its provenance and every update has a recorded reason

None of this eliminated operational effort. It made that effort more focused and less reactive.

11. Open Trade-offs and Unresolved Questions

Some challenges remain unresolved and shape the next iteration of the platform.

Where work continues

Cross-tenant schema evolution still requires coordination and discipline
Observability across Snowflake, DBT, Dagster, and downstream stores is fragmented
Cost attribution at the asset level is still coarse-grained
Human review remains necessary for certain semantic validations that cannot be expressed as assertions

The next planned investments are unified observability across the stack and more formal schema versioning across tenants.

12. Why These Lessons Matter Beyond This Platform

The lessons from Connect211 are not unique to social services data. Any platform operating in civic tech, govtech, environmental data, or public health shares similar constraints: multiple data producers, uneven quality, and real-world consequences for failure.

The core takeaway is not “use asset-based orchestration.” The takeaway is to treat data artifacts as obligations. Once that shift happens, many architectural decisions become clearer.

Frequently Asked Questions

What is a 211 social services data platform?

A 211 social services data platform is the technical layer behind 211 helplines and search portals that connect people in need with local resources. It ingests resource data from independent organizations across many regions and exposes a unified, searchable interface to the public and to partner agencies.

What is asset-based orchestration?

Asset-based orchestration is a model where the orchestration layer reasons about data state rather than task execution. The system declares what data artifacts must exist, what they depend on, and how fresh they must be, then ensures those obligations are met as upstream conditions change.

Why use Dagster for social services data?

Dagster's asset-based model fits social services data because the platform's value depends on durable, auditable artifacts rather than transient pipeline outputs. Lineage, freshness, and partial recomputation are first-class concerns, which matches the operational reality of multi-tenant civic data.

How does DBT fit with Dagster in this stack?

DBT is used explicitly for set-based transformations, not orchestration. Dagster loads DBT models as assets, which gives every transformed table lineage, freshness policies, and explicit dependencies without coupling orchestration to SQL execution.

Why partition by tenant rather than by time?

Data issues in multi-tenant civic platforms almost always affect a single organization or region. Partitioning by tenant and state means one tenant's failure cannot block another's publishing, and backfills stay scoped to the affected partitions.

Can this pattern work for other civic data platforms?

Yes. The same pattern applies wherever multiple independent producers publish into a shared, public-facing surface with real consequences for staleness or inaccuracy: govtech, environmental monitoring, public health surveillance, and similar domains.

Figure: Architecture diagram of the Connect211 multi-tenant data platform. Source systems (iCarol, WellSky, RTM) are on the left; per-tenant Dagster writer projects feed Snowflake source and normalized layers in the middle; reader processes publish curated artifacts to OpenSearch, MongoDB, and Postgres on the right.

Multi-Tenant Architecture

Per-tenant writer projects, shared normalized layer, and reader processes that publish to OpenSearch, MongoDB, and Postgres without cross-tenant coupling.

Asset-Based Orchestration

Dagster coordinates data obligations rather than tasks, so freshness, lineage, and partial recomputation are explicit properties of the system.

Lineage and Auditability

Every normalized artifact carries provenance. Every update has a reason. Stakeholders can trace any change back to its source.

Failure Isolation

Tenant and state partitioning keeps one organization's data issue from blocking another's publishing.

Data Quality in the Graph

Validation assets are explicit dependencies. Bad data cannot reach the publishing surface silently.

Predictable Onboarding

New tenants are added as separate writer projects, without increasing shared operational risk.

Contractual Normalization

HSDS-style normalization as an architectural contract, enforced through a strict source to ELT to STAGE layering in Snowflake.

Heterogeneous Sources

iCarol, WellSky, RTM, and similar systems are handled as distinct connector adapters, not as a uniform pipeline.

Production Stack

Dagster, DBT, Snowflake, OpenSearch, MongoDB, and Postgres form the production backbone of the platform.
