Every data pipeline starts with a choice: how do we connect the components? The topology, meaning the shape of connections between sources, processors, and sinks, determines how data flows, how failures propagate, and how hard the system is to change. Three patterns dominate: star, mesh, and event-driven (sometimes called event pipeline). Each works well in certain conditions and fails in others. This guide compares them as working systems rather than as diagrams. We'll walk through where each pattern fits, where it breaks, and how to decide without over-engineering.
1. Why Topology Matters More Than You Think
Pipeline topology isn't just an architectural detail; it shapes how teams collaborate, how fast they can add new sources, and how much downtime they accept. In a star topology, a central orchestrator manages all data movement. Think of a single scheduler that pulls from databases, runs transformations, and loads into a warehouse. This pattern is intuitive and easy to debug — everything runs through one choke point. But that choke point is also a single point of failure and a bottleneck for scaling.
Mesh topology flips the model: components exchange data without a central coordinator, usually through a shared data layer such as an object store or a message bus. There's no central brain. Teams can add new pipelines independently, which speeds up development in large organizations. However, without careful governance, mesh can devolve into a tangled web of point-to-point dependencies that is very hard to monitor.
Event pipeline topology (event-driven) treats data as a stream of events. Components publish and subscribe to topics, and each piece reacts when relevant data arrives. This is the natural fit for real-time use cases — fraud detection, IoT, live dashboards. But it introduces complexity around ordering, state management, and exactly-once semantics.
We've seen teams adopt a pattern because it's popular ("everyone uses Kafka") only to discover it doesn't match their workload. The cost of a wrong topology isn't just rework; it's months of delayed insights and brittle systems. This guide gives you a decision framework based on data volume, latency requirements, team size, and tolerance for operational complexity.
Who should read this? Data engineers evaluating a new pipeline, architects designing a data platform from scratch, and tech leads who want to communicate trade-offs to stakeholders. We assume you know the basics of ETL, streaming, and message queues. What we add is a structured comparison with concrete scenarios — no invented statistics, just patterns we've observed across many real-world projects.
2. Foundations: What Each Topology Actually Does
Before comparing, we need a clear definition of each topology in terms of data flow, not just network diagrams. Let's strip away the buzzwords and look at the core mechanism.
Star Topology: Central Orchestrator
In a star, one coordinator (often an orchestrator like Airflow, Prefect, or a custom scheduler) controls the sequence of tasks. It knows every source, transformation, and sink. The orchestrator polls or receives triggers, then dispatches work to workers. The data itself usually moves between workers and storage; what passes through the orchestrator is task state, metadata, and pointers to the data. This is the classic batch ETL pattern. It's easy to reason about: you can see the entire DAG, retry failed steps, and enforce dependencies. The downside: the orchestrator becomes a scheduling bottleneck and a single point of failure. If it goes down, nothing runs. Scaling means scaling the orchestrator, and its capacity does not always grow linearly with the resources you add.
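To make the central-coordinator idea concrete, here is a minimal sketch in Python. It is not how Airflow or Prefect work internally; `run_star`, its parameters, and the log format are hypothetical names chosen for illustration. It assumes tasks are plain callables and that a failed task should stop the whole run once its retries are used up.

```python
from graphlib import TopologicalSorter


def run_star(tasks, deps, max_retries=2):
    """Run tasks in dependency order from one central coordinator.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> set of upstream task names
    Returns an audit log of (task, attempt, status) tuples.
    """
    # Every task is a node; tasks without an entry in deps have no upstreams.
    graph = {name: deps.get(name, set()) for name in tasks}
    log = []
    # static_order() yields each task only after all of its upstream tasks.
    for name in TopologicalSorter(graph).static_order():
        task = tasks[name]
        for attempt in range(1, max_retries + 2):
            try:
                task()
            except Exception:
                log.append((name, attempt, "failed"))
            else:
                log.append((name, attempt, "ok"))
                break
        else:
            # Retries exhausted: the coordinator halts everything downstream.
            raise RuntimeError(f"task {name!r} failed after {max_retries + 1} attempts")
    return log
```

Because every attempt goes through the same loop, the log doubles as an audit trail. The same property explains the weakness: if this one process stops, no task runs.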
Mesh Topology: Decentralized Peer-to-Peer
Mesh topology connects components directly, often via a shared data layer like S3, GCS, or a distributed filesystem. Each data producer writes to a known location; each consumer reads from where it needs. There's no central scheduler. Teams own their pipelines end-to-end. This pattern scales well because there's no single bottleneck. But it requires strong conventions: naming, schema evolution, and data quality checks must be standardized across teams. Without that, mesh becomes a nightmare of broken dependencies and silent data loss. It's common in organizations that use a data lakehouse with medallion architecture (bronze/silver/gold).
Event Pipeline Topology: Publish-Subscribe
Event pipelines use a message broker (Kafka, Pulsar, Kinesis) as the backbone. Producers publish events to topics; consumers subscribe and process in real-time or near-real-time. The broker decouples producers from consumers, allowing each to scale independently. This is the go-to for streaming data, but it also works for batch if you treat events as records. The challenge is state: event pipelines are inherently stateless between messages, so aggregations, joins, and windowing require external state stores or stream processors (Flink, Kafka Streams). Ordering is guaranteed only within a partition, so you must design your key strategy carefully.
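The point about per-partition ordering is easier to see in a small simulation. The sketch below does not use any broker client, and the MD5-based hash is a stand-in chosen only because it is deterministic; real brokers use their own partitioners. `partition_for` and `route` are hypothetical helper names.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (illustrative hash, not a broker's)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def route(events, num_partitions):
    """Append each (key, payload) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


events = [("card-1", "auth"), ("card-2", "auth"), ("card-1", "capture"), ("card-1", "refund")]
partitions = route(events, num_partitions=4)
# All events for card-1 share one partition, so their relative order is preserved.
card_1_events = [p for k, p in partitions[partition_for("card-1", 4)] if k == "card-1"]
```

Events that must be processed in order (here, everything for one card) need the same key. Events with different keys may land on different partitions, where no relative order is guaranteed.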
Each topology has a sweet spot. Star is best for scheduled batch jobs with clear dependencies and moderate data volume (gigabytes per run). Mesh fits large organizations with many autonomous teams and data volumes that exceed a single orchestrator's capacity. Event pipelines are essential when latency matters — sub-second to minutes — and when data arrives continuously.
3. Patterns That Usually Work
After seeing dozens of pipeline implementations, certain combinations of topology and workload consistently succeed. Here are three patterns that rarely disappoint.
Pattern 1: Star for Compliance-Heavy Batch
When you need auditable, repeatable runs — financial reporting, regulatory data submissions, payroll — star topology is your friend. The central orchestrator logs every attempt, every retry, and every parameter. You can prove that the data was processed exactly as designed. Teams often pair star with a scheduler that supports backfill and parameterized runs. The trade-off is that adding a new source means updating the orchestrator's DAG, which can be a bottleneck if many teams depend on the same scheduler. But for a single team with a clear SLA, star is simple and reliable.
Pattern 2: Mesh for Multi-Team Data Lakes
In organizations where each team owns its data products — think a marketing team producing customer segments, a finance team producing revenue aggregates — mesh topology scales naturally. Each team writes to a shared data lake with agreed-upon schemas. Consumers discover datasets through a catalog (like DataHub or Amundsen). No central team needs to approve every pipeline change. The key enabler is a strong data contract: each dataset must have documented schema, freshness SLA, and ownership. Without contracts, mesh becomes chaos. But with them, it enables parallel development and reduces coordination overhead.
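A data contract can start as something very small. The sketch below assumes rows are plain dictionaries and that the contract records only the three things named above: schema, freshness SLA, and ownership. `DataContract` and `check_contract` are hypothetical names, not part of DataHub, Amundsen, or any other catalog.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataContract:
    dataset: str
    owner: str
    schema: dict              # column name -> expected Python type
    max_staleness: timedelta  # freshness SLA


def check_contract(contract, rows, last_updated, now=None):
    """Return a list of human-readable violations (an empty list means compliant)."""
    now = now or datetime.now(timezone.utc)
    violations = []
    if now - last_updated > contract.max_staleness:
        violations.append(f"{contract.dataset}: data is older than the freshness SLA")
    for i, row in enumerate(rows):
        missing = contract.schema.keys() - row.keys()
        if missing:
            violations.append(f"row {i}: missing columns {sorted(missing)}")
        for col, expected in contract.schema.items():
            if col in row and not isinstance(row[col], expected):
                violations.append(
                    f"row {i}: {col} is {type(row[col]).__name__}, expected {expected.__name__}"
                )
    return violations
```

A producer could run a check like this before publishing, and a consumer could run it before reading, so that both sides hold each other to the same written agreement.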
Pattern 3: Event Pipeline for Real-Time Features
If your use case demands action within seconds (fraud alerts, personalization, inventory updates), an event pipeline is usually the only realistic choice. The broker decouples ingestion from processing, so you can add consumers without changing producers. This pattern works well when data arrives continuously at high volume (up to millions of events per second on large deployments) and each event is small. The operational cost is higher than star: you need to manage partitions, consumer lag, and delivery semantics such as exactly-once processing. But for real-time workloads, there's no practical substitute.
These patterns are not mutually exclusive. Many mature data platforms combine them: a star orchestrator for nightly batch, an event pipeline for streaming, and a mesh for team-owned datasets. The trick is knowing which part of your pipeline needs which shape.
4. Anti-Patterns and Why Teams Revert
Even experienced teams fall into traps. Here are three common anti-patterns and why they force painful rewrites.
Anti-Pattern 1: Star for High-Volume Streaming
Some teams try to force a star orchestrator to handle streaming data by running frequent short polls. Every poll becomes a new DAG run, each run adds scheduling work and metadata writes, and the orchestrator's database becomes the bottleneck. We've seen Airflow deployments collapse under thousands of DAG runs per minute. The fix is to use an event pipeline for streaming and reserve star for batch. But teams often start with star because it's familiar, only to rebuild later at great cost.
Anti-Pattern 2: Mesh Without Governance
Mesh topology without a data catalog and schema enforcement quickly turns into a spider web. Teams create ad-hoc tables in the data lake, with inconsistent naming and no documentation. Consumers can't find datasets, or they assume a column means something different than what the producer intended. The result: data quality erodes, and trust in the data platform collapses. Reverting to a centralized model is painful because everyone has built dependencies on undocumented sources. The lesson: mesh requires investment in metadata management upfront.
Anti-Pattern 3: Event Pipeline for Infrequent Batch
Event pipelines add complexity without any payoff for workloads that only run once a day. If you process a daily CSV file, you don't need Kafka. Yet some teams adopt event-driven architecture because it's trendy, adding a broker, stream processors, and a schema registry for a job that could run as a simple cron job. The operational overhead (monitoring consumer lag, handling schema evolution, managing partitions) outweighs any benefit. These teams often revert to star or a simple script.
Another common mistake is mixing topologies without clear boundaries. For example, using a star orchestrator to trigger event consumers creates a scheduling dependency that defeats the decoupling event pipelines are meant to provide. Keep each topology's responsibility clear: star for orchestration, event for streaming, mesh for shared datasets.
5. Maintenance, Drift, and Long-Term Costs
Choosing a topology is not a one-time decision. Over years, pipelines drift as teams add features, change data sources, and lose institutional knowledge. Each topology has different maintenance profiles.
Star: Centralized Debt
In a star system, the orchestrator's DAGs accumulate technical debt. Jobs are added with hardcoded paths, inconsistent retry logic, and undocumented parameters. Over time, the DAG becomes a monolith that no one fully understands. Maintenance cost tends to grow faster than the number of tasks, because each new job adds dependencies on existing ones. The fix is to enforce code review and modular DAG design, but many teams skip this under deadline pressure.
Mesh: Governance Drift
Mesh topologies suffer from governance drift. Initial data contracts are well-defined, but as teams turn over, new producers skip the catalog and write directly to the lake. Consumers start to trust datasets less, so they build their own copies, leading to data duplication and inconsistency. The long-term cost is a loss of a single source of truth. Mitigation requires automated schema validation and periodic audits.
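A periodic audit of the kind described above can be as simple as comparing two lists. This sketch assumes you can already export the dataset paths registered in the catalog and the dataset paths that actually exist in the lake; how you obtain those lists depends on your tooling. `audit_catalog` and the example paths are hypothetical.

```python
def audit_catalog(catalog_paths, lake_paths):
    """Compare registered datasets against what actually exists in the lake."""
    registered = set(catalog_paths)
    found = set(lake_paths)
    return {
        # Written to the lake but never registered: producers skipped the catalog.
        "unregistered": sorted(found - registered),
        # Registered but absent from the lake: stale catalog entries.
        "missing": sorted(registered - found),
    }


report = audit_catalog(
    catalog_paths=["gold/revenue_daily", "gold/customer_segments"],
    lake_paths=["gold/revenue_daily", "gold/customer_segments_v2_tmp"],
)
```

Running a comparison like this on a schedule turns governance drift into a visible, shrinking list instead of a surprise.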
Event Pipeline: Configuration Sprawl
Event pipelines generate configuration sprawl: topics, partitions, consumer groups, stream processors, state stores. Each configuration parameter is a potential misconfiguration. Over time, unused topics accumulate, consumer groups drift, and schema versions multiply. The operational burden of keeping the broker healthy (rebalancing, monitoring lag, upgrading) is significant. Teams need dedicated platform engineers for event pipelines, which is a cost often underestimated.
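Consumer lag, mentioned above, is just arithmetic once you have the offsets. The sketch below assumes you have already fetched, per partition, the latest offset in the log and the offset the consumer group has committed; fetching them depends on your broker and client. The function names are hypothetical, and treating a partition with no committed offset as starting from 0 is a simplifying assumption.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag: latest log offset minus the group's committed offset."""
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


def lagging_partitions(end_offsets, committed_offsets, threshold):
    """Partitions whose lag exceeds the alert threshold, in sorted order."""
    lag = consumer_lag(end_offsets, committed_offsets)
    return sorted(p for p, behind in lag.items() if behind > threshold)


end_offsets = {0: 100, 1: 250, 2: 40}
committed = {0: 100, 1: 200}  # partition 2 has no committed offset yet
lag = consumer_lag(end_offsets, committed)
alerts = lagging_partitions(end_offsets, committed, threshold=45)
```

Alerting on a per-partition threshold rather than a total helps catch a single stuck partition that an aggregate number would hide.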
All topologies require investment in monitoring, alerting, and documentation. But the type of investment differs: star needs DAG refactoring, mesh needs catalog hygiene, event needs broker ops. Budget for these ongoing costs when choosing a topology.
6. When Not to Use Each Topology
Knowing when to avoid a topology is as important as knowing when to use it. Here are clear red lines for each.
Avoid Star When...
- Task volume exceeds what a single orchestrator can schedule (as a rough guide, more than 10,000 tasks per hour).
- You need sub-minute latency. Star adds scheduling overhead to every run, which makes real-time processing impractical.
- Multiple autonomous teams need to add pipelines without central approval. Star creates a bottleneck.
Avoid Mesh When...
- Your organization lacks a data catalog or schema enforcement tool. Mesh without governance fails.
- Data consumers need strict consistency guarantees (e.g., exactly-once, transactional reads). Mesh's eventual consistency model can cause issues.
- Your organization has only a few teams (fewer than three). The overhead of contracts and discovery outweighs the benefits.
Avoid Event Pipeline When...
- Your data arrives in daily batches and latency doesn't matter. The complexity is not justified.
- You don't have operational expertise for message brokers. A self-managed Kafka cluster in particular needs dedicated operations support.
- Your processing logic requires complex stateful joins across multiple streams. While possible, it's hard to get right.
These are not absolute rules, but they serve as warning signs. If you find yourself in one of these situations, consider a hybrid approach or a different primary topology.
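The warning signs above can be written down as a checklist function, which is handy when comparing several candidate workloads. This is a sketch of the lists in this section, not a scoring model: the `Workload` fields and `red_flags` are hypothetical names, and the 10,000 tasks-per-hour threshold is the illustrative figure from the list, not a measured limit.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    tasks_per_hour: int
    needs_sub_minute_latency: bool
    teams_need_independent_deploys: bool
    team_count: int
    has_catalog_and_schema_enforcement: bool
    needs_strict_consistency: bool
    arrives_in_daily_batches: bool
    has_broker_ops_expertise: bool
    needs_cross_stream_joins: bool


def red_flags(w: Workload) -> dict:
    """Return the warning signs from this section that apply, grouped by topology."""
    flags = {"star": [], "mesh": [], "event": []}
    if w.tasks_per_hour > 10_000:  # illustrative threshold
        flags["star"].append("task volume beyond a single orchestrator")
    if w.needs_sub_minute_latency:
        flags["star"].append("sub-minute latency required")
    if w.teams_need_independent_deploys:
        flags["star"].append("central approval would be a bottleneck")
    if not w.has_catalog_and_schema_enforcement:
        flags["mesh"].append("no catalog or schema enforcement")
    if w.needs_strict_consistency:
        flags["mesh"].append("strict consistency required")
    if w.team_count < 3:
        flags["mesh"].append("too few teams to justify contracts")
    if w.arrives_in_daily_batches and not w.needs_sub_minute_latency:
        flags["event"].append("daily batches with no latency pressure")
    if not w.has_broker_ops_expertise:
        flags["event"].append("no broker operations expertise")
    if w.needs_cross_stream_joins:
        flags["event"].append("complex stateful joins across streams")
    return flags
```

A topology with an empty list is not automatically the right choice; it only means none of the warning signs in this section apply to it.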
7. Open Questions and FAQ
Even after comparing topologies, several questions remain open. Here we address common points of confusion.
Can I combine star and event pipeline?
Yes, and many do. For example, use a star orchestrator to run nightly batch jobs that produce aggregate tables, while an event pipeline handles real-time ingestion. The key is to keep the boundaries clear: the star should not orchestrate event consumers, and the event pipeline should not depend on the star's schedule. Use separate infrastructure where possible.
What about hybrid mesh-event?
This is common in data mesh implementations where each domain publishes events to a broker, and consumers subscribe. The broker acts as the shared data layer, but each team owns its topics. This works well if the broker is managed centrally and teams follow schema conventions. The risk is that the broker becomes a bottleneck or a single point of failure.
How do I choose between Kafka and Pulsar for event pipelines?
Both are viable. Kafka has a larger ecosystem and more tooling, but Pulsar offers better multi-tenancy and geo-replication out of the box. The choice often comes down to team expertise. If your team knows Kafka, stick with it. If you're starting fresh and expect many tenants, consider Pulsar.
Is star topology obsolete?
Not at all. For many batch workloads, star is the simplest and most reliable option. It's not obsolete; it's specialized. The mistake is using it for everything. Keep star for what it's good at: scheduled, auditable, moderate-volume batch.
How do I migrate from one topology to another?
Migration is risky. The safest approach is the strangler fig pattern: run both topologies in parallel and move workloads across gradually. Start with low-risk, non-critical pipelines. Invest in testing and rollback plans. Expect the migration to take months, not weeks.
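One way to run both topologies in parallel is a shadow run with a per-pipeline cut-over list. The sketch below assumes each pipeline can be expressed as a function from input to output so the two results can be compared; real pipelines usually need a looser comparison (row counts, checksums). `run_with_shadow` and its parameters are hypothetical names.

```python
def run_with_shadow(pipeline, payload, legacy, candidate, migrated):
    """Run old and new implementations side by side and pick which result to serve.

    pipeline:  name of the pipeline being executed
    legacy:    callable implementing the current topology
    candidate: callable implementing the new topology
    migrated:  set of pipeline names that have been cut over
    Returns (served_result, outputs_match).
    """
    legacy_out = legacy(payload)
    candidate_out = candidate(payload)
    outputs_match = legacy_out == candidate_out
    # Serve the new result only for cut-over pipelines whose outputs agree;
    # otherwise fall back to the legacy result, which is the rollback path.
    if pipeline in migrated and outputs_match:
        return candidate_out, outputs_match
    return legacy_out, outputs_match
```

Adding a pipeline name to `migrated` is the cut-over, and removing it is the rollback. Falling back to the legacy result on a mismatch is a design choice made here for safety, at the price of running both systems for longer.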
8. Summary and Next Experiments
Choosing a pipeline topology is a strategic decision that affects how your team works, how fast you can deliver insights, and how much you spend on operations. Star, mesh, and event pipelines each have a place. The key is to match the topology to your data characteristics, latency needs, and team structure.
Here are concrete next steps to apply what you've learned:
- Audit your current pipelines. Map each pipeline to a topology. Identify mismatches — for example, a star pipeline that tries to handle streaming data, or a mesh pipeline without a catalog.
- Define your primary workload profile. Is it batch, streaming, or mixed? What is the average data volume per run? What latency do stakeholders expect? Write these down.
- Run a small experiment. If you're considering a new topology, pick a low-risk pipeline and rebuild it in the new pattern. Measure development time, operational burden, and stakeholder satisfaction.
- Invest in governance early. Whether you choose mesh or event, invest in schema registry, data catalog, and monitoring from day one. Retrofitting governance is much harder.
- Plan for evolution. No topology is permanent. Revisit your choice every 6–12 months as data volume and team structure change. Build in flexibility to migrate.
Remember, the goal is not to pick the perfect topology forever, but to pick one that works now and can evolve. Start with the simplest topology that meets your requirements, and add complexity only when the data proves it necessary. That's the pragmatic path through the process galaxies.