Real-Time Application Development: WebSockets, SSE, and Building Live Experiences

Dragan Gavrić, Co-Founder & CTO · 10 min read

Users no longer tolerate stale data. They expect dashboards that update without refreshing, messages that arrive instantly, and collaborative tools that show changes as they happen. “Real-time” has gone from a feature differentiator to a baseline expectation.

But building real-time systems is fundamentally different from building request-response applications. The architecture changes. The scaling challenges multiply. The failure modes are more complex. This guide covers the technologies, patterns, and trade-offs involved in building real-time applications that actually work at scale.

What Qualifies as “Real-Time”?

The term “real-time” gets thrown around loosely. In practice, it describes a spectrum:

  • Hard real-time (sub-millisecond). Embedded systems, trading engines, industrial control. Not typically web application territory.
  • Soft real-time (milliseconds to low seconds). Chat, live collaboration, gaming, financial dashboards. This is where most web real-time applications live.
  • Near real-time (seconds to minutes). Notification systems, analytics dashboards, feed updates. Tolerable latency is higher, but users still expect updates without manual refresh.

The technology you choose depends on where your use case falls on this spectrum, how many concurrent connections you need to support, and the directionality of data flow.

Technology Comparison

Four primary technologies power real-time web applications. Each has distinct strengths and trade-offs.

WebSockets

WebSockets provide full-duplex, persistent connections between client and server over a single TCP connection. After an initial HTTP handshake, the connection upgrades to the WebSocket protocol, enabling bidirectional data flow with minimal overhead.
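A minimal client-side sketch of that flow, assuming a hypothetical wss://example.com/live endpoint and a JSON message format:

```ts
// Browser side: open a connection and exchange JSON messages.
const socket = new WebSocket("wss://example.com/live"); // placeholder URL

socket.addEventListener("open", () => {
  // Once the upgrade completes, the client can send at any time.
  socket.send(JSON.stringify({ type: "subscribe", channel: "scores" }));
});

socket.addEventListener("message", (event) => {
  // The server can also push at any time: full duplex.
  console.log("server pushed:", JSON.parse(event.data));
});

socket.addEventListener("close", () => {
  // No automatic reconnection: handling this is your job (see the
  // Connection Recovery section below).
});
```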

Strengths:

  • True bidirectional communication. Client and server can send data independently at any time.
  • Low latency. No HTTP overhead per message after the initial handshake.
  • Wide browser support. Every modern browser supports WebSockets.
  • Binary and text data. Supports both formats natively.

Weaknesses:

  • Stateful connections. Each client maintains a persistent connection, consuming server resources. This complicates horizontal scaling.
  • No automatic reconnection. You need to implement reconnection logic, buffering, and state synchronization on the client.
  • Proxy and firewall issues. Some corporate networks and older proxies don’t handle WebSocket connections well.
  • No built-in message ordering or delivery guarantees. You implement these yourself or use a library that does.

Best for: Chat, multiplayer gaming, collaborative editing, any scenario requiring high-frequency bidirectional messaging.

Server-Sent Events (SSE)

SSE is a simpler, HTTP-based protocol for server-to-client streaming. The server pushes data to the client over a standard HTTP connection using the text/event-stream content type.
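A minimal sketch of both ends, assuming a Node.js server and a hypothetical /stream endpoint:

```ts
// Server (Node.js): stream events over a plain HTTP response.
import { createServer } from "node:http";

createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
  let id = 0;
  const timer = setInterval(() => {
    // `id:` lets the browser resume via Last-Event-ID after a reconnect.
    res.write(`id: ${++id}\ndata: ${JSON.stringify({ now: Date.now() })}\n\n`);
  }, 1000);
  req.on("close", () => clearInterval(timer));
}).listen(3000);
```

On the client, the browser's EventSource API does the rest, including automatic reconnection:

```ts
const source = new EventSource("/stream");
source.onmessage = (e) => console.log(JSON.parse(e.data));
```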

Strengths:

  • Simple to implement. Built on standard HTTP — works with existing infrastructure, load balancers, and proxies without special configuration.
  • Automatic reconnection. The browser’s EventSource API handles reconnection natively, including Last-Event-ID tracking.
  • HTTP/2 multiplexing. SSE over HTTP/2 avoids the per-domain connection limit that plagued HTTP/1.1.
  • Text-based. Lightweight, easy to debug.

Weaknesses:

  • Unidirectional. Server to client only. Client-to-server communication still requires standard HTTP requests.
  • Text only. Binary data must be base64 encoded, adding overhead.
  • Connection limits on HTTP/1.1. Browsers limit the number of concurrent SSE connections to 6 per domain (resolved with HTTP/2).

Best for: Live dashboards, notification feeds, stock tickers, progress updates — any scenario where the server pushes data and the client mostly listens.

Long Polling

Long polling is the oldest approach: the client sends a request, the server holds it open until new data is available and then responds, and the client immediately sends a new request.
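The client-side loop looks roughly like this; the /poll endpoint and the 204-on-timeout convention are assumptions for illustration:

```ts
async function longPoll(url: string, onData: (data: unknown) => void) {
  while (true) {
    try {
      const res = await fetch(url); // server holds this open until data exists
      if (res.status === 200) onData(await res.json());
      // On 204 (timeout, no data) just loop and reissue immediately.
    } catch {
      // Network error: back off briefly before retrying.
      await new Promise((r) => setTimeout(r, 2000));
    }
  }
}

longPoll("/poll", (data) => console.log("update:", data));
```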

Strengths:

  • Works everywhere. No special protocol support needed. Functions through any proxy, firewall, or load balancer.
  • Simple fallback. Useful when WebSockets and SSE aren’t available.

Weaknesses:

  • Higher latency. Each respond-and-reconnect cycle adds a round trip before the next event can be delivered.
  • Higher server resource usage. Each held connection consumes resources, and the per-message overhead is greater than with WebSockets or SSE.
  • Not truly real-time. There’s an inherent delay between an event occurring and the client receiving it.

Best for: Fallback mechanism when other technologies aren’t available. Rarely the primary choice for new applications.

WebRTC

WebRTC enables peer-to-peer communication between browsers, primarily designed for audio, video, and arbitrary data channels.
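In production the offer/answer exchange travels through a signaling server, but the handshake itself can be sketched with two peers in the same page, which sidesteps signaling entirely (illustration only):

```ts
const local = new RTCPeerConnection();
const remote = new RTCPeerConnection();

// In a real app these candidates travel via your signaling server.
local.onicecandidate = (e) => e.candidate && remote.addIceCandidate(e.candidate);
remote.onicecandidate = (e) => e.candidate && local.addIceCandidate(e.candidate);

const channel = local.createDataChannel("game-state");
channel.onopen = () => channel.send("hello");
remote.ondatachannel = (e) => {
  e.channel.onmessage = (msg) => console.log("peer received:", msg.data);
};

(async () => {
  // Offer/answer exchange, done in-process here.
  await local.setLocalDescription(await local.createOffer());
  await remote.setRemoteDescription(local.localDescription!);
  await remote.setLocalDescription(await remote.createAnswer());
  await local.setRemoteDescription(remote.localDescription!);
})();
```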

Strengths:

  • Peer-to-peer. Data flows directly between clients without traversing the server (after signaling).
  • Ultra-low latency. Sub-100ms for audio/video.
  • Built-in encryption. DTLS/SRTP encryption is mandatory.

Weaknesses:

  • Complex setup. Requires signaling servers, STUN/TURN infrastructure, and NAT traversal.
  • Not suitable for server-originated data. The peer-to-peer model doesn’t fit most server-push scenarios.
  • Scaling challenges. True P2P doesn’t scale linearly; large groups need SFU (Selective Forwarding Unit) servers.

Best for: Video calling, voice chat, screen sharing, peer-to-peer file transfer, low-latency gaming.

Quick Decision Matrix

  • Bidirectional, high-frequency messaging → WebSockets
  • Server-push only, moderate frequency → SSE
  • Audio/video, peer-to-peer data → WebRTC
  • Maximum compatibility, simple needs → Long Polling
  • Binary data streaming → WebSockets or WebRTC
  • Works behind restrictive corporate proxies → SSE or Long Polling

Architecture Patterns for Real-Time Systems

The choice of technology is only the first decision. How you architect the system around that technology determines whether it scales.

Publish/Subscribe (Pub/Sub)

The most common pattern for real-time systems. Publishers emit events to named channels. Subscribers listen to channels they care about. The publisher and subscriber are decoupled — neither needs to know about the other.

This pattern maps naturally to features like:

  • Chat rooms (each room is a channel).
  • Live dashboards (each metric or data source is a channel).
  • Notification systems (per-user and global channels).

Implementation typically uses a message broker (Redis Pub/Sub, Kafka, or RabbitMQ) to distribute messages across server instances.
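A minimal sketch of that broker layer, assuming the ioredis client (the channel name and payload are illustrative):

```ts
import Redis from "ioredis";

// A Redis connection in subscribe mode can't issue regular commands,
// so use separate connections for publishing and subscribing.
const pub = new Redis();
const sub = new Redis();

await sub.subscribe("chat:room-42");
sub.on("message", (channel, message) => {
  console.log(`[${channel}]`, JSON.parse(message));
});

await pub.publish("chat:room-42", JSON.stringify({ user: "ana", text: "hello" }));
```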

Event-Driven Architecture

Rather than request-response, the system reacts to events. A user action produces an event, which triggers downstream processing, which may produce additional events. This pattern works well when real-time updates are a consequence of business operations.

For example: a user places an order. The order service emits an OrderPlaced event. The inventory service updates stock, the notification service sends a confirmation, and the dashboard service updates the live order count. All happen concurrently, driven by the event.
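A toy in-process version of this flow, using Node's EventEmitter with hypothetical service stubs (a real system would publish through a message broker instead):

```ts
import { EventEmitter } from "node:events";

type Order = { id: string; userId: string; items: string[] };

// Hypothetical downstream services, stubbed for illustration.
const inventory = { reserve: (items: string[]) => console.log("reserving", items) };
const notifications = { confirm: (userId: string) => console.log("confirming for", userId) };
const dashboard = { bump: () => console.log("live order count +1") };

const bus = new EventEmitter();

// Each service subscribes independently; the order service knows none of them.
bus.on("OrderPlaced", (o: Order) => inventory.reserve(o.items));
bus.on("OrderPlaced", (o: Order) => notifications.confirm(o.userId));
bus.on("OrderPlaced", () => dashboard.bump());

// The order service only emits the event.
bus.emit("OrderPlaced", { id: "o-1", userId: "u-7", items: ["sku-42"] });
```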

CQRS (Command Query Responsibility Segregation)

CQRS separates the write path (commands) from the read path (queries). This is particularly useful for real-time systems because:

  • The write model can be optimized for consistency and business logic.
  • The read model can be optimized for speed and denormalized for efficient querying.
  • Real-time updates flow from the write side to the read side via events, naturally fitting the pub/sub pattern.

Combined with Event Sourcing, CQRS provides a complete audit trail and the ability to replay events — useful for debugging and building new projections from historical data.
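A compact sketch of the separation, with hypothetical names and the event hop done in-process for brevity:

```ts
// Write side: validates commands and emits events.
type OrderEvent = { type: "OrderPlaced"; orderId: string; amount: number };

const eventLog: OrderEvent[] = []; // with Event Sourcing, this log is the source of truth

function placeOrder(orderId: string, amount: number) {
  if (amount <= 0) throw new Error("invalid amount"); // business rules live here
  const event: OrderEvent = { type: "OrderPlaced", orderId, amount };
  eventLog.push(event);
  project(event); // in production this hop goes through the message broker
}

// Read side: a denormalized projection optimized for the live dashboard.
const dashboard = { orders: 0, revenue: 0 };

function project(e: OrderEvent) {
  dashboard.orders += 1;
  dashboard.revenue += e.amount;
}

placeOrder("o-1", 99);
console.log(dashboard); // { orders: 1, revenue: 99 }
```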

Message Broker Selection

When you have multiple server instances, you need a message broker to distribute real-time events across all of them. The choice depends on your scale and requirements.

Redis Pub/Sub

  • Best for: Low-to-medium scale real-time applications. Simple pub/sub with no message persistence.
  • Strengths: Extremely fast (sub-millisecond latency), simple to set up, widely supported.
  • Limitations: No message persistence. If a subscriber is disconnected when a message is published, it’s lost. No consumer groups or replay capability.
  • Scale: Handles millions of messages per second on a single node. For higher scale, Redis Cluster distributes channels across nodes.

Apache Kafka

  • Best for: High-throughput event streaming with durability requirements. Systems that need message replay and consumer groups.
  • Strengths: Persistent, ordered, replayable event log. Consumer groups allow multiple services to process the same stream independently (see the sketch after this list). Handles millions of events per second.
  • Limitations: Higher operational complexity. Higher latency than Redis (milliseconds vs. sub-millisecond). Overkill for simple real-time features.
  • Scale: Near-linear horizontal scaling. Production clusters routinely handle billions of events per day.
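A minimal sketch, assuming the kafkajs client (topic, key, and group names are illustrative):

```ts
import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "realtime-gateway", brokers: ["localhost:9092"] });

// Producer: append events to a durable, ordered topic.
const producer = kafka.producer();
await producer.connect();
await producer.send({
  topic: "score-updates",
  messages: [{ key: "game-7", value: JSON.stringify({ home: 2, away: 1 }) }],
});

// Consumer group: each service keeps its own independent cursor on the stream.
const consumer = kafka.consumer({ groupId: "dashboard-service" });
await consumer.connect();
await consumer.subscribe({ topic: "score-updates", fromBeginning: false });
await consumer.run({
  eachMessage: async ({ message }) => {
    console.log(message.key?.toString(), message.value?.toString());
  },
});
```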

RabbitMQ

  • Best for: Complex routing requirements. Systems that need message queuing with delivery guarantees.
  • Strengths: Flexible routing (direct, topic, fanout, headers). Delivery acknowledgment. Dead letter queues for failed messages.
  • Limitations: Lower throughput than Kafka for streaming use cases. More complex configuration for pub/sub patterns.
  • Scale: Adequate for most applications. Clustering provides high availability but scaling throughput requires more careful architecture.

Practical Guidance

For most real-time web applications, start with Redis Pub/Sub. It’s simple, fast, and handles the vast majority of use cases. Move to Kafka when you need durability, replay, or are processing hundreds of thousands of events per second. Use RabbitMQ when you need sophisticated message routing and delivery guarantees.

Scaling WebSocket Connections

Scaling WebSocket connections is the most common challenge in real-time system development. HTTP requests are stateless — any server can handle any request. WebSocket connections are stateful — a client is connected to a specific server instance for the duration of the session.

The Sticky Session Problem

If you have four server instances behind a load balancer and a user connects to Server 2, messages intended for that user must reach Server 2. There are two approaches:

  1. Sticky sessions. The load balancer routes each client to the same server for the duration of their connection. Simple to implement (most load balancers support it) but creates uneven load distribution and complicates failover.
  2. Broadcast via message broker. Every server subscribes to the same message broker. When a message needs to reach a specific user, it’s published to the broker, and the server holding that user’s connection delivers it. This is the preferred approach for most applications, sketched below.
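A sketch of the broker approach, assuming the ws and ioredis libraries: every instance runs this code, and only the instance holding the target user's connection actually delivers.

```ts
import { WebSocketServer, WebSocket } from "ws";
import Redis from "ioredis";

const wss = new WebSocketServer({ port: 8080 });
const sub = new Redis();

// Connections held by *this* instance, keyed by user id.
const local = new Map<string, WebSocket>();

wss.on("connection", (ws, req) => {
  // Hypothetical scheme: the client identifies itself via a query parameter.
  const userId = new URL(req.url!, "http://x").searchParams.get("user")!;
  local.set(userId, ws);
  ws.on("close", () => local.delete(userId));
});

// Every instance hears every message; only the holder delivers.
await sub.subscribe("user-messages");
sub.on("message", (_channel, raw) => {
  const { userId, payload } = JSON.parse(raw);
  local.get(userId)?.send(JSON.stringify(payload));
});
```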

Connection Pooling and Limits

Each WebSocket connection consumes server memory and a file descriptor. A typical Node.js server can handle 10,000-50,000 concurrent connections (depending on message frequency and processing). Java and Go servers can often handle more.

Strategies for pushing limits:

  • Minimize per-connection state. Keep connection metadata lean.
  • Use binary protocols (MessagePack, Protocol Buffers) instead of JSON to reduce message size and parsing overhead.
  • Implement heartbeat/ping-pong to detect and clean up dead connections quickly (see the sketch after this list).
  • Horizontal scaling with a message broker for cross-instance communication.
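A common heartbeat implementation with the ws library (the 30-second interval is illustrative):

```ts
import { WebSocketServer, WebSocket } from "ws";

const wss = new WebSocketServer({ port: 8080 });

// Track liveness per connection; a pong resets the flag.
const alive = new WeakMap<WebSocket, boolean>();

wss.on("connection", (ws) => {
  alive.set(ws, true);
  ws.on("pong", () => alive.set(ws, true));
});

// Every 30s, drop connections that never answered the previous ping.
setInterval(() => {
  for (const ws of wss.clients) {
    if (!alive.get(ws)) {
      ws.terminate();
      continue;
    }
    alive.set(ws, false);
    ws.ping();
  }
}, 30_000);
```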

Connection Recovery

Connections drop. Networks are unreliable. Mobile users switch between WiFi and cellular. Your client must handle:

  • Automatic reconnection with exponential backoff and jitter to avoid thundering herd problems (sketched after this list).
  • State synchronization on reconnect. The client needs to know what it missed. This can be a simple “give me everything since timestamp X” or a more sophisticated approach using sequence numbers.
  • Offline buffering. If the client generates events while disconnected, buffer them locally and send on reconnection.
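A client-side reconnection sketch; the endpoint, backoff cap, and sync message are illustrative:

```ts
function connect(url: string, attempt = 0) {
  const ws = new WebSocket(url);

  ws.onopen = () => {
    attempt = 0; // a healthy connection resets the backoff
    // Ask the server for anything missed; `lastSeen` is app-specific
    // (a timestamp or sequence number).
    ws.send(JSON.stringify({ type: "sync", since: lastSeen }));
  };

  ws.onclose = () => {
    // Exponential backoff capped at 30s, plus jitter so a fleet of
    // clients doesn't reconnect in lockstep after a server restart.
    const base = Math.min(30_000, 1000 * 2 ** attempt);
    const delay = base / 2 + Math.random() * (base / 2);
    setTimeout(() => connect(url, attempt + 1), delay);
  };
}

let lastSeen = 0;
connect("wss://example.com/live");
```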

Conflict Resolution in Collaborative Systems

Real-time collaborative editing — where multiple users modify the same document simultaneously — is one of the hardest problems in distributed systems. Two approaches dominate.

Operational Transformation (OT)

OT, popularized by Google Docs, represents each user’s edit as an operation (insert, delete). When concurrent operations conflict, they’re transformed so that applying them in any order produces the same result.

OT works well but is complex to implement correctly. The transformation functions must handle every possible combination of concurrent operations, and the complexity grows with the number of operation types.

Conflict-free Replicated Data Types (CRDTs)

CRDTs are data structures that are mathematically guaranteed to converge when updated concurrently by multiple users. No central server is needed to resolve conflicts — the data structure itself ensures consistency.

CRDT techniques power Figma’s multiplayer system, and libraries like Yjs and Automerge make them accessible for application developers. They’re generally simpler to reason about than OT and work naturally with peer-to-peer architectures.

When to use which: For most new collaborative editing features, CRDTs (via Yjs or Automerge) are the pragmatic choice. They’re well-supported, battle-tested, and don’t require a central coordination server.
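A small demonstration of CRDT convergence with Yjs; here the update exchange happens in-process, where a real app would ship the updates over WebSockets:

```ts
import * as Y from "yjs";

// Two replicas of the same document, e.g. two users' browsers.
const docA = new Y.Doc();
const docB = new Y.Doc();

// Concurrent edits on each replica.
docA.getText("content").insert(0, "hello");
docB.getText("content").insert(0, "world");

// Exchange updates in both directions (normally over the network).
Y.applyUpdate(docB, Y.encodeStateAsUpdate(docA));
Y.applyUpdate(docA, Y.encodeStateAsUpdate(docB));

// Both replicas converge to the same string with no server arbitration.
console.log(
  docA.getText("content").toString() === docB.getText("content").toString()
); // true
```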

Real-World Use Cases

Live Dashboards and Monitoring

SSE or WebSockets push metric updates to connected dashboards. The typical architecture: a data source emits events to a message broker, and a gateway service subscribes to relevant channels and pushes updates to connected clients.

Chat and Messaging

WebSockets for message delivery. Message broker for distributing across server instances. Database for persistence and offline message delivery. Read receipts, typing indicators, and presence require additional pub/sub channels.

Real-Time Sports Scoring

Live sports scoring demands low latency and high reliability — scores must reach every connected device within seconds, and missed updates are immediately visible to users.

When we built the BELGRAND ScoreMaster mobile application — a system for controlling sports LED scoreboards via iOS and Android devices — real-time communication was the central technical challenge. Score updates entered on one device needed to propagate to the scoreboard display instantly and reliably, even in environments with inconsistent network connectivity (sports venues aren’t known for their WiFi). The solution combined persistent connections with local state buffering to ensure no score update was ever lost, even during brief connectivity gaps.

Collaborative Editing

CRDTs or OT for document state. WebSockets for change propagation. Cursor positions and selections shared via lightweight presence channels. Undo/redo must work correctly in a multi-user context (each user’s undo stack is independent).

Financial and Trading Data

Ultra-low-latency WebSocket connections. Binary protocols for efficiency. Data fan-out to thousands of concurrent users. Market data typically uses a hierarchical channel structure (exchange > instrument > data type) to allow granular subscriptions.

Mobile Considerations

Real-time on mobile introduces challenges that desktop applications don’t face.

  • Battery life. Persistent connections drain batteries. Use heartbeat intervals appropriate for the use case — a chat app needs frequent heartbeats, while a dashboard can use longer intervals.
  • Network transitions. Users switch between WiFi and cellular. Your reconnection logic must handle these transitions gracefully.
  • Background restrictions. iOS and Android aggressively suspend background connections. For background notifications, use platform push services (APNs, FCM) instead of maintaining WebSocket connections.
  • Bandwidth sensitivity. Mobile data is metered and variable. Compress messages, batch low-priority updates, and provide users control over data usage where appropriate.

Testing Real-Time Systems

Real-time systems are harder to test than request-response applications because timing, ordering, and concurrency matter.

Unit Testing

Test message handlers and business logic in isolation. Mock the transport layer. Verify that events produce the correct state changes.

Integration Testing

Test the full pub/sub pipeline: publish an event, verify it reaches subscribers across multiple server instances. Test reconnection scenarios. Test message ordering under load.

Load Testing

Tools like Artillery, k6, and Gatling support WebSocket load testing; a minimal k6 scenario follows the list below. Key metrics to measure:

  • Connection establishment time under load.
  • Message delivery latency at various concurrency levels (p50, p95, p99).
  • Memory and CPU usage per connection.
  • Message loss rate during peak load.
  • Behavior during server restarts (connection migration, message buffering).
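A minimal k6 scenario for the connection side (k6 scripts are JavaScript; the endpoint and virtual-user numbers are illustrative):

```ts
import ws from "k6/ws";
import { check } from "k6";

// 1,000 virtual users holding connections for 2 minutes.
export const options = { vus: 1000, duration: "2m" };

export default function () {
  const res = ws.connect("wss://example.com/live", {}, (socket) => {
    socket.on("open", () => socket.send(JSON.stringify({ type: "subscribe" })));
    socket.on("message", () => {
      // Count or time messages here to derive delivery-latency percentiles.
    });
    socket.setTimeout(() => socket.close(), 60000); // hold each connection 60s
  });
  check(res, { "upgraded to WebSocket": (r) => r && r.status === 101 });
}
```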

Chaos Testing

Introduce failures deliberately: kill server instances, introduce network latency, saturate the message broker. Verify that the system degrades gracefully rather than failing catastrophically.

Cost and Infrastructure Considerations

Real-time infrastructure costs differently than standard web hosting.

Connection-Based Costs

WebSocket connections consume memory even when idle. A server handling 50,000 idle connections uses significantly more resources than one handling 50,000 HTTP requests per minute. Budget for concurrent connections, not just request throughput.

Managed vs. Self-Hosted

Managed real-time services (Pusher, Ably, PubNub) charge per connection and per message. At low scale, they’re cost-effective and eliminate operational overhead. At high scale (millions of connections), self-hosted solutions using open-source tools (Socket.IO, ws, or framework-native solutions) are dramatically cheaper.

The crossover point depends on your connection patterns, but for most applications, managed services become expensive above 10,000-50,000 concurrent connections.

Infrastructure Checklist

  • Load balancer that supports WebSocket connections (ALB, Nginx, HAProxy).
  • Message broker (Redis at minimum, Kafka for high-throughput).
  • Monitoring for connection counts, message latency, and broker health.
  • Auto-scaling based on connection count, not just CPU/memory.
  • Geographic distribution if users are global (multiple regions with broker replication).

Getting Started

If you’re adding real-time features to an existing application, start small:

  1. Identify the highest-value real-time feature. Usually it’s live notifications, dashboard updates, or chat.
  2. Choose the simplest technology that fits. SSE for server-push only. WebSockets for bidirectional.
  3. Start with Redis Pub/Sub for cross-instance communication.
  4. Implement robust reconnection on the client from day one. Network interruptions are not edge cases — they’re the normal state of mobile networks.
  5. Monitor connection counts and latency in production from the first deployment.

Real-time features dramatically improve user experience when implemented well. The key is matching the technology to the use case, planning for scale at the architecture level, and testing under realistic failure conditions. Users won’t notice the infrastructure decisions you made — they’ll just know the software feels alive.


Dragan Gavrić

Co-Founder & CTO

Co-founder of Notix with deep expertise in software architecture, AI development, and building scalable enterprise solutions.