FedSkipTwin: Digital-Twin-Guided Client Skipping for Communication-Efficient FL
The 20-Second Summary
Communication is often the bottleneck in federated learning (FL), especially for mobile and IoT clients. FedSkipTwin reduces unnecessary transmissions by giving the server a lightweight “digital twin” of each client that forecasts the magnitude and uncertainty of the next update; if both are predicted to be low, the client is skipped for that round. In the reported experiments, FedSkipTwin reduces total communication by 12–15.5% over 20 rounds while slightly improving final accuracy versus FedAvg.
The Problem
Classic FL assumes that once a client is selected for a round, it must communicate—regardless of whether its update is meaningful or redundant. But late in training, or when a client’s local data is poorly aligned with the current global model, the local update can be small and have limited marginal value. On constrained networks, those “low-impact” rounds are wasted bandwidth (and battery).
The key question is simple: do all clients need to send updates every round? FedSkipTwin’s answer is “often, no”—but only if we can skip conservatively without destabilizing convergence.
Our Approach: Server-Side Digital Twins
FedSkipTwin adds a server-side surrogate model (a digital twin) for each client. Each twin is a small LSTM that observes the client’s historical sequence of gradient/update norms and forecasts two values for the next round (a code sketch follows the list):
- predicted update magnitude, and
- epistemic uncertainty of that prediction (estimated via MC-dropout).
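A minimal sketch of such a twin in PyTorch follows. The class and function names (UpdateNormTwin, forecast), the hidden size, and the single-feature input are illustrative assumptions rather than the paper’s code; the history window ($K=5$), dropout rate (0.2), and number of MC samples ($M=20$) match the reported setup.

```python
# Illustrative sketch of a per-client "twin": a tiny LSTM over the last K
# update norms, with MC-dropout used to estimate epistemic uncertainty.
import torch
import torch.nn as nn

class UpdateNormTwin(nn.Module):
    def __init__(self, hidden_size: int = 16, dropout: float = 0.2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)   # kept active at inference for MC-dropout
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, K, 1) sequence of past update norms
        out, _ = self.lstm(history)
        return self.head(self.dropout(out[:, -1, :]))  # predicted next norm

@torch.no_grad()
def forecast(twin: UpdateNormTwin, history: torch.Tensor, m: int = 20):
    """Return (predicted magnitude, epistemic uncertainty) via MC-dropout."""
    twin.train()  # keep dropout stochastic across the M forward passes
    samples = torch.stack([twin(history) for _ in range(m)])  # (m, batch, 1)
    return samples.mean(dim=0), samples.std(dim=0)
```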
The server uses a dual-threshold rule: it requests communication if either the predicted magnitude or the uncertainty exceeds a threshold; otherwise it instructs the client to skip the round.
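The rule itself is tiny; here is a sketch with placeholder threshold names (tau_mag, tau_unc), which the paper selects by grid search rather than fixing to particular values:

```python
# Dual-threshold skip rule: request an upload if either the forecast
# magnitude or its uncertainty is high; otherwise skip this round.
# Threshold names and the function itself are illustrative.
def should_communicate(pred_magnitude: float, uncertainty: float,
                       tau_mag: float, tau_unc: float) -> bool:
    return pred_magnitude > tau_mag or uncertainty > tau_unc
```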
A key design choice is that the intelligence stays server-side: clients don’t run extra models and don’t compute extra features beyond what the server already observes.
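Putting the pieces together, one server round could look roughly like the sketch below. The client API (a local_train call returning a state dict, sample count, and update norm) and the sliding-window bookkeeping are assumptions for illustration; aggregation is plain FedAvg over the clients that actually uploaded.

```python
# Rough sketch of one server round. Uses forecast() and should_communicate()
# from the sketches above; the client API is hypothetical.
import torch

def run_round(global_model, clients, twins, histories, tau_mag, tau_unc):
    received = []
    for cid, client in clients.items():
        # Forecast the next update's magnitude and uncertainty from the twin.
        mag, unc = forecast(twins[cid], histories[cid])
        if not should_communicate(mag.item(), unc.item(), tau_mag, tau_unc):
            continue  # server instructs this client to skip: no upload this round
        # Hypothetical client API: returns updated weights, sample count, update norm.
        state, n_samples, norm = client.local_train(global_model)
        received.append((state, n_samples))
        # Slide the K-step history window with the newly observed norm.
        histories[cid] = torch.cat(
            [histories[cid][:, 1:, :], torch.full((1, 1, 1), float(norm))], dim=1)
        # (Retraining the twin on the observed norm sequence is omitted here.)
    if not received:
        return
    # Plain FedAvg over the clients that actually uploaded this round.
    total = sum(n for _, n in received)
    avg = {k: sum((n / total) * s[k].float() for s, n in received)
           for k in received[0][0]}
    global_model.load_state_dict(avg)
```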
How We Evaluated
Experiments are run on UCI-HAR and MNIST with 10 clients under a non-IID partition. The paper reports 20 communication rounds with local epochs $E=3$ and batch size 32. Data is partitioned across clients with a Dirichlet distribution ($\alpha = 0.5$), and each twin uses a short history window ($K=5$), dropout 0.2, and $M=20$ MC samples for uncertainty.
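For context, a Dirichlet label partition with $\alpha = 0.5$ can be produced with a few lines of NumPy. This is the standard recipe, not necessarily the paper’s exact partitioning code; the function name and seed are arbitrary.

```python
# Standard non-IID split: for each class, draw per-client proportions from
# Dirichlet(alpha) and hand out the class's indices accordingly.
import numpy as np

def dirichlet_partition(labels: np.ndarray, num_clients: int = 10,
                        alpha: float = 0.5, seed: int = 0):
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.where(labels == cls)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices
```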
Key Results
The paper reports that FedSkipTwin reduces communication while maintaining, and in some cases slightly improving, accuracy:
- Total communication reduction: 12–15.5% over 20 rounds.
- Accuracy impact: up to +0.5 percentage points compared to standard FedAvg.
More concretely, the per-dataset numbers are as follows (a quick back-of-envelope reading follows the list):
- On UCI-HAR, communication is reduced by 15.5% with a +0.5 pp accuracy improvement.
- On MNIST, communication is reduced by 12.0% with a marginal accuracy gain.
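To make those percentages concrete, here is a back-of-envelope reading under the assumption (mine, not stated above) that the FedAvg baseline collects one upload from every client in every round:

```python
# Assumes full participation under FedAvg: 10 clients x 20 rounds of uploads.
baseline_uploads = 10 * 20                          # 200 uploads in total
uci_har_skipped = round(baseline_uploads * 0.155)   # ~31 uploads avoided on UCI-HAR
mnist_skipped = round(baseline_uploads * 0.120)     # 24 uploads avoided on MNIST
```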
Limitations and Next Steps
FedSkipTwin relies on the predictability of per-client update significance; in settings with highly non-stationary clients (churn, abrupt data shifts), forecasting may become less reliable and the skip rule may need retuning. A natural extension is adapting thresholds online (rather than grid-searching) and exploring richer “value” signals beyond gradient norms (e.g., validation improvement proxies), while keeping client-side cost near zero.