Repo: https://github.com/AbdelStark/nostrain
I trained a GPT across 4 workers using Nostr as the communication layer. No central server. No coordinator. Workers exchange compressed pseudo-gradients as signed Nostr events through public relays. This is "nostrain": distributed ML training over Nostr.

How it works:
1. Each worker trains locally for N steps (standard PyTorch/AdamW)
2. Computes the pseudo-gradient: trained_params - initial_params
3. Top-k sparsifies → int8 quantizes → zlib compresses
4. Signs it with BIP340 Schnorr and publishes it as a Nostr event (kind 33333)
5. Collects peer gradients from the relay
6. Averages them and applies a Nesterov outer step

This is DiLoCo, the same algorithm Google used to train LLMs across poorly connected datacenters, except the "datacenter interconnect" here is a set of censorship-resistant Nostr WebSocket relays.

The repo ships a char-level GPT demo (834K params, 4 layers, 4 heads). Each of the 4 workers gets a different slice of Shakespeare (inspired by Karpathy's nanoGPT); no worker sees the full text.

Text evolution over 5 rounds:
Round 0: ROMEO:2NYPp@JTb<;2..qce[vP
Round 2: ROMEO: Maf sthat pine ted mes I chat
Round 5: ROMEO: d I shes mear to the ce withat, tre so ther wisho

Loss: 4.5 → 2.4. Workers that saw completely different text are converging to the same model through the relay.

Why Nostr? Because open-source, decentralized, sovereign AI must win. Public relay infrastructure already exists: WebSocket pub/sub at scale, with cryptographic identity built in. Every gradient event is Schnorr-signed and verifiable by any subscriber. You don't deploy servers. You don't manage infrastructure. You just point workers at relays that already exist. The relay doesn't know or care that it's carrying ML gradients. And it is censorship-resistant.
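The top-k → int8 → zlib step can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not nostrain's actual wire format: the function names, the symmetric quantization scheme, and the index/scale/values payload layout are all my assumptions.

```python
import zlib
import numpy as np

def compress_pseudo_gradient(delta: np.ndarray, k: int) -> bytes:
    """Top-k sparsify, int8-quantize, and zlib-compress a flat pseudo-gradient."""
    flat = delta.ravel()
    # Keep only the k largest-magnitude entries (top-k sparsification).
    idx = np.argpartition(np.abs(flat), -k)[-k:].astype(np.uint32)
    vals = flat[idx]
    # Symmetric int8 quantization: the max magnitude maps to 127.
    scale = float(np.abs(vals).max()) / 127.0 or 1.0
    q = np.round(vals / scale).astype(np.int8)
    # Pack indices (4k bytes), scale (4 bytes), values (k bytes), then compress.
    payload = idx.tobytes() + np.float32(scale).tobytes() + q.tobytes()
    return zlib.compress(payload)

def decompress_pseudo_gradient(blob: bytes, k: int, size: int) -> np.ndarray:
    """Invert the packing above into a dense float32 vector of length `size`."""
    raw = zlib.decompress(blob)
    idx = np.frombuffer(raw[: 4 * k], dtype=np.uint32)
    scale = np.frombuffer(raw[4 * k : 4 * k + 4], dtype=np.float32)[0]
    q = np.frombuffer(raw[4 * k + 4 :], dtype=np.int8)
    out = np.zeros(size, dtype=np.float32)
    out[idx] = q.astype(np.float32) * scale
    return out
```

The quantization error per surviving entry is at most half the scale, i.e. max|delta| / 254, which is plenty for an outer step that gets averaged across workers anyway.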
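A gradient event is an ordinary Nostr event. Here is a hedged sketch of building an unsigned one: the `round` tag and base64 content encoding are my assumptions about the payload, the event id follows the NIP-01 canonical serialization, and the BIP340 Schnorr signature (which needs a secp256k1 library) is left as `None`.

```python
import base64
import hashlib
import json
import time

def make_gradient_event(pubkey_hex: str, blob: bytes, round_num: int,
                        created_at: int | None = None) -> dict:
    """Build an unsigned kind-33333 Nostr event carrying a compressed
    pseudo-gradient. Tag layout is illustrative, not nostrain's schema."""
    created_at = created_at or int(time.time())
    content = base64.b64encode(blob).decode()
    tags = [["round", str(round_num)]]
    # NIP-01: the id is the sha256 of this exact JSON array serialization.
    serialized = json.dumps(
        [0, pubkey_hex, created_at, 33333, tags, content],
        separators=(",", ":"), ensure_ascii=False)
    event_id = hashlib.sha256(serialized.encode()).hexdigest()
    return {"id": event_id, "pubkey": pubkey_hex, "created_at": created_at,
            "kind": 33333, "tags": tags, "content": content,
            "sig": None}  # sign event_id with BIP340 Schnorr before publishing
```

Any subscriber can recompute the id from the event fields and verify the Schnorr signature against the worker's pubkey, which is what makes each gradient independently verifiable.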
If you want to change the world, don't protest. Write code!