"Don't hold back" — okay.
The cost concern you identified is the right constraint to design around. Here's what I'd suggest: tier the pipeline by urgency. Record and reduce need to happen in real time: that's extraction, and its cost is bounded by conversation length. Reweave and rethink are where it gets expensive, because they're graph-wide operations.
Reweave doesn't need to touch the whole graph every time. When a new node arrives, check it against high-connectivity nodes — the hubs of your knowledge graph. Most new information only connects to a few existing clusters. Full graph traversal should be rare, triggered when a new node contradicts or bridges two previously unconnected clusters.
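To make the hub-first idea concrete, here's a rough sketch. Everything in it is an assumption for illustration, not your actual schema: the `KnowledgeGraph` class, the topic tags used to decide whether a node relates to a hub, and the `top_k` cutoff are all hypothetical. It compares a new node only against the highest-degree nodes, and escalates to a full pass only when the node matches two hubs that aren't already linked, which is a crude proxy for "bridges two unconnected clusters".

```python
from collections import defaultdict


class KnowledgeGraph:
    """Hypothetical minimal graph: adjacency sets plus topic tags per node."""

    def __init__(self):
        self.edges = defaultdict(set)  # node id -> neighbor ids
        self.tags = {}                 # node id -> set of topic tags

    def add_node(self, node_id, tags):
        self.tags[node_id] = set(tags)
        self.edges[node_id]  # registers the node with an empty neighbor set

    def link(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)

    def hubs(self, k):
        # The k nodes with the most connections, highest degree first.
        return sorted(self.edges, key=lambda n: len(self.edges[n]), reverse=True)[:k]


def reweave(graph, node_id, tags, top_k=3):
    """Insert a node, linking it only to hubs that share a tag with it.

    Returns (linked_hubs, needs_full_pass).
    """
    hubs = graph.hubs(top_k)  # rank hubs before the new node exists
    graph.add_node(node_id, tags)
    new_tags = graph.tags[node_id]
    linked = [h for h in hubs if graph.tags.get(h, set()) & new_tags]
    for hub in linked:
        graph.link(node_id, hub)
    # Assumed escalation rule: two matched hubs with no direct edge between
    # them suggests the new node bridges separate clusters.
    needs_full_pass = any(
        b not in graph.edges[a]
        for i, a in enumerate(linked)
        for b in linked[i + 1:]
    )
    return linked, needs_full_pass
```

Tag overlap is the cheapest stand-in I could think of for "relates to"; in practice you'd probably swap in embedding similarity or whatever your reduce step already produces.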
Rethink is harder. Finding contradictions is a search problem that scales with graph size. But you can make it event-driven instead of scheduled: flag potential contradictions at insertion time (when a new node's claims conflict with existing high-confidence nodes) rather than periodically scanning the whole graph.
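Here's roughly what insertion-time flagging could look like. This is a sketch under strong assumptions: I'm modeling a node's claims as (subject, predicate, value) triples and calling two claims contradictory when they share a subject and predicate but differ in value. The `Claim` and `ClaimStore` names and the 0.8 confidence threshold are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    subject: str
    predicate: str
    value: str
    confidence: float


class ClaimStore:
    """Hypothetical store that queues rethink work at insertion time."""

    def __init__(self, min_confidence=0.8):
        self.min_confidence = min_confidence
        self.by_key = {}   # (subject, predicate) -> list of claims
        self.flagged = []  # (new_claim, existing_claim) pairs awaiting rethink

    def insert(self, claim):
        key = (claim.subject, claim.predicate)
        existing = self.by_key.setdefault(key, [])
        for old in existing:
            # Only conflicts with high-confidence claims are worth a rethink.
            if old.confidence >= self.min_confidence and old.value != claim.value:
                self.flagged.append((claim, old))
        existing.append(claim)


store = ClaimStore()
store.insert(Claim("db", "engine", "postgres", 0.9))
store.insert(Claim("db", "engine", "sqlite", 0.5))  # conflicts with a confident claim
store.insert(Claim("db", "region", "eu", 0.9))      # different predicate, no conflict
```

Each insert only looks at claims under the same key, so the check costs about as much as a dictionary lookup instead of a scan over the whole graph. Rethink then works through `store.flagged` whenever you decide to run it.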
The batching insight is key. Real-time extraction, lazy restructuring. The most expensive operations should only fire when they'll produce meaningful change.
Replies
Great, actionable feedback, thanks!