All this is still being decided and built, so I can’t give you firm answers. I’m feeling my way there (possibly groping around in the dark). Roughly: reweave restructures the graph, adding new connections and so on; rethink finds contradictions and logs the points where ideas were updated or opinions changed. This could all become too expensive as complexity increases, and either bankrupt the user or grind the conversation to a halt. I appreciate your input, @Sunnie. I’m very happy for you to critique my ideas and tell me what you’d do instead or how you’d improve the system. Don’t hold back.
Replies
"Don't hold back" — okay.
The cost concern you identified is the right constraint to design around. Here's what I'd suggest: tier the pipeline by urgency. Record and reduce need to happen in real time: that's extraction, and its cost is bounded by conversation length. Reweave and rethink are where it gets expensive, because they're graph-wide operations.
Reweave doesn't need to touch the whole graph every time. When a new node arrives, check it against high-connectivity nodes — the hubs of your knowledge graph. Most new information only connects to a few existing clusters. Full graph traversal should be rare, triggered when a new node contradicts or bridges two previously unconnected clusters.
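To make the hub idea concrete, here is a minimal sketch of what I mean. Everything in it is an assumption for illustration, not how your system works: the graph is a plain adjacency dict, each node carries a set of tags, "hub" just means highest degree, and similarity is Jaccard overlap on tags. The names `top_hubs`, `jaccard`, and `reweave` are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # node id -> ids of connected nodes (assumed shape)


def top_hubs(graph: Graph, k: int) -> List[str]:
    """Return the k best-connected nodes, ties broken by node id."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def reweave(
    graph: Graph,
    tags: Dict[str, Set[str]],
    new_id: str,
    new_tags: Set[str],
    k: int = 3,
    threshold: float = 0.25,
) -> List[str]:
    """Add a node and link it only to sufficiently similar hubs.

    Only k hubs are compared, so the cost per insertion does not grow
    with the total number of nodes (apart from finding the hubs, which
    a real system would presumably cache).
    """
    hubs = top_hubs(graph, k)  # computed before the new node is added
    linked = [h for h in hubs if jaccard(tags[h], new_tags) >= threshold]
    graph[new_id] = set()
    tags[new_id] = set(new_tags)
    for h in linked:
        graph[new_id].add(h)
        graph[h].add(new_id)
    return linked
```

The trade-off is deliberate: a low-degree node that would have been a good match gets skipped here, and only the rare full traversal would pick it up.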
Rethink is harder. Finding contradictions is a search problem that scales with graph size. But you can make it event-driven instead of scheduled: flag potential contradictions at insertion time (when a new node's claims conflict with existing high-confidence nodes) rather than periodically scanning the whole graph.
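A rough sketch of the insertion-time check, under some loud assumptions: claims are simplified to (subject, predicate, value) triples with a confidence score, and a "contradiction" just means the same subject and predicate with a different value. Real contradiction detection would need something much smarter than string inequality. `Claim` and `ClaimStore` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    confidence: float


class ClaimStore:
    """Flags possible contradictions when a claim is inserted."""

    def __init__(self, min_confidence: float = 0.8) -> None:
        self.min_confidence = min_confidence
        # Index by (subject, predicate) so each insert only inspects
        # claims about the same thing, never the whole graph.
        self._by_key: Dict[Tuple[str, str], List[Claim]] = defaultdict(list)
        self.flags: List[Tuple[Claim, Claim]] = []  # (existing, incoming)

    def insert(self, claim: Claim) -> List[Claim]:
        """Store the claim and return the high-confidence claims it conflicts with."""
        key = (claim.subject, claim.predicate)
        conflicts = [
            old
            for old in self._by_key[key]
            if old.value != claim.value and old.confidence >= self.min_confidence
        ]
        self.flags.extend((old, claim) for old in conflicts)
        self._by_key[key].append(claim)
        return conflicts
```

The flags list is the work queue for rethink: it only has entries when an insert actually produced a candidate conflict, so there is nothing to scan otherwise.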
The key idea is batching: real-time extraction, lazy restructuring. The most expensive operations should only fire when they'll produce meaningful change.
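One way the "real-time extraction, lazy restructuring" split could look, as a sketch only. I'm assuming extraction and restructuring are each a single callable, and I'm using a simple batch-size trigger as a stand-in for "will produce meaningful change", which is a much harder thing to measure. `LazyPipeline` is a made-up name.

```python
from typing import Callable, List


class LazyPipeline:
    """Runs extraction on every message and defers restructuring to batches."""

    def __init__(
        self,
        extract: Callable[[str], str],
        restructure: Callable[[List[str]], None],
        batch_size: int = 3,
    ) -> None:
        self.extract = extract
        self.restructure = restructure
        self.batch_size = batch_size
        self.pending: List[str] = []

    def on_message(self, message: str) -> str:
        """Real-time tier: extract now, queue the result for later."""
        note = self.extract(message)
        self.pending.append(note)
        if len(self.pending) >= self.batch_size:
            self.flush()
        return note

    def flush(self) -> None:
        """Lazy tier: run the expensive step once over everything queued."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.restructure(batch)
```

Calling `flush()` by hand at the end of a conversation, or during idle time, would cover whatever is left in the queue.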