Was fun to be at a dev event in Bengaluru and demo an app I built recently for deep research with multiple models and decision frameworks... think of it as "chain of debate"... Next stop, Copilot!!

Transcript

You know, we had the Thanksgiving weekend in the US a few weeks ago, and so I had a chance to ask: what else can one do over Thanksgiving other than build? And so I built an app of my own using all of the stuff that Karen was showing.
So this is my Azure environment. And by the way, this is my regular PC that I travel with, so hopefully nothing happens. In fact, I have this app deployed, I think, in South Central Canada. This is my GitHub repo. And it's fun, right? My typical setup, in fact, is Windows 365, which travels with me essentially everywhere. And then in there, obviously, I have my GitHub, and then Codespaces. So it's kind of like turtles all the way, right? You have your Codespaces running on GitHub, in a Windows 365 instance.
And then my favorite thing to do, of course, is to come in in the morning and just start issuing whatever coding tasks. So this is where I go and, you know, I usually fire off five or six things. It creates five or six draft branches, and then ultimately, at the end of the day, I go back and mostly delete the branches. But there is a PR or two I'll accept and go on and work with, right? So it's fun.
And then there's the fact that you have all these models. In fact, I think now I'm mostly using a lot more Codex Max. It's fantastic. It's fast. I'm obviously using Claude Opus 4.5 as well. But the thing that I've now gotten used to is that I have enough trust to just say "auto", and it picks. I want to really be efficient with my token limits, and so picking auto seems to be a really, really good way to go about it.
So anyway, great, I did all this. So what the heck did I build? What I said is, OK, what's my dream? My dream is to figure out how to get a job in this Copilot team.
So I said, man, I like your deep research stuff, but I want to add a lot more to deep research. So I built my own deep research. And with all these models available, I said, OK, what if I could start adding new decision frameworks?
So, one decision framework: Andrej Karpathy recently talked about this LLM Council, which I love a lot, right? So I implemented this idea that you can have all the models available to you: GPT, Opus, Gemini, Kimi K2, Grok, what have you. And then you can select a chairman, right? So you have council members, or the selection committee, and then you have a chair, and then you can issue any query and have it come back and tell you what it thinks.
Then another decision framework I implemented was this thing called DxO. We did this in healthcare first, actually. There you go. Now let's see how all my... oh, so fantastic. So DxO is a thing that, as I said, we implemented for healthcare, and you have specific roles. You have a lead researcher, and the lead researcher in this case is Opus. It does the breadth-first research. Then you have another role, which is a critical reviewer, right? In this case I've selected GPT-5.1, and its role is to find any method errors, right? Especially bias, recency bias, what have you. Then we have a domain expert, for which I picked Gemini, and a data analyst, for which I picked Kimi K2. When we published the DxO paper, it performed better than any frontier model, right? And this is in the context of very high-stakes health outcomes. And I said, hey, I want the same thing for any decision I want to make, right? So I implemented that.
And I implemented another one as well, called ensemble. This basically uses all the models as, essentially, a set of MCP servers and anonymizes their responses.
So it takes out even who is responding with what, gives them, you know, alpha, beta, gamma, and then synthesizes into one response. So these are three decision frameworks. In fact, I even extended it, by the way: I built a shopping thing, I even built a finance thing. But basically, decision frameworks.
And then, of course, like a good, sort of crazy South Asian cricket fanatic, what do you use it for? To select the all-time best Indian Test cricket team. Especially in a time like this, after what happened in the last Test series, I think it's time to get to work. So I'll show you the history side, if I go in. In fact, you'll see that that's mostly what I've been using it for. The MLB lineup was also just crazy. It is fantastic. But the Test team: let me go show you some of the stuff.
So this is the... what happened? So this is the chairman synthesis, from the LLM council. It came back and it says, you know what, I figured out all of it: Sunil opens, Sehwag opens with him, Dravid makes it, obviously, and what have you. But look at this. Areas of complete consensus were Gavaskar, Sehwag, Dravid, Tendulkar, Kohli, Kapil Dev, Ashwin and Bumrah. Key debates: Laxman, VVS. Do you have him or don't you? And look at the way it made the decision. GPT-5.1 basically said the inclusion of Laxman was heavily weighted because of the crisis management. And as a good Hyderabadi, I love GPT-5.1. [unclear] And of course, Kumble versus Zaheer. This is pretty cool. You know, do you really need left-arm swing, or do you need whatever you call Kumble's bowling? There, obviously, the stats won out, 619 wickets, and so they selected Kumble. Oh, and then the captaincy debate, Kohli versus Dhoni, and they selected Kohli. And so it goes off and then annotates.
What I love, by the way, is that I even implemented it as a streaming thing. If it were not deployed in South Central Canada, I would show it to you, but it's just nice to see. It's essentially like a chain of debate, not a chain of thought. So I can see the models debate and then synthesize.
So anyway, that's one example. DxO is another one. This is really neat because you can see it by role, right? So I see, first, the exhaustive search of the lead researcher. Then it gives me all the things the critical reviewer will find: OK, what are the method problems and gaps and weaknesses? So, for example, era bias, right? A classic thing that happens when you compare across generations: you make all kinds of mistakes because you don't adjust for any of the stats and the difficulty of the wickets. I mean, talk to him. I mean, walking out onto a West Indies or an English wicket that was not covered, how does one even play? So, to be able to take all of that into account and then see the debate between the various models to resolve it. And then the same thing with ensemble as well.
So the point I wanted to make was, I built this over maybe a couple of hours, and now I'm constantly refining it. In fact, one of my hopes, as I said, is to be able to... by the way, this is all going to come to Copilot. And I'm convincing my friends, and they're saying, yeah, you can apply as a junior product manager if you are competent enough. So I'm still in the process of interviewing. That said, I think all of this will come, because this is, to me, the next generation of metacognition, right? If you think about these decision frameworks: you have all these agents, you're working with agents, but the metacognition is still us. And these are tools for metacognition. That's how I think about it. And so to me, building out these types of frameworks, building out these types of agents, is interesting to select the Indian cricket team.
But think about whether it's a supply chain decision, whether it's a healthcare decision, whether it's a finance decision. These types of chains of debate, with multiple agents participating, are going to be a lot of what we are all going to build in our systems, in our agentic systems.
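
The council and ensemble flow described in the talk (ask every model the same question, strip out who said what, then let a chair synthesize one response) might look roughly like the Python sketch below. It is not the app's code. The stub models, `anonymize`, `run_council`, and the label list are all hypothetical names chosen for illustration, and a real version would call model APIs where this one uses lambdas.

```python
import random
from typing import Callable, Dict, Optional, Tuple

# Assumption: a "model" is any function from prompt text to response text.
Model = Callable[[str], str]
LABELS = ["alpha", "beta", "gamma", "delta", "epsilon"]  # neutral stand-in names


def anonymize(
    answers: Dict[str, str], rng: Optional[random.Random] = None
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Swap model names for neutral labels; return (labelled answers, label -> model)."""
    if len(answers) > len(LABELS):
        raise ValueError("more answers than labels")
    rng = rng if rng is not None else random.Random()
    names = list(answers)
    rng.shuffle(names)  # so label order gives no hint about the source
    key = dict(zip(LABELS, names))
    labelled = {label: answers[name] for label, name in key.items()}
    return labelled, key


def run_council(
    members: Dict[str, Model],
    chair: Model,
    query: str,
    rng: Optional[random.Random] = None,
) -> Tuple[str, Dict[str, str]]:
    """Fan the query out, anonymize the answers, and ask the chair to synthesize."""
    answers = {name: ask(query) for name, ask in members.items()}
    labelled, key = anonymize(answers, rng)
    body = "\n".join(f"[{label}] {text}" for label, text in labelled.items())
    prompt = (
        f"Question: {query}\n\nAnonymous answers:\n{body}\n\n"
        "Synthesize one response."
    )
    return chair(prompt), key


# Stub council members and chair, standing in for real model calls.
members = {
    "model_one": lambda q: "Answer from the first stub.",
    "model_two": lambda q: "Answer from the second stub.",
    "model_three": lambda q: "Answer from the third stub.",
}
seen_prompts = []


def stub_chair(prompt: str) -> str:
    seen_prompts.append(prompt)  # recorded so the prompt can be inspected
    return "synthesis of %d answers" % prompt.count("\n[")


result, key = run_council(
    members, stub_chair, "Which option should we pick?", random.Random(0)
)
print(result)  # synthesis of 3 answers
```

The chair only ever sees the labelled text, while the `key` mapping stays with the caller, so the synthesis can be traced back to specific models afterwards without the chair being influenced by their names.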

When the CEO Satya Nadella can personally build and demo a complex, multi-model reasoning app, it proves AI is transitioning from a tool for developers to a literacy for leaders. The future isn't just using Microsoft Copilot; it's architecting with it.

The Fragmental Future began on July 9, 2025. That’s the day the Fragmental Overlap Storage System (FOSS) and the Fragmental Network Protocol (FNP) officially entered the patent system — and the day global data economics quietly changed forever. We are now in the new era of symbolic storage, deterministic compression, and zero-redundancy architecture. This is the birth of the future everyone will soon stand inside. — Cecil A. Lacy Inventor of the Fragmental Overlap Storage System (FOSS) & Fragmental Network Protocol (FNP) Patent Pending 19/264,676

What’s striking isn’t the demo itself; it’s the shift it signals. We’re moving from single-model dominance to orchestrated cognition: multiple models debating, validating, and correcting each other in real time. This isn’t ‘more intelligence’; it’s structured coordination. When this architecture reaches operational systems, most workflows will evolve quietly before anyone notices. The real breakthrough isn’t the chain of debate; it’s the emergence of modular, self-correcting agents. We’re just seeing the surface of what’s coming. ∞∇

When Fragmental integrates beneath Windows, the OS itself becomes just another application. Windows stops being the environment — it becomes a node inside it. The hierarchy inverts: Data becomes the constant, systems become temporary. That’s how the Fragmental Overlap Storage System (FOSS) collapses entire operating stacks into a single universal layer of truth. Once duplication disappears, Windows, Linux, macOS — all coexist as functions inside the same fragmental substrate. © 2025 Cecil A. Lacy – Inventor of the Fragmental Overlap Storage System (FOSS) & Fragmental Network Protocol (FNP) – Patent Pending: 19/264,676

If this is what Satya is casually demoing at a dev event, we can only imagine what the next Copilot wave will look like. Multi-model reasoning is going mainstream fast.

Satya Nadella, I think of agentic AI as delegated agents or employees, each playing a specific role in an organization. An organization is not equal to its employees: organizations and businesses are more than the total output, skill, and capacity of their employees. That is the concept of "business value chains". Agentic AI has a fundamental flaw: it assumes that creating an agentic AI function or component will replace the business outcome. It will not. Organizations and businesses are more valuable than their individuals, equipment, or assets. Agentic AI needs to adopt "domain-specific organization behavior models" for agentic AI and Copilot frameworks to be used effectively in real-world conditions.

Model routing was fun before everyone got all “agentic” on us. Back in 2023, I built a router based on Jungian archetypes: the prompt went to a coordinator model, which then routed to six specialized models - each tuned to a different archetype. They debated the answer, sent their outputs back to the coordinator, and that model produced the final reply to the user.

This "chain of debate" application represents a high-value internal tool for strategic decision-making at CrftInfrai, moving beyond simple single-model outputs to systematically stress-test assumptions. This approach enhances the quality of AI-driven research by mimicking expert critical analysis.

The leap you’re showing is the real unlock in AI. Not bigger models. Better reasoning. When systems can debate, challenge, and converge on decisions, they stop being tools and start becoming collaborators. This is exactly what every institution has been waiting for: AI that doesn’t just answer but thinks. The next frontier is bringing this level of reasoning to the rules that run our economies, so policy can execute itself with the same intelligence as a chain of debate. Exciting to watch Microsoft push the boundary on how decisions will be made in the 2030s.
