AI super-scientist Kosmos completes half a year’s worth of human research in just 12 hours, with an accuracy rate of 79.4%.

The non-profit research institution FutureHouse recently unveiled an AI super-scientist system named “Kosmos.” A single 12-hour run of the system can complete a research workload that would take a human research team about half a year.

Sam Altman, CEO of OpenAI, commented on this, saying, “This is incredibly exciting! I anticipate that we will see more similar projects in the future, and this will be one of the most important application directions for artificial intelligence.”

“Robin,” an earlier AI scientist developed by FutureHouse, had significant limitations, particularly in handling large volumes of information. Because of the context-length limits of language models at the time, it struggled to sustain multi-step logical reasoning, which directly limited the depth and complexity of its scientific discoveries.

Kosmos’s major breakthrough stems from its adoption of a “structured world model.” This architecture lets the system integrate information from the trajectories of hundreds of agents and stay closely aligned with its core research objectives, even when processing tens of millions of tokens.

Kosmos runs in an autonomous loop: it launches literature-retrieval and data-analysis tasks in parallel, continuously updates its internal knowledge graph, and plans the direction of the next round of exploration.
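The loop described above can be illustrated with a minimal Python sketch. It is not FutureHouse's implementation: the `WorldModel` class, the `search_literature` and `analyze_data` stand-ins, and the toy `plan_next` planner are all hypothetical names invented for illustration, and the real agents would call language models and execute analysis code rather than return strings.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class WorldModel:
    """Hypothetical shared state; the article does not describe the real schema."""
    objective: str
    findings: list = field(default_factory=list)

    def update(self, results):
        # Merge the results of one cycle into the shared state.
        self.findings.extend(results)


def search_literature(topic):
    # Stand-in for a literature-retrieval agent (assumption).
    return f"literature: {topic}"


def analyze_data(topic):
    # Stand-in for a data-analysis agent (assumption).
    return f"analysis: {topic}"


def plan_next(model, cycle):
    # Toy planner: derives the next tasks from the objective and cycle index.
    topic = f"{model.objective} (cycle {cycle})"
    return [(search_literature, topic), (analyze_data, topic)]


def run(objective, cycles=3):
    model = WorldModel(objective)
    with ThreadPoolExecutor() as pool:
        for cycle in range(cycles):
            tasks = plan_next(model, cycle)
            # Launch both kinds of task in parallel, then collect in order.
            futures = [pool.submit(fn, arg) for fn, arg in tasks]
            model.update([f.result() for f in futures])
    return model
```

Calling `run("aging", cycles=2)` returns a model holding four findings, two per cycle. Keeping a single shared state object that every cycle reads and writes is one simple way to keep many parallel tasks tied to the same objective.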

According to the reported figures, a single Kosmos run completes an average of 166 rounds of data analysis and 36 rounds of literature review. Every conclusion can be traced back to a specific code snippet or original source, so the results can be fully audited.
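One way to picture this traceability is a report in which each claim carries references to the code or literature behind it. The sketch below is an assumption-laden illustration, not Kosmos's actual data format: the `Evidence` and `Claim` classes, their fields, and the `untraceable` check are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    kind: str       # "code" or "literature" (hypothetical categories)
    reference: str  # hypothetical identifier, e.g. a notebook cell or a citation


@dataclass
class Claim:
    text: str
    evidence: tuple = ()


def untraceable(claims):
    """Return the claims that cite no evidence at all."""
    return [claim for claim in claims if not claim.evidence]
```

An auditor could run `untraceable` over every claim in a report and expect an empty list; any claim it returns has no supporting code or source attached.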

During a continuous 12-hour run, the system can read 1,500 academic papers, generate and execute 42,000 lines of analysis code, and produce a complete, traceable research report. By these measures, its overall processing capacity exceeds that of all known agent systems to date.