New Atlantic works with university labs to turn the full context around important scientific work into useful training and evaluation environments for scientific research agents.

The work behind a scientific result is often as valuable as the result itself. A paper is the public summary, but the deeper scientific record usually lives elsewhere: raw measurements, failed runs, lab notebooks, instrument settings, analysis scripts, intermediate figures, negative results, discarded hypotheses, and the judgment of the people who knew which signals mattered. A useful research agent needs to work with messy evidence, make choices over time, test hypotheses, recover from dead ends, and understand why a result is trustworthy. Those abilities cannot be learned from polished publications alone.

University labs already contain the right kind of knowledge. They have the experimental context, the history of decisions, the tacit standards for what counts as a good result, and the data that shows how scientific work actually unfolds. New Atlantic helps preserve and structure that knowledge so it can support future scientific discovery.

What is a reinforcement learning (RL) environment?

An RL environment is a setting where an AI system can try to solve a task, receive feedback, and improve. In our work, that setting is a structured scientific challenge with rules, materials, possible actions, and a way to judge whether the agent did good work.

An agent might be asked to identify the next useful analysis for a dataset, reproduce a key result from a paper, choose which failed experimental branch to investigate, propose a correction to a pipeline, or infer which conditions explain a surprising outcome. The environment gives the agent the relevant materials and then scores its work against evidence that the university lab understands.

A static benchmark often asks for one final answer. An RL environment can measure the process: whether the agent asks the right question, uses the right evidence, avoids shortcuts, updates its plan when something fails, and reaches a result through a scientifically valid path.
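As a rough illustration of the loop described in this section, here is a minimal sketch of an environment in Python. Everything in it is hypothetical: the class name ToyEnvironment, the two actions ("open" and "submit"), the file name, and the scoring rule are assumptions made for the example, not a description of how New Atlantic environments are implemented. The point it shows is that the environment records every step the agent takes, so the path to an answer can be judged along with the answer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical toy environment: the agent opens files, then submits an answer."""

    materials: dict[str, str]   # visible files: name -> contents
    expected_answer: str        # held back from the agent, used only for scoring
    max_steps: int = 5
    trajectory: list[tuple[str, str]] = field(default_factory=list)
    done: bool = False

    def reset(self) -> list[str]:
        """Start a new episode and return the names of the visible materials."""
        self.trajectory = []
        self.done = False
        return sorted(self.materials)

    def step(self, action: str, argument: str) -> tuple[str, float, bool]:
        """Apply one action and return (observation, reward, done)."""
        if self.done:
            raise RuntimeError("episode is over; call reset() first")
        # Record every action so the process can be audited afterwards.
        self.trajectory.append((action, argument))
        observation, reward = "unknown action", 0.0
        if action == "open":
            observation = self.materials.get(argument, "no such file")
        elif action == "submit":
            observation = "submitted"
            reward = 1.0 if argument == self.expected_answer else 0.0
            self.done = True
        if len(self.trajectory) >= self.max_steps:
            self.done = True
        return observation, reward, self.done


# Illustrative episode with made-up data.
env = ToyEnvironment(
    materials={"runs.csv": "temperature,yield\n20,0.1\n40,0.7"},
    expected_answer="temperature",
)
files = env.reset()
observation, reward, done = env.step("open", "runs.csv")
observation, reward, done = env.step("submit", "temperature")
```

After the episode, env.trajectory holds the ordered list of actions. A static benchmark would only see the submitted answer; here a scorer can also ask whether the agent opened the data before answering.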

How we build one with a university lab

We begin with the PI and university lab members. Together we choose a body of work that has real scientific substance: usually a published or near-published result with underlying data, enough surrounding context to explain how the work happened, and enough structure to turn parts of the research process into tasks.

Then we map the research record. We identify the datasets, notebooks, code, protocols, instrument outputs, annotations, intermediate tables, failed attempts, and private reasoning that make the result intelligible. We ask what a careful graduate student or postdoc would need to know to continue the work responsibly. That becomes the starting point for the environment.

Next we separate the materials into pieces an agent may see and pieces held back for scoring. The visible materials let the agent work on the problem. The held-out materials let us check whether the agent is actually reasoning well rather than memorizing the answer. When possible, we also create transfer tasks: related problems that test whether the agent can apply the same scientific logic in a new setting.
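The split described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than our actual tooling: the Material record, its fields, and the file names are all hypothetical. It shows two ideas: each item in the research record is tagged as visible or held out, and a simple check refuses a package where the same item appears on both sides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """Hypothetical record for one item in the research record."""

    name: str
    kind: str        # e.g. "dataset", "notebook", "protocol"
    held_out: bool   # True if reserved for scoring


def split_materials(materials):
    """Return (visible, held_out) lists, preserving the original order."""
    visible = [m for m in materials if not m.held_out]
    held_out = [m for m in materials if m.held_out]
    return visible, held_out


def check_no_leak(visible, held_out):
    """Raise ValueError if any name appears on both sides of the split."""
    overlap = {m.name for m in visible} & {m.name for m in held_out}
    if overlap:
        raise ValueError(f"materials on both sides of the split: {sorted(overlap)}")


# Made-up file names, for illustration only.
record = [
    Material("raw_measurements.csv", "dataset", held_out=False),
    Material("analysis_notebook.ipynb", "notebook", held_out=False),
    Material("validation_set.csv", "dataset", held_out=True),
]
visible, held_out = split_materials(record)
check_no_leak(visible, held_out)
```

In practice the decision about which side each item belongs on is a scientific judgment made with the lab; the code only enforces the decision once it has been made.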

We then define the feedback. Some feedback is numerical: did the analysis reproduce the known trend, match a validation set, find the correct condition, or improve a model of the data? Some feedback is procedural: did the agent use the right files, respect the protocol, cite the evidence, avoid leaking held-out information, and produce work that a university lab member could audit? The goal is to make the scientific method learnable by the agent.
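To make the two kinds of feedback concrete, here is a small Python sketch that combines one numerical measure with a few procedural checks. The names (Submission, numerical_score, procedural_checks, feedback), the tolerance of 0.1, and the rule that any procedural failure zeroes the reward are all assumptions chosen for the example. A real environment would define these with the lab.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical summary of what an agent handed in."""

    predicted: list[float]   # values the agent predicted
    files_used: set[str]     # files the agent opened during the episode
    cited_evidence: bool     # whether the report points to its evidence


def numerical_score(predicted, reference, tolerance=0.1):
    """Fraction of predictions within `tolerance` of the held-out reference."""
    if not reference or len(predicted) != len(reference):
        return 0.0
    hits = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference))
    return hits / len(reference)


def procedural_checks(sub, allowed_files, held_out_files):
    """Named pass/fail checks on how the work was done."""
    return {
        "used_only_allowed_files": sub.files_used <= allowed_files,
        "no_held_out_access": not (sub.files_used & held_out_files),
        "cited_evidence": sub.cited_evidence,
    }


def feedback(sub, reference, allowed_files, held_out_files):
    """Combine both kinds of feedback into one auditable result."""
    checks = procedural_checks(sub, allowed_files, held_out_files)
    score = numerical_score(sub.predicted, reference)
    # Assumption for this sketch: any procedural failure zeroes the reward.
    reward = score if all(checks.values()) else 0.0
    return {"numerical": score, "procedural": checks, "reward": reward}


# Made-up values, for illustration only.
allowed = {"raw_measurements.csv", "protocol.md"}
held_out = {"validation_set.csv"}
reference = [0.12, 0.50, 0.70]

careful = Submission([0.10, 0.52, 0.95], {"raw_measurements.csv"}, True)
leaky = Submission([0.12, 0.50, 0.70], {"validation_set.csv"}, True)

careful_result = feedback(careful, reference, allowed, held_out)
leaky_result = feedback(leaky, reference, allowed, held_out)
```

In this example the careful submission gets partial credit for two of three values within tolerance, while the second submission matches the reference exactly but earns no reward because it read a held-out file. Returning the individual checks alongside the reward keeps the result auditable by a lab member.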

What the university lab gets back

A well-built environment also gives the university lab a cleaner and more durable version of its own research record. We organize materials that are often scattered across drives, notebooks, instruments, analysis folders, and individual memories. We make the logic of the project easier to inspect, hand off, reproduce, and extend.

The university lab can use the structured package to onboard new students, revisit old experiments, compare future results against prior evidence, preserve negative results that would otherwise disappear, and identify which parts of a project are ready for follow-on work. In many cases, the process also exposes practical gaps: missing metadata, fragile scripts, unclear sample names, undocumented instrument settings, or analysis choices that should be recorded more explicitly.

We make the judgment of researchers easier to preserve and reuse. The scientific value comes from keeping the data attached to the context that gives it meaning.

How universities stay in control

Universities need a careful path for this kind of work. We usually start with a scoped, non-commercial pilot so the university lab and institution can decide what data is included, what stays out, who can access it, and what review is required before anything moves downstream. This lets the scientific and institutional questions be answered before commercial terms are finalized.

If a package proves valuable, New Atlantic helps turn it into a licensable research environment. We handle much of the operational work: packaging, documentation, buyer readiness, technical delivery, and coordination with university offices. The university and originating researchers remain connected to downstream value through a revenue share.

Our mission

Frontier AI labs need richer scientific environments if they want to build agents that can contribute to real research. University labs and institutions hold the knowledge needed to build those environments, but they should not have to solve the packaging, legal, and commercial complexity alone.

New Atlantic exists to build that bridge. We help turn high-context academic research into durable scientific infrastructure: useful to the originating university lab, useful to the university, and useful for future scientific discovery.