coding agent, enterprise. Status: active

Poolside

Enterprise code model built with reinforcement learning from code execution, designed for private deployment

Poolside is an AI company building proprietary code foundation models using reinforcement learning from code execution. Instead of training a model to predict the next token in a corpus of code, Poolside trains on whether code actually runs and produces correct output. The result is a model that's optimized for functional correctness rather than syntactic plausibility. Enterprise customers get private deployments, custom integrations, and a model that's been built for software engineering from the training objective up. As of 2026, Poolside has moved from research into live enterprise deployments across large software organizations.

Most AI coding tools on the market today are, at some level, a good UX on top of someone else's API. GitHub Copilot launched on OpenAI models and still routes requests to third-party model providers. Many agents run on Claude. The model at the center is someone else's. Poolside is building something different: a code foundation model with its own training methodology, designed to be owned and operated by the enterprises that use it.

The company was founded in 2023 by Jason Warner, formerly CTO of GitHub, and Eiso Kant, who previously founded source{d}, a startup that applied machine learning to source code. They've raised substantial funding from investors who expect this to be a multi-year infrastructure play. The pitch isn't "we made GPT-4 better at coding." It's "we rethought what training for code should look like."

The training methodology argument

Poolside's central claim is about how you train a code model rather than what you do with it afterward. Standard large language models are trained on next-token prediction across a massive corpus. For code, this means the model learns to predict what tokens typically follow other tokens in code. It gets good at recognizing patterns. It produces code that looks like code written by humans.

The problem is that pattern-matching on syntax doesn't guarantee semantic correctness. Code can be syntactically valid, stylistically plausible, and completely wrong. It can compile and still fail every test. It can pass tests that don't cover the edge case that matters. Next-token prediction doesn't directly penalize any of those failures.

Reinforcement learning from code execution changes the reward signal. The model writes code, the code runs, and the model gets feedback on whether it worked. Did it compile? Did the tests pass? Did the function return the correct output for a given input? These are direct signals about correctness rather than indirect signals about pattern similarity. Over time, training on this signal pushes the model toward producing code that actually works, not just code that looks like code that works.
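The reward loop described above can be sketched in a few lines. This is a minimal illustration, not Poolside's training pipeline: the function name `execution_reward`, the `entry_point` parameter, and the test-case format are all hypothetical, and a real system would run candidates in an isolated sandbox rather than with in-process `exec`.

```python
from typing import Any, Sequence, Tuple

TestCase = Tuple[tuple, Any]  # (positional arguments, expected return value)


def execution_reward(candidate_source: str, entry_point: str,
                     test_cases: Sequence[TestCase]) -> float:
    """Score generated code by the fraction of test cases it passes.

    Hypothetical sketch: in-process exec() is NOT a sandbox and is used
    here only to keep the example self-contained.
    """
    if not test_cases:
        return 0.0
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # does the code even load?
    except Exception:
        return 0.0
    func = namespace.get(entry_point)
    if not callable(func):
        return 0.0
    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:  # correct output for this input?
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(test_cases)


tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
correct = "def add(a, b):\n    return a + b\n"
plausible_but_wrong = "def add(a, b):\n    return a * b\n"
broken = "def add(a, b)\n    return a + b\n"  # missing colon

for candidate in (correct, plausible_but_wrong, broken):
    print(execution_reward(candidate, "add", tests))
```

The correct candidate passes all three cases, the multiplication variant passes only the (0, 0) case, and the candidate with the syntax error scores zero. A next-token objective sees all three as similarly plausible text; an execution-based reward separates them.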

This is a meaningful distinction. The models that hallucinate library functions that don't exist are doing so because they've learned patterns from real library functions and are applying them to generate plausible-sounding but nonexistent APIs. A model trained on execution feedback would have learned that calling nonexistent functions causes errors, and would have been pushed away from that behavior during training.
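To make the hallucinated-API case concrete: executing a call to a function that does not exist fails immediately, which is the kind of signal an execution-based reward can use. The helper name `run_outcome` below is hypothetical. `json.dumps` is a real standard-library function, while `json.to_string` is deliberately invented and does not exist.

```python
def run_outcome(source: str) -> str:
    """Execute a snippet and report 'ok' or the name of the exception raised.

    Illustrative only: in-process exec() is not a sandbox.
    """
    try:
        exec(source, {})
    except Exception as exc:
        return type(exc).__name__
    return "ok"


real_api = "import json\nresult = json.dumps({'a': 1})\n"
hallucinated_api = "import json\nresult = json.to_string({'a': 1})\n"

print(run_outcome(real_api))          # ok
print(run_outcome(hallucinated_api))  # AttributeError
```

Both snippets look equally reasonable as text. Only running them reveals that the second one calls an API that was never there.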

Whether Poolside's implementation delivers on this argument in practice is harder to verify from outside the company. Enterprise customers who've deployed it are the ones with ground truth, and they don't typically publish detailed comparative studies. What's available publicly are general impressions and the company's own benchmark claims.

Private deployment as a feature, not a compromise

The other core element of Poolside's product is private deployment. Enterprise customers get the model running on their own infrastructure, not routing their code through Poolside's shared cloud. This matters for a specific class of customer.

Financial institutions, defense contractors, healthcare companies, and large technology firms with strict data governance requirements often can't send production code to a third-party cloud service. The legal and compliance arguments vary by industry and jurisdiction, but the result is consistent: a significant segment of enterprise software organizations has limited or no ability to use GitHub Copilot, Claude Code, or similar tools for their most sensitive codebases.

Private deployment solves this. Your code runs through a model operating on your infrastructure. Poolside provides the model weights and the integration support. You control the environment. Your code never leaves your network.

This is the same pitch that Cohere has made for general language models, and it works for the same reason: there's a large pool of enterprises that need AI capabilities but can't accept third-party data handling at the model layer. Poolside is targeting that pool specifically for the software engineering use case.

The trade-off is that you take on infrastructure responsibility. Running a frontier-scale model on your own hardware isn't free or simple. You need the compute, the ops capacity, and the integration work. For large enterprises where those capabilities exist, this is acceptable. For smaller companies, it's a significant burden.

How it fits against alternatives

Poolside competes at a different layer of the stack than most tools in this directory. When you compare it to Devin, you're not really comparing like for like. Devin is a product: you sign up, give it a task, and get a result. Poolside is infrastructure: you license the model, deploy it yourself, and build product on top of it.

The more accurate comparison is to Augment, which also targets enterprise software teams with a premium product, and to direct model access from Anthropic or OpenAI for enterprises that want to build on top of those models while maintaining compliance through enterprise agreements.

For enterprises evaluating their AI coding stack, the decision often comes down to: do we want a turnkey product with shared cloud infrastructure and a per-seat pricing model, or do we want model ownership and private deployment with higher upfront integration cost? Poolside is firmly in the second camp. Amazon Q Developer is a reasonable comparison in the sense that it's also targeting large enterprise software teams with a product built for that context, though Q Developer is a product layer on AWS infrastructure rather than a model you deploy yourself.

SWE-bench and the benchmark question

Poolside has shared performance data on SWE-bench, a widely used benchmark for autonomous software engineering tasks. Their numbers position the model competitively with other frontier code models. SWE-bench is a useful signal because it tests whether a model can resolve real GitHub issues, with each fix checked against the repository's own tests, rather than just generate plausible-looking code. That makes it harder than most code-generation benchmarks.

The caveat with all SWE-bench numbers is that they're measured under specific conditions: particular model configurations, specific prompting strategies, defined task categories. Real-world performance on your codebase may differ. SWE-bench results are better than nothing for comparison, but they're not a guarantee about production performance on your specific use cases.

What would be more informative is seeing data from production enterprise deployments: what percentage of tasks complete without human intervention, what's the error rate on specific task types, how does performance scale with codebase size. That data likely exists inside Poolside's enterprise customer relationships. It's not public.
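For teams running their own evaluation, the metrics named above are straightforward to compute once task outcomes are logged. The sketch below assumes a hypothetical `TaskRecord` schema, and the sample records are made-up placeholders for illustration, not results from Poolside or any other tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class TaskRecord:
    """One logged agent task (hypothetical schema)."""
    task_type: str
    succeeded: bool
    needed_human: bool


def autonomous_completion_rate(records: Sequence[TaskRecord]) -> float:
    """Share of tasks that succeeded with no human intervention."""
    if not records:
        return 0.0
    done = sum(1 for r in records if r.succeeded and not r.needed_human)
    return done / len(records)


def error_rate_by_type(records: Sequence[TaskRecord]) -> Dict[str, float]:
    """Failure rate per task type."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for r in records:
        totals[r.task_type] += 1
        if not r.succeeded:
            failures[r.task_type] += 1
    return {t: failures[t] / totals[t] for t in totals}


# Placeholder records, purely illustrative.
sample = [
    TaskRecord("bugfix", succeeded=True, needed_human=False),
    TaskRecord("bugfix", succeeded=False, needed_human=False),
    TaskRecord("refactor", succeeded=True, needed_human=True),
    TaskRecord("refactor", succeeded=True, needed_human=False),
]

print(autonomous_completion_rate(sample))
print(error_rate_by_type(sample))
```

Tracking these per task type on your own codebase gives a more direct read on production fit than any public benchmark number.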

The team and funding context

Poolside has raised significant venture funding from investors including Felicis Ventures and others with strong AI portfolios. The fundraising context is important because building and running frontier code models is expensive. The compute costs for training, the inference infrastructure for production deployment, the engineering team required to build and maintain it: these are costs that require either a lot of revenue or a lot of capital to sustain. Poolside has the capital side covered for now.

The founding team's background is relevant too. Jason Warner led engineering at GitHub as CTO and has thought deeply about how software engineers work and what slows them down. Eiso Kant's earlier work applying machine learning to source code at source{d} is directly relevant to executing on the RL-from-execution training methodology. It's a credible pairing for what they're trying to build.

Current state and what you can actually do with it

If you're an individual developer, the honest answer is: nothing, today. Poolside isn't selling to individuals. There's no trial, no API key, no free tier. The website explains what they're building and invites enterprise contact. That's the product.

If you're evaluating AI coding infrastructure for an enterprise software organization, especially one with compliance requirements that rule out shared cloud AI services, Poolside is worth a conversation. The deployment model is differentiated, the training methodology thesis is coherent, and the team has the background to execute it. You won't be able to run a meaningful trial without engaging the sales team, but the engagement cost is low compared to the potential value if it's the right fit.

For organizations that don't have hard constraints on shared cloud AI services, the calculus is harder. You'd be giving up the simplicity of a product like Claude Code or Augment, the ecosystem breadth that comes with being on an Anthropic or OpenAI model, and the developer experience refinement that comes from a company focused on the product layer. What you'd gain is model ownership, private deployment, and a training methodology that targets correctness. Whether that trade is worth it depends on your specific context.

The bottom line

Poolside is making a long-term infrastructure bet on proprietary code models and private enterprise deployment. The training methodology thesis, reinforcement learning from code execution rather than token prediction, is coherent and worth taking seriously. The private deployment value proposition is real for the enterprises that need it.

It's not a tool for developers looking for something to use this week. It's not open source, not publicly accessible, and not cheap to operate. For the specific category of large enterprise software organization with strict data governance requirements and the budget for frontier AI infrastructure, Poolside is one of the most interesting options available. For everyone else, the more accessible alternatives in this directory will serve you better today.

Key features

Proprietary code model trained with reinforcement learning from code execution
Private deployment on customer infrastructure
Coding agent workflows for enterprise software teams
Training methodology optimized for functional correctness, not just token prediction
Integration into existing enterprise development pipelines
Purpose-built for large-scale software engineering teams

Pros and cons

Pros

+ Training methodology targets functional correctness, not just pattern matching
+ Private deployment means your code never touches shared infrastructure
+ Purpose-built for software engineering rather than adapted from a general model
+ Enterprise-grade support and integration assistance
+ Strong investor backing from major AI-focused funds

Cons

− No public access, free tier, or self-serve API
− Custom pricing with no public numbers
− Limited independent benchmarking data available
− Enterprise-only focus excludes individual developers entirely
− Smaller ecosystem than tools built on OpenAI or Anthropic models

Who is Poolside for?

Large engineering teams that need a code model they can deploy on their own infrastructure
Enterprises in regulated industries where shared cloud AI services aren't viable
Software organizations building internal AI developer tools on a proprietary model base
Companies with large, complex codebases where training-for-correctness matters

Alternatives to Poolside

If Poolside isn't quite the right fit, the closest alternatives are Devin, Claude Code, Augment, and Amazon Q Developer. See our full Poolside alternatives page for side-by-side comparisons.

Frequently Asked Questions

What is Poolside AI?

Poolside is an AI company that builds code-specialized foundation models using reinforcement learning from code execution. The key distinction from other code models is the training signal: instead of predicting the next token, the model is trained based on whether the code it writes actually runs correctly. Poolside targets enterprise customers with private deployments, not individual developers with a self-serve product.

How is Poolside different from GitHub Copilot or Claude Code?

GitHub Copilot and Claude Code are products built on top of large foundation models. Poolside is building the foundation model layer itself, with a training methodology specifically designed for code correctness. Copilot and Claude Code are things you use as a developer. Poolside is infrastructure that an enterprise licenses and then integrates into its own tools and workflows. They're at different layers of the stack.

What does reinforcement learning from code execution mean in practice?

Most AI models are trained to predict the most likely next token given a training corpus. For code, that means learning statistical patterns from existing code. Reinforcement learning from code execution adds a different signal: does the code actually work? The model gets rewarded when it produces code that compiles, passes tests, and produces correct output. This pushes training toward functional correctness rather than just plausible-looking code.

Can individual developers access Poolside?

No. As of May 2026, Poolside is exclusively enterprise-focused. There's no API you can sign up for, no free tier, and no individual developer product. If you're an individual looking for an AI coding tool, you'll need to look at alternatives like Cline, Claude Code, or Cursor.

Is Poolside open source?

No. Poolside's models and infrastructure are proprietary. The company is building a commercial enterprise product, not contributing to the open-source AI ecosystem. The closed nature is part of the private deployment value proposition.