AI safety through the people who build it
In unmapped territory you steer by a fixed star, not by trusting the ship to find its own way.
The companies building the most powerful AI are small and opaque, and they concentrate the most consequential decisions in very few hands, a pattern the AI safety field sees as one of the most serious near-term risks.
Regulation, audits and voluntary commitments all help, but none bites without people inside these companies who have the standing to invoke them.
Polaris Collective is exploring how to build that standing: a cross-lab coalition that would give safety researchers and engineers inside frontier labs binding authority over consequential decisions, rather than merely consultative status. The work has to be external because no single lab can move alone without incurring a competitive disadvantage. This is the collective-action problem the labs themselves cite.
We suspect safety in this domain could be, at heart, a problem of collective intelligence: no single lab, board or regulator holds the full picture. Drawing on the tradition of citizen assemblies, one direction we are considering is whether a representative, deliberative body of safety practitioners could be convened across labs to weigh consequential choices together.
We are still exploring which ideas make the most sense to pursue. Those we are weighing include:
- Opening up model specs to more democratic oversight, with structured input from in-firm safety staff and affected publics into how the specs are drafted and assessed, and how labs are held to them.
- Supporting work on scalable human oversight of increasingly capable systems.
- A stop-work or stop-deployment authority that safety teams inside labs could invoke when warranted, backed externally so it does not rest on any single individual.
None of this is settled; we are at the very start. The year ahead is a structured test, run in the open, of whether any of it is worth building and which mechanism a lab would actually adopt.
What we're looking for
Polaris Collective is gathering feedback and interest as it explores these ideas, and the most useful input comes from the people closest to the problem. If you work inside a frontier lab, we would value a confidential conversation about whether such a coalition should exist and how it should work. Interest and scepticism are equally useful, and anything you share is treated in confidence.
We are also looking for introductions to funders working on power concentration or AI-governance capacity; advisers from AI safety, labour organising, democratic theory, deliberative democracy and citizen assemblies; and connections to the in-firm safety community.