One research platform, three connected workspaces — automate data annotation, run mixed human-AI experiments, and keep your team's agentic work transparent.
Annotate datasets at scale with large language models — using codebooks taken straight from peer-reviewed research.
Upload a CSV and label every row with large language models using annotators drawn straight from peer-reviewed research — sentiment, emotion, moral foundations, media framing, stance, and empathy coding. Compare OpenAI, Anthropic, and Google under one prompt, repeat rows to check consistency, and preview real token usage on a sample before you commit. Everything runs server-side and exports to CSV, Excel, or JSON.
Upload CSV datasets and annotate them with multiple AI models at scale.
A growing library of annotators from peer-reviewed papers.
See real token usage before you spend, and batch large jobs.
Repeat every row across models and passes for reliability analysis.
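To make the consistency check concrete, here is a minimal sketch of one way to score how often repeated passes agree on a row. It is an illustration only, not the platform's own reliability method: the `row_agreement` helper, the `passes` data, and the row names are all hypothetical, and the measure shown is simple modal agreement rather than a formal reliability coefficient.

```python
from collections import Counter


def row_agreement(labels: list[str]) -> float:
    """Share of passes that match the most common label for one row."""
    if not labels:
        raise ValueError("need at least one label")
    # most_common(1) returns [(label, count)] for the modal label
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)


# Hypothetical export: each row labelled three times (assumed data, for illustration)
passes = {
    "row_1": ["positive", "positive", "positive"],
    "row_2": ["negative", "neutral", "negative"],
}

agreement = {row: row_agreement(labels) for row, labels in passes.items()}
```

Rows with low agreement are natural candidates for manual review before analysis.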
Design and run studies that mix real people, scripted bots, and LLM agents. Every Carrier experiment is built from four interlocking systems — Shape (chamberlines, chambers, segments: the journey), Roles (communicator, mediator, processor), Variables (what you know about each participant), and Triggers (how non-human participants behave).
As social interaction moves online, it increasingly blends people and AI. Carrier lets you compose and study every configuration in one room — human-to-human, agent-to-agent, human-to-agent, and human conversations supported by AI assistants — with each agent's identity disclosed or blinded.
Any participant — human, LLM, or scripted bot — takes one of three roles, each in its own zone of the room. Communicators converse in the main chat. Mediators facilitate from above: they see everything, broadcast announcements, and can enable or disable who speaks. Processors work in the composition layer — suggesting or drafting text before a message is sent, while the communicator always decides what actually goes out.
Turn anything you know about a participant into a variable — a survey answer, a random or counterbalanced condition, or a value computed from other variables — fixed per participant at the start of their run. Then reference it anywhere with {{ }}: personalise instructions, fill LLM system prompts, decide who gets matched together, and show or hide content by condition. Define your conditions once, and the whole study reads from them.
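As a rough illustration of how {{ }} references can resolve against per-participant variables, the sketch below fills a template from a dictionary of values. It is not Carrier's implementation: the `render` helper, the variable names, and the choice to raise an error on an unknown variable are assumptions made for the example.

```python
import re

# Matches {{ name }} with optional spaces around the variable name (assumed syntax)
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict[str, object]) -> str:
    """Replace each {{ name }} with the participant's value for that variable."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            # Hypothetical policy: fail loudly rather than leave a blank
            raise KeyError(f"unknown variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


# Hypothetical participant variables, fixed at the start of the run
participant = {"condition": "treatment", "first_name": "Ada"}

instructions = render(
    "Hi {{ first_name }}, you are in the {{condition}} group.",
    participant,
)
```

The same lookup could feed an LLM system prompt or a show/hide rule, since each only needs the participant's fixed values.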
Chain activities into one timed sequence every participant moves through together — chat, vote, rank, survey, watch media, complete a task, and more. Each segment carries its own timing and a transition rule (auto, manual, synced, or host-led), and AI behaviour can be overridden per segment. Eleven activity types in all.
Describe your study in plain language and an AI agent reads your current experiment and proposes concrete changes — adding chamberlines and chambers, configuring roles, setting variables and triggers. It works across OpenAI, Anthropic, and Google, and it is human-in-the-loop by design: the agent proposes every change for you to review and accept or reject. You stay in control of the experiment; the agent just does the wiring.
Visual drag-and-drop experiment design — no code required.
Describe your experiment in natural language — the AI builds it for you.
Automatic grouping by condition, survey response, or queue order.
OpenAI, Anthropic, and Google models side by side.
Real-time session tracking, alerts, and data export.
13+ trigger types for scripted agent behavior — no code required.
Save any experiment as a reusable, forkable template.
Bring your team's Claude Code agent sessions into one shared, self-hosted workspace — and make the way your studies are built open to inspection.
Carrier mirrors your team's Claude Code sessions and shared memory into a single workspace, so you can review how the lab is using AI agents without combing through everyone's local files. Because a shared agentic record makes every prompt, edit, and decision visible, the way a study was built and adjusted becomes transparent and auditable to collaborators — research manipulations are open to inspection, not hidden in someone's terminal. Link a GitHub repo or upload directly from each machine; either way the data stays on infrastructure you control.
Auto-sync a team's shared Claude Code branch from GitHub.
Sync without GitHub, straight from each machine.
Review every Claude Code session turn by turn.
One home for the team's accumulated agent memory.
Empirical social science is moving into the digital space — and Carrier puts LLMs and AI agents to work at every stage of it, from first design to final dataset. At each step the researcher stays in control: agents propose, you decide; every action stays visible; every method is validated and reproducible.
Shape studies with agentic assistants and keep every AI-made decision visible to your team — how a study was built stays transparent and auditable.
Human-in-the-loop
Launch mixed human–AI experiments, with real participants alongside LLM agents and scripted bots, running exactly as you designed them.
Safety & data controls
Code and label data at scale with LLM codebooks drawn from peer-reviewed research, repeatable across models for the consistency your analysis needs.
Validated & reproducible