The Codex App · A Working Guide · April 2026

Every feature, pattern, and best practice for getting real work done with OpenAI's coding agent — for engineers and non-engineers alike.

Built from three primary sources:

  • OpenAI Academy, Introduction to Codex (Derrick Choi, 56 min)
  • Automate your workflows with the Codex App, by @Dimillian
  • My Codex threads are alive, by @nickbaumann_

Each chapter folds together the Codex team's official positioning (what the product is) with field reports from people running Codex daily (what actually works).

Source tags used throughout: Academy (from the OpenAI Academy session) · Dimillian (workstream automations) · Baumann (monothread patterns)

Chapter 01 · The shift: from writing code to delegating outcomes

Software tools have moved through three phases — suggestion, collaboration, and now agentic delegation. You describe the outcome; the agent handles the steps in between.

Software engineering has evolved rapidly — code complete, pair programming, agentic delegation
The three stages of AI-assisted software: suggestion → pair programming → full delegation.

Why this matters for everyone, not just engineers

Most computer work today is knowledge work, not software engineering: organizing files, summarizing documents, analyzing data, preparing reports. Codex is designed to automate that loop — the download / rename / upload cycle that fills a week.

How most computer work gets done — most digital work is still manual
The manual loop Codex replaces.
Why this matters for you
Start from outcomes, let Codex handle execution, stay in control.
Mindset change

You do not need to master every technical concept before starting. You need a problem and an outcome. The agent closes the gap between idea and working result.

Two ways an AI can drive a computer

Two ways an AI can use a computer: through code vs through the interface
Codex is optimized for the first approach, working through code: writing scripts, using APIs, reading and changing files. That route is faster, more reliable, and more scalable than clicking buttons in a browser.

Chapter 02 · Anatomy of Codex: Model, Harness, Surfaces

When people say “Codex” they mean three things working together.

Model

The intelligence — the brain. It plans, reasons, reads what exists, makes changes, and checks results. Longer runs usually mean deeper reasoning.

GPT-5.3-Codex · Codex Spark

Harness

The safety layer. It gives the model a controlled way to read files, edit them, run commands, validate output, and stay inside boundaries you set.

Sandbox · Permissions · Plan mode

Surfaces

The ways you interact with the same agent. Same brain, different distance from the work.

Codex App · IDE · CLI

Models and reasoning depth

Reasoning effort dropdown — Low, Medium, High, Extra High
Four reasoning levels. Medium is default; Extra High is for genuinely complex problems.

GPT-5.3-Codex

The state-of-the-art model for code and agentic tasks. Four reasoning levels:

  • Low — fastest, simple asks
  • Medium — default, balances speed and depth
  • High — more thinking for harder problems
  • Extra High — deep reasoning for genuinely complex tasks

Codex Spark

A blazing-fast sibling model. Writes a 522-line Space Invaders clone in roughly 5 seconds — good for rapid prototyping and multi-turn UI tweaks. Currently on Pro tier, rolling out to others.

Codex vs ChatGPT — when to reach for which

When to use Codex vs ChatGPT today
The two products are expected to converge over time. For now: chat answers in ChatGPT, local file work in Codex.

Chapter 03 · Getting started: download, sign in, first project

Setup takes about three minutes. You need a ChatGPT account and a Mac. Windows is in invite-only early testing and shipping broadly “very, very soon.”

openai.com/codex — download for macOS
Visit openai.com/codex and click Download for macOS.

Four steps to your first project

  1. Download & install the app from openai.com/codex.
  2. Sign in with your ChatGPT credentials. Codex is available on all tiers, including Free for a limited time.
  3. Select a workspace: Personal or Work. Both behave the same; the choice only affects which apps and integrations show up.
  4. Create an empty folder on your Mac called something low-stakes like first-project and add it from the app sidebar.
Tip — first touch

Don't point Codex at your most important folder on day one. A dedicated sandbox folder lets you experiment without worrying about anything important being touched.

Downloading and signing in

Signing into Codex with ChatGPT credentials
Use your existing ChatGPT account — the same workspace selector you see in ChatGPT appears here.

Platform availability

macOS

General availability. All ChatGPT tiers including Free (limited time).

Windows

Invite-only early testing; broad release imminent. Search “Codex Windows” on X for updates.

IDE & CLI

Same underlying agent, different surface. For users who want to stay closer to code.

Chapter 04 · The interface: threads, reasoning, permissions

The Codex app has a lot of surface area. You don't need all of it on day one — these are the controls that actually matter.

Codex app interface
Left rail: New thread, Automations, Skills & Apps, and your projects. Each project has its own thread list.

The controls you'll actually touch

The sandbox: the single most important safety concept

Default permissions vs full access
By default, Codex can only read and write inside the project folder you gave it. Full access is a conscious toggle, not the norm.
How it works

Codex ships in a sandbox. Under Default permissions, it can only run commands and edit files inside the project folder. When it needs anything beyond that, it pauses and asks — in the chat, in plain English. You always approve explicitly.
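
The folder boundary described above can be pictured with a small Python sketch. This is not how Codex implements its sandbox; the function names and the "allow" / "ask" policy labels are hypothetical and exist only to show the idea of checking a path against the project folder before acting.

```python
from pathlib import Path


def is_inside_project(project_root: str, target: str) -> bool:
    """Return True if `target` resolves to a location inside `project_root`."""
    root = Path(project_root).resolve()
    # Join onto the root, then resolve, so "../" segments and symlinks are
    # normalized before comparing. An absolute `target` replaces the root.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents


def decide(project_root: str, target: str) -> str:
    """Hypothetical policy: act freely inside the folder, otherwise ask first."""
    return "allow" if is_inside_project(project_root, target) else "ask"


print(decide("/tmp/first-project", "notes/todo.md"))
print(decide("/tmp/first-project", "../other/secret.txt"))
```

The point of the sketch is the default: anything that resolves outside the project folder falls through to "ask" rather than being silently permitted.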

Plan Mode — see the intent before execution

On Mac, press Shift + Tab (or the + button) to toggle Plan Mode. Codex will ask follow-up questions and output a written plan of what it's about to do before running anything. Great when you're unsure what a prompt will trigger.

Rule of thumb

Use Plan Mode when the task touches files you care about. Drop it once you're confident in how Codex interprets your asks.

Chapter 05 · Four live demos: what Codex actually does

None of these are engineering tasks. They're operations, analytics, file hygiene, and rapid prototyping — the shape of most real knowledge work.

Demo 1 — 50 messy CSVs → one clean report

A marketing folder with fifty weekly CSVs, inconsistent column names, and no standardized schema. The prompt:

Prompt

“Combine all CSVs in this folder into one clean spreadsheet, standardize column names where possible, flag anything weird or inconsistent, and generate a one-page summary in .docx format of the main trends and anomalies. Save all the outputs back in a folder called outputs.”

Codex's CSV summary output
Codex produced a 600-line Python script, a standardized Excel file, a .docx change log, an issue report, and an HTML summary page.
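
The 600-line script from the demo is not reproduced in the source. As a rough illustration of its core move (map inconsistent headers onto one schema, then flag what does not fit), here is a minimal Python sketch. The alias table, column names, and sample data are all assumptions invented for this example.

```python
import csv
import io

# Hypothetical alias table: maps messy header variants to one standard name.
ALIASES = {
    "week": "week", "week_start": "week",
    "channel": "channel", "source": "channel",
    "spend": "spend", "spend_usd": "spend", "cost": "spend",
}


def normalize(header: str) -> str:
    """Lowercase, trim, and underscore spaces so 'Spend USD' matches 'spend_usd'."""
    return header.strip().lower().replace(" ", "_")


def combine(files: dict[str, str]) -> tuple[list[dict], list[str]]:
    """Merge CSV texts keyed by file name; return (rows, issues)."""
    rows, issues = [], []
    for name, text in files.items():
        reader = csv.DictReader(io.StringIO(text))
        mapping = {}
        for header in reader.fieldnames or []:
            key = normalize(header)
            if key in ALIASES:
                mapping[header] = ALIASES[key]
            else:
                issues.append(f"{name}: unrecognized column '{header}'")
        # Data rows start on line 2 because line 1 is the header.
        for line_no, raw in enumerate(reader, start=2):
            row = {"source_file": name}
            for header, standard in mapping.items():
                row[standard] = (raw[header] or "").strip()
            try:
                float(row.get("spend", ""))
            except ValueError:
                issues.append(f"{name} line {line_no}: spend is not a number")
            rows.append(row)
    return rows, issues


sample = {
    "w1.csv": "Week,Channel,Spend USD\n2026-01-05,Email,120.5\n",
    "w2.csv": "week_start,source,cost,notes\n2026-01-12,Search,n/a,check\n",
}
rows, issues = combine(sample)
print(len(rows), "rows;", len(issues), "issues")
```

Keeping the issues in a separate list mirrors the demo's output, where the cleaned data and the issue report are delivered as distinct files.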

Demo 2 — 200+ messy trip files → categorized sub-folders

A folder from a Mexico trip with photos, screenshots, PDFs, receipts, and scans. Codex reads file contents (not just extensions), renames consistently, and sorts into Travel / Finance / Notes / Reports. Ambiguous items land in a needs-review folder.

Codex organizing a trip folder
Codex organized photos, receipts, and PDFs into sensible sub-folders by reading their content.
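
The sorting rule in this demo (confident matches go to a category, anything ambiguous goes to needs-review) can be sketched in a few lines of Python. The keyword lists below are hypothetical and far cruder than reading real file contents; only the four folder names and the needs-review fallback come from the demo.

```python
# Hypothetical keyword rules; real content inspection would be much richer.
RULES = {
    "Travel": ("boarding pass", "itinerary", "hotel"),
    "Finance": ("receipt", "invoice", "total due"),
    "Notes": ("todo", "packing list"),
    "Reports": ("summary", "expense report"),
}


def categorize(text: str) -> str:
    """Return a folder name, or 'needs-review' when zero or several rules match."""
    lowered = text.lower()
    matches = [
        category
        for category, keywords in RULES.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    # Exactly one match is treated as confident; everything else is ambiguous.
    return matches[0] if len(matches) == 1 else "needs-review"


for snippet in ("Boarding pass MEX to SFO", "Receipt from hotel", "IMG_2291"):
    print(snippet, "->", categorize(snippet))
```

Routing both "no match" and "several matches" to the same review folder keeps a human in the loop for exactly the files where a guess would be least reliable.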

Demo 3 — Interactive dashboard from the same data

A follow-up turn, not a new thread. Codex reused the cleaned dataset from Demo 1 and produced an interactive HTML/JS dashboard with filters for month, channel, country, and device.

Live interactive dashboard
From raw CSVs to interactive dashboard in one conversation.

Demo 4 — Space Invaders in 5 seconds (Spark)

522 lines · ~5 s to first render · 3 tweaks
Multi-color Space Invaders generated by Codex Spark
Codex Spark wrote full-screen Space Invaders in seconds, then iterated on multicolor enemies, scoreboard, and background stars by conversational request.
What to take from this

The pattern in all four demos: describe the outcome, not the steps. Codex inspects the files, picks an approach, writes the code, validates, and returns results. You iterate in natural language like you would with a teammate.

Chapter 06 · Skills: turning prompts into playbooks

Skills are saved workflows. If you keep writing the same prompt (“summarize this doc, remove jargon, save as .docx”), make it a Skill and call it with $skill-name.

Skills panel
The Skills & Apps tab ships with a growing library. Many are purpose-built for non-engineers.

Bundled skills worth knowing

Slide Generator

Turns a URL, a doc, or a brief into a deck using the OpenAI template.

Image Gen

Embedded image generation that Codex can call during any task.

Docx editor

Edit or review Word files in place — redlines, structural rewrites, formatting.

Sora

Video generation as a callable step inside any thread.

Transcribe

Audio/video → text as a pipeline step, not a destination.

Skill Creator + Installer

Skills that create and install other skills. Describe the playbook, get a reusable shortcut.

Make your own skill, without writing code

Typical “skill creator” prompt shape:

Prompt template

“Create a skill that [rewrites technical docs]. It should [remove jargon and reword in a digestible way for non-technical readers], and save output as [.docx]. Then install it on my computer.”

Once installed, you invoke it with $skill-name in the composer and hand it the new input file or URL.

Apps — connect the tools you already use

In Settings → Apps (or Skills & Apps), toggle on integrations you already use in ChatGPT. The same connectors work inside Codex:

Slack Gmail Google Calendar Google Drive Microsoft Office GitHub Notion Salesforce Obsidian

Chapter 07 · Automations: scheduled work in the background

Anything you do on a cadence — every Monday, every morning, every hour — is a candidate for an Automation. Codex runs it in the background and leaves the output where you asked.

Practical Automations with Codex
Six concrete automation shapes, straight from the Codex team.

Six automations worth copying

Weekly CSV rollup + summary

“Combine new CSVs in this folder into one clean spreadsheet, flag issues, and generate a one-page summary.”

New file inbox cleanup

“Look at newly added files in my Downloads or project folder, rename them consistently, and create a changelog.”

Daily doc summary

“Summarize any new documents dropped into this folder and produce a short brief with action items.”

PDF extraction pipeline

“Extract key fields into a CSV and flag anything low-confidence.”

Weekly report generator

“Create a weekly report from files in this folder and save it as markdown plus a slide-ready summary.”

Plain-language doc rewrite

“Rewrite any new technical docs in this folder for non-technical readers.”

How to create one

  1. Click Automations in the sidebar, then New automation.
  2. Name it and pick the project it should run in.
  3. Paste the prompt. Keep it explicit about inputs, outputs, and where to save.
  4. Choose cadence — every Wednesday at 9am, every hour, etc.
  5. Save. Codex will run on schedule and post results back to the thread.
Pro move

Point the prompt at a consistent folder and let the same automation run forever. If you move the folder or change the schema, update the automation prompt — don't start over.
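
To make the "new file inbox cleanup" automation concrete, here is a minimal Python sketch of what one run could do: find files not seen on earlier runs and propose consistent names. The naming scheme (date prefix plus lowercase slug) and both function names are assumptions for illustration; the source does not specify how Codex names files.

```python
import re
from datetime import date


def consistent_name(original: str, seen_on: date) -> str:
    """Build a name like '2026-04-06_q1-budget-final.pdf' from a messy one."""
    stem, dot, ext = original.rpartition(".")
    if not dot:  # no extension present
        stem, ext = original, ""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    suffix = f".{ext.lower()}" if ext else ""
    return f"{seen_on.isoformat()}_{slug}{suffix}"


def plan_renames(
    current: set[str], already_processed: set[str], today: date
) -> list[tuple[str, str]]:
    """Return (old, new) pairs for files that earlier runs have not handled."""
    new_files = sorted(current - already_processed)
    return [(name, consistent_name(name, today)) for name in new_files]


changelog = plan_renames(
    {"Q1 Budget (FINAL).PDF", "old.txt"}, {"old.txt"}, date(2026, 4, 6)
)
for old, new in changelog:
    print(f"{old} -> {new}")
```

Returning a plan instead of renaming in place doubles as the changelog the prompt asks for, and it leaves room for a review step before anything on disk changes.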

Chapter 08 · Thread automations & the monothread pattern

A step beyond scheduled prompts: an automation that re-enters the same thread on a schedule. The context, corrections, and learnings from past runs are already there — so the natural prompt becomes very short.

“A thread automation is an interval trigger on an existing Codex thread. It is not just a scheduled prompt, because the automation runs in the same thread with the context and corrections already there. That makes the natural prompt very simple: ‘Keep an eye on this for me.’” Baumann @nickbaumann_

Why long-lived threads suddenly work

A lot of agent-product design assumes long threads eventually degrade — which pushes you toward spinning up new chats and writing context summaries. The Codex team's recent work on compaction weakens that assumption. A thread that stays useful across many turns should keep working on the same recurring task.

Principle

With good context compaction, a thread's value increases over time. The thread remembers what you corrected, what you ignore, which sources usually matter. Each run is a smarter run.

What you can now do with a single pinned thread

Example: the workstream thread

“In my case I have something I call ‘workstream’ automation. It tracks all my current projects in a Notion database. Each item has a full page of details with a current snapshot and the actual tasks to execute for me and for Codex. At a glance I can see if Codex is waiting for me or if Codex needs more runs.” Dimillian @Dimillian

Typical shape of a workstream thread:

Reciprocal to-do list

The thread ends every run with two lists: things Codex left for you and things you left for Codex next run. That's the mechanism that makes the loop stable over weeks.
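
One way to picture the reciprocal to-do list is as a tiny data structure with two lists and a "who is waiting on whom" check. The Python sketch below is illustrative only: the class name, field names, and output wording are hypothetical, not something Codex or the quoted workflow defines.

```python
from dataclasses import dataclass, field


@dataclass
class RunHandoff:
    """End-of-run summary: what each side owes the other before the next run."""

    for_you: list[str] = field(default_factory=list)
    for_codex: list[str] = field(default_factory=list)

    def waiting_on(self) -> str:
        """Human items block first; otherwise Codex can simply run again."""
        if self.for_you:
            return "you"
        if self.for_codex:
            return "codex"
        return "nobody"

    def render(self) -> str:
        lines = ["Left for you:"]
        lines += [f"  - {item}" for item in self.for_you] or ["  (nothing)"]
        lines.append("Left for Codex next run:")
        lines += [f"  - {item}" for item in self.for_codex] or ["  (nothing)"]
        return "\n".join(lines)


handoff = RunHandoff(for_you=["Approve draft reply"], for_codex=["Recheck CI"])
print(handoff.render())
print("Waiting on:", handoff.waiting_on())
```

The `waiting_on` check corresponds to the at-a-glance view in the quote above: whether Codex is waiting for you or needs more runs.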

Chapter 09 · The proactive teammate workflow

The most useful Codex thread in the wild today is a “teammate” thread that watches Slack, Gmail, GitHub, Calendar, and your docs on an hourly cadence and surfaces only what matters.

“Every hour, it checks my Slack, Gmail, and PRs I wrote or am watching. It turns the noise into clean signal I can act on. My Codex usage has shifted from starting lots of short-lived chats to keeping a smaller number of threads alive around recurring workstreams. I have become monothread-pilled.” Baumann @nickbaumann_

The one main + many subagents pattern

Main teammate thread

Orchestration and judgment. It wakes up on an interval, reads the smallest useful live signal, uses a specialist sub-agent only when the lane matters, and decides whether to notify you or stay quiet.

Long-lived sub-agent threads

Depth in their specialty: one for PRs, one for inbox triage, one for calendar, one per active project. Spawned by the main thread as new workstreams appear.

The behaviour you actually want

The notification bar

Less “something changed,” more “this changes what you should do.” During launch week, a teammate thread might watch merged use-case PRs, a migration PR going green, new Slack DMs, and calendar moves — and stay silent on everything else. Once docs merged, it stopped rechecking them. While the migration PR stayed blocked, it stayed quiet.
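
The notification bar is a judgment the thread learns from your corrections, not a hard-coded rule. Still, the filter it converges on can be sketched in Python under two simplifying assumptions: each observed change already carries a flag for whether it needs action, and another flag for whether it was already settled on an earlier run. All names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str                    # e.g. "github", "slack", "calendar"
    summary: str
    needs_action: bool             # does this change what the user should do?
    already_settled: bool = False  # e.g. a docs PR that merged on a prior run


def should_notify(signal: Signal) -> bool:
    """Notify only for actionable changes that have not been settled already."""
    return signal.needs_action and not signal.already_settled


def digest(signals: list[Signal]) -> list[str]:
    """Return the summaries worth surfacing; an empty list means stay quiet."""
    return [s.summary for s in signals if should_notify(s)]


observed = [
    Signal("github", "Migration PR went green", needs_action=True),
    Signal("github", "Docs PR merged", needs_action=False, already_settled=True),
    Signal("slack", "Channel chatter", needs_action=False),
]
print(digest(observed))
```

An empty digest is a valid result: staying silent is part of the behaviour, not a failure of the run.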

Two high-leverage sub-agents to start with

Inbox triage thread

Ask Codex to find messages that actually need replies and draft responses in your style without sending. Once the thread has seen what you approve, reject, shorten, and rewrite, it's a much better place to add an automation that watches the inbox.

PR watcher thread

Keep the same thread around after Codex helps with a change. It already knows the intent, files touched, tests, tradeoffs, and reviewer concerns — so you can ask it to watch CI, reviews, mergeability, and conflicts. Watching is recurring; merging/pushing stays explicit.

How to bootstrap your own

  1. Start one thread around a real recurring task.
  2. Correct it a few times. Tell it what to ignore. Tell it what needs approval.
  3. Put that thread on an automation.
  4. When it gets something wrong, steer the automation prompt from inside the chat.
The minimal version

You don't need an elaborate orchestrator. You can just ask: “Keep an eye on this for me.”

Chapter 10 · The 4-step framework for reliable results

Whether you're on your first prompt or your hundredth automation, the same pattern keeps results good.

1 · Pick something repetitive

Start with a workflow you already know well, that's easy to verify and easy to judge. Preferably something annoying that happens over and over.

2 · Give Codex the right materials

Put the files, docs, notes, and CSVs into the workspace you pointed it at. Context is where quality comes from.

3 · Describe the outcome clearly

Be specific about the goal, what good looks like, examples, what not to do, and how Codex will know when it's done. You don't need to spec every step.

4 · Iterate until it works

The first result is the starting point, not the answer. Review, ask it to explain its logic, point out gaps, refine the prompt. Talk to it like a teammate.

Step 4 — Iterate until it works
Step 4, expanded: review → ask for logic → point out gaps → refine → run again.
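
The checklist in step 3 (goal, what good looks like, examples, what not to do, how to know it is done) can be turned into a reusable prompt skeleton. The Python helper below is a hypothetical convenience for drafting prompts; the field labels are this sketch's own wording, not a format Codex requires.

```python
from typing import Optional


def build_prompt(
    goal: str,
    good_looks_like: str,
    avoid: str,
    done_when: str,
    example: Optional[str] = None,
) -> str:
    """Assemble an outcome-focused prompt; steps are deliberately left out."""
    parts = [f"Goal: {goal}", f"What good looks like: {good_looks_like}"]
    if example:
        parts.append(f"Example: {example}")
    parts.append(f"Do not: {avoid}")
    parts.append(f"Done when: {done_when}")
    return "\n".join(parts)


print(
    build_prompt(
        goal="Combine the weekly CSVs into one clean spreadsheet",
        good_looks_like="One sheet with standardized column names",
        avoid="Modify or delete the original files",
        done_when="The combined file and an issue list are saved in outputs",
    )
)
```

The skeleton has no field for step-by-step instructions on purpose, in line with the advice that you do not need to spec every step.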

The team adoption pattern

How one team at OpenAI got hooked on Codex
Individual leverage is good. Team leverage is where this actually compounds.
How adoption really spreads

Not through a mandate. Through one useful win that makes the rest of the team curious. The OpenAI marketing team adoption story is canonical: someone turned a painful SQL / Python workflow into an interactive HTML dashboard, others saw it, and within a week they were each building their own.

Chapter 11 · Advanced moves & quick reference

A compact reference of everything worth knowing — prompting habits, power-user moves, and the answers to the questions that always come up.

Prompting habits that matter

Power-user moves

Plan Mode

Shift + Tab on Mac. Codex drafts a plan and asks follow-ups before touching anything.

/review

Ask Codex to review its own code or a repo. Useful once work moves from local folder to a shared repo.

Pinned threads

Pin recurring threads in the sidebar so long-lived workstreams stay put while one-offs churn.

Sub-agent spawn

In long-lived threads, let Codex spin up specialist sub-agent threads for new workstreams as they appear.

Voice dictation

Mic icon in the composer. Perfect for long prompts where typing breaks your train of thought.

Edit an automation from chat

Don't hand-edit the automation prompt — ask Codex in the thread to update it. Lower friction, keeps context.

Beginner mistakes to avoid

Non-technical skills that transfer most

A minute of timeline context

Key takeaways
The December 2025 GPT-5 release was the moment true delegation started to feel reliable: code quality, instruction following, and long-task stability all crossed the usability bar.

The one thing to take away

You don't need to become an engineer to get value from Codex. Start from one messy, real workflow that you do day after day. Get that one useful win under your belt — then let Skills and Automations turn it into leverage for your whole team.

Download: openai.com/codex
Docs: developers.openai.com/codex

