This is a persona page for engineers who hold the pager. It describes how Cosyra fits an incident-response workflow. For the product overview see the homepage; for other ways Cosyra is used see use cases.
Short answer: when production breaks and your laptop is not at
hand (bedside, airport, in-laws' kitchen), you need a real terminal on your phone,
not a GitHub mobile app. Cosyra is a mobile cloud terminal with
a full Ubuntu 24.04 environment: git, Node.js, Python, and tmux pre-installed,
kubectl and cloud CLIs installable once during setup, your runbooks
cloned and waiting, and Claude Code pre-installed for parsing stack traces and
proposing fixes. Read the logs, grep the code, form a hypothesis, ship the targeted
patch. Same debugging loop as a laptop, smaller screen.
What mobile on-call actually looks like (the bad version)
Without a real mobile terminal, phone-based on-call is a series of workarounds:
- The GitHub mobile app is read-only in practice: you can see the PR, but you cannot reasonably edit a file or test anything.
- PagerDuty / Opsgenie mobile apps give you alert metadata, not the cluster.
- Browser-based log viewers (Datadog, Grafana, Cloud Console) work on a phone but cannot run kubectl and cannot chain it with grep over the codebase.
- SSH from Blink works if you already have a bastion set up and your VPN client is on the phone, but the setup and keychain maintenance is its own project.
The end state of all of these is "I will get to a laptop and handle this properly." That is 20 to 90 minutes of extra downtime nobody wants.
What Cosyra gives you at 2 a.m.
A persistent Ubuntu container that is already set up the way you left it last time. The tools that matter are already installed; the repos you would have cloned are already cloned; the kubeconfig you would have authenticated is still authenticated.
$ # Phone vibrates. Open Cosyra. Container resumes.
$ cd ~/runbooks && cat checkout-service-500-spike.md | head
# Runbook from last week's post-mortem, cached here.
$ kubectl logs -n checkout deploy/api --tail=200 | tail
ERROR Pool exhausted after 30000ms (waiting for connection)
ERROR Pool exhausted after 30000ms (waiting for connection)
ERROR Pool exhausted after 30000ms (waiting for connection)
$ # Feed it to Claude to form a hypothesis fast.
$ claude
> Here are the last 200 log lines from checkout-service api. What is
> causing Pool exhausted and what commit introduced it?
# Claude reads, greps git log, points at commit d8a3e11 two hours ago
# which raised DB_POOL_IDLE_TIMEOUT from 30s to 30000ms accidentally.
$ git show d8a3e11 -- src/db/pool.ts
# 30000 → 30 typo confirmed.
$ git revert --no-edit d8a3e11
$ gh pr create --fill --label incident
https://github.com/acme/checkout-service/pull/8919
$ # 02:29, PR up with the revert. Page acknowledged, patch in review.
Twelve minutes from pager-fire to PR-up, without getting out of bed. That is the kind of on-call response that keeps customer-facing errors measured in minutes instead of hours, and it is reproducible because the environment is persistent, not reconstructed from scratch on each page.
Set it up before the first 2 a.m. page. Clone your main service repos, clone your runbooks, save your kubeconfig, paste your API keys. Ten minutes of prep now, twenty minutes saved on every future incident.
What to pre-install in your on-call container
One-time setup so the 2 a.m. version is just "open app, triage."
- Cluster tools: kubectl, helm, and whichever cloud CLI your infra runs on (aws, gcloud, az). Install once during setup using the container's package manager; they persist with your 30 GB of container storage.
- Your runbook repo, cloned somewhere predictable like ~/runbooks. Post-mortems accumulate signal; having last month's RCA one cat away is a real speedup.
- The 3-5 service repos you page on most, cloned in ~/work. Do not wait to clone them during an incident.
- Your cluster's kubeconfig(s) in ~/.kube/config. Cosyra containers are per-user isolated, so the config sits in your container, not on your phone's filesystem.
- Claude Code is pre-installed; run claude --help to confirm it is on $PATH. For incidents it is especially good at reading long stack traces and pinpointing likely root-cause commits.
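The one-time setup above can be sketched as a single script run inside the container. This is a sketch, not an official Cosyra setup script: the org name (acme), repo URLs, and paths are placeholders to adapt to your own infra.

```shell
#!/usr/bin/env bash
# One-time on-call container setup (sketch; repo names and URLs are placeholders).
set -euo pipefail

# Cluster tools: kubectl ships as a static binary from the official release CDN.
mkdir -p "$HOME/bin"
curl -fsSLo "$HOME/bin/kubectl" \
  "https://dl.k8s.io/release/$(curl -fsSL https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x "$HOME/bin/kubectl"

# Runbooks and the service repos you page on most, in predictable paths.
mkdir -p "$HOME/work"
git clone git@github.com:acme/runbooks.git "$HOME/runbooks"
git clone git@github.com:acme/checkout-service.git "$HOME/work/checkout-service"

# Kubeconfig lives in the container, never on the phone's filesystem.
mkdir -p "$HOME/.kube"
# (paste or copy your downloaded kubeconfig into ~/.kube/config here)
if [ -f "$HOME/.kube/config" ]; then chmod 600 "$HOME/.kube/config"; fi

# Confirm the pre-installed agent is on $PATH.
claude --help >/dev/null && echo "claude OK"
```

Run it once during a calm afternoon; everything it installs and clones persists across hibernation, so the 2 a.m. version of you inherits a warm environment.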
What this replaces
- "Give me 20 minutes to get to my laptop." Those minutes are customer-facing downtime. The phone-to-laptop walk stops being part of your MTTR.
- Home VPN + SSH bastion on a mobile client. Still valid setup, still works, but it is now setup you maintain instead of setup that maintains itself. Cosyra is the managed version of that pattern.
- "I'll ack the page and check logs from the browser." You can, but you cannot ship a fix from a browser log viewer.
What this does not replace
A phone terminal is not the same as a laptop. Be honest with your team:
- Long multi-file rewrites during the incident. Fine for small reverts and targeted patches; do not try to rewrite auth at 2 a.m. on a 6-inch screen.
- Runbook writing. Writing the post-mortem can happen at a desk later. Cosyra is for the acute minutes.
- Access you would not give yourself on a laptop. If your team policy is "prod-write requires a work laptop," Cosyra does not let you work around that; it lets you do read-only triage and PR-level changes, which is usually enough for the acute window.
On-call engineer FAQ
Is it actually safe to respond to a production page from a phone?
Safer than the alternative in most cases. The alternative is either
ignoring the page until you reach a laptop or trying to debug through a
GitHub mobile app with no terminal access. A real Ubuntu terminal with gh, git, kubectl, and an AI coding agent lets you
run the same debugging loop you would run on a laptop, just on a smaller
screen. Your team's access rules still apply.
Can I install kubectl, aws-cli, or custom internal tools?
Yes. Cosyra runs a full Ubuntu 24.04 container, so the standard Linux package manager works for anything in the Ubuntu repos; npm, pip, and cargo cover the language ecosystems. Internal tools distributed as static binaries run the same way they would on any Ubuntu x86_64 machine. Installs persist with your 30 GB of container storage.
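As an illustration, a typical first-session install might look like the following. The package choices and the internal-binary URL are examples only, not a required or endorsed set:

```shell
# Anything in the Ubuntu repos installs the usual way.
sudo apt-get update && sudo apt-get install -y jq ripgrep postgresql-client

# Language ecosystems cover most developer tooling.
pip install --user httpie

# Internal tools shipped as static binaries: fetch and mark executable.
# (URL below is a placeholder for your org's release endpoint.)
curl -fsSLo "$HOME/bin/mytool" https://releases.internal.example.com/mytool-linux-amd64
chmod +x "$HOME/bin/mytool"
```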
What about secrets?
Kubeconfig, SSH keys, and tokens live in the cloud container, not on the phone itself. The phone is a thin client rendering the terminal; secrets never touch its filesystem. Losing the phone means revoking Cosyra access, same as revoking SSO on a lost laptop.
Does it work in the middle of the night?
That is the core use case. App cold-start is a few seconds; Pro containers resume from hibernation in about ten. You land in the directory where you left the last incident, with the tools you installed last time still installed.
What if my company policy forbids prod-write from a phone?
Respect the policy. Cosyra still gives you read-only triage power: logs, git history, hypothesis formation with Claude Code. That usually covers the acute window; hand off the actual prod change to someone at a laptop when policy requires it.
tl;dr
When the pager fires and your laptop is not at hand, Cosyra gives you the
same tools (full Ubuntu, gh, kubectl, Claude Code, your repos) in a native
phone app that resumes from hibernation in about ten seconds. Faster MTTR
than "give me 20 minutes to get to the laptop."