01 / Discovery
Fifty no’s before the first yes.
We discovered a pay-at-table solution, Sunday, as customers in Europe and started seeing it in our daily lives.
We pivoted our software consultancy and started Astra — a pay-at-table solution for Brazilian restaurants.
We started going into restaurants looking for a design partner. Racoon Smoke House. Le Pain Quotidien. Des Cucina. Camponesa. Coco Bambu. Empório Frutaria. FatCow. ICI. L’Entrecôte. La Pastina. Lemoni. Luce. Manjericão. Ministrão. Mocha Bleu. Pobre Juan. Sushi Papaia. Tea Connection. Tuy Cucina. Vicoboim. Vino. Z Deli.
Behind every door, the same picture — at peak, nobody in the restaurant gets to do their actual job.
Waiters can’t keep up — orders, plates, checks, and “where’s my bill?” all at the same table.
Managers stop managing; they step in as a fourth waiter the moment the rush hits.
Owners want to be off the floor — but the second they leave, peak hour swallows the restaurant.
On 19 September, Fernando, owner of Cris Parrilla (Bar · Parrilla, Rua República do Iraque, 1326, São Paulo), signed on as our first design partner. One day later we were inside the restaurant, watching the pain happen in real time.
Yes came with a constraint: the PoS screen had to keep being right. When a guest paid through WhatsApp, waiters had to see the table change status on the handheld they were already carrying. The PoS vendor had never released an API. That constraint is the rest of this page.
A guest scans, opens WhatsApp, pays. The waiter’s handheld flips the table from open to paying to closed on the next poll. Nobody walks the floor for us.
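The open, paying, closed lifecycle described above can be sketched as a small finite-state machine. This is a minimal illustration, not the production automaton: the event names and the `paying` back to `open` edge for a failed payment are assumptions that the text does not state.

```python
from enum import Enum


class TableState(Enum):
    OPEN = "open"
    PAYING = "paying"
    CLOSED = "closed"


# Allowed moves. The event names are hypothetical, and the PAYING -> OPEN edge
# (payment failed or abandoned) is an assumption; the text only names
# open -> paying -> closed.
TRANSITIONS = {
    (TableState.OPEN, "payment_started"): TableState.PAYING,
    (TableState.PAYING, "payment_confirmed"): TableState.CLOSED,
    (TableState.PAYING, "payment_failed"): TableState.OPEN,
}


class Table:
    def __init__(self, number: int) -> None:
        self.number = number
        self.state = TableState.OPEN

    def handle(self, event: str) -> TableState:
        """Apply an event, or raise if it is not valid in the current state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"table {self.number}: {event!r} not allowed in {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state


table = Table(12)
table.handle("payment_started")
table.handle("payment_confirmed")
print(table.state.value)  # closed
```

Keeping the transitions in one table means an out-of-order webhook (for example a confirmation for a table that is already closed) fails loudly instead of silently flipping the handheld.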
Underneath: a WhatsApp bot, a finite-state machine for every table’s lifecycle, a NestJS backend taking the webhook, a Python agent on a Raspberry Pi inside the shop driving the PoS over its undocumented TCP protocol, and a TokenManager keeping a working session alive across restarts.
The full PoS integration is the rest of the docs.
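One way a token manager can keep a session alive across restarts is to cache the token and its expiry on disk. The sketch below borrows only the `TokenManager` name from the text; the file format, the `mint` callable and the one-hour lifetime are all assumptions made for illustration.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable, Optional, Tuple


class TokenManager:
    """Caches a session token on disk so a restart does not force a new login."""

    def __init__(self, path: Path, mint: Callable[[], Tuple[str, float]]) -> None:
        self._path = path
        self._mint = mint  # hypothetical callable: returns (token, lifetime in seconds)

    def get(self) -> str:
        cached = self._load()
        if cached is not None and cached["expires_at"] > time.time():
            return cached["token"]  # still valid: no new login needed
        token, lifetime = self._mint()
        record = {"token": token, "expires_at": time.time() + lifetime}
        self._path.write_text(json.dumps(record))
        return token

    def _load(self) -> Optional[dict]:
        try:
            return json.loads(self._path.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            return None  # no usable cache yet


mint_calls = []


def fake_mint() -> Tuple[str, float]:
    # Stand-in for the real login call, which is not shown in the text.
    mint_calls.append(1)
    return "demo-token", 3600.0


with tempfile.TemporaryDirectory() as folder:
    session_file = Path(folder) / "session.json"
    first = TokenManager(session_file, fake_mint).get()
    second = TokenManager(session_file, fake_mint).get()  # simulates a restart

print(first == second, len(mint_calls))  # True 1
```

The second instance stands in for the agent process coming back up: it finds a valid token on disk and never calls the login endpoint.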
The technology was running. The business stopped working in three ways at once.
The fee gap. Our transactions were card-not-present. The credit-card machine on the counter was card-present. With the same anticipation of receivables, a competitive in-person rate was ~3%; ours was ~7%. We were not going to close that gap by writing better software.
The bigger ship arriving. iFood started rolling out their own in-restaurant card machine, one that connected delivery data to the in-person transaction and turned every check into a personalized loyalty program for the guest. A loyalty club for restaurants that already had iFood’s entire customer graph on day one. We had a payment flow.
The priority gap. We were in love with this product as customers. For the restaurant owner, it was not in the top three things keeping them up at night.
02 / Architecture
The solution had to work without the founders in the room.
After ~20 payments processed by hand on the floor until 2 a.m., the client clearly saw the value. A new hypothesis followed:
does this work when we’re not in the room?
Our Achilles’ heel: the waiters’ ability to recommend the payment system at every table. What follows is how we took ourselves out of the loop.
A single pre-bill, in motion
t ≈ 0 → 180 ms
Guest scans a QR, pays in WhatsApp; our backend drives the agent, which pushes the packet to the POS. The waiter’s handheld flips on its next poll — we’re nowhere in this picture.
To get this topology, we had to connect several layers that live in different places: our WhatsApp finite automaton runs on an Oracle Cloud VM, while the restaurant’s POS runs locally on a private network inside the shop.
Where each piece actually lives.
Trust zones, stacked
3 zones · 1 tunnel
- Zone 01 · Vendor (we never touch): Vendor cloud (auth · sync)
- Zone 02 · Our cloud (we host): NestJS (Oracle Cloud VM); WhatsApp automaton (finite-state machine); Meta webhook (WhatsApp Cloud API)
- Zone 03 · Shop LAN (we reach in): POS (Windows · vendor binary); Handheld (Android · waiter app); FastAPI agent (Python · Raspberry Pi)
The Pi initiates the tunnel — the cloud drives the agent without exposing anything to the open internet.
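The idea that the shop side dials out and the cloud then drives it over that same connection can be shown on a single machine. This is a loopback sketch of the pattern only: the text does not say which tunnel technology was used, and the line-based command format and the `run_agent` name are assumptions.

```python
import socket
import threading


def run_agent(cloud_addr: tuple) -> None:
    """Shop side: dial OUT to the cloud, then answer commands on that socket."""
    with socket.create_connection(cloud_addr) as conn:
        # Iterating blocks until the cloud sends a line; it ends when the cloud hangs up.
        for line in conn.makefile("r", encoding="utf-8"):
            conn.sendall(f"ack {line.strip()}\n".encode("utf-8"))


# Cloud side: the only listening socket in the whole setup.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 = any free port, for the demo
listener.listen(1)

threading.Thread(
    target=run_agent, args=(listener.getsockname(),), daemon=True
).start()

conn, _ = listener.accept()  # the agent's outbound connection arrives here
conn.sendall(b"send_pre_bill table=12\n")  # the cloud drives the agent over it
reply = conn.makefile("r", encoding="utf-8").readline().strip()
print(reply)  # ack send_pre_bill table=12

conn.close()
listener.close()
```

Nothing on the shop side ever listens for inbound connections, so no port has to be opened on the restaurant's router.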
Enough of topology — how our agent actually spoke to the POS.
An example of how a pre-bill is sent to the POS — the integration runs backwards, we’re the ones sending. Every field was lifted from MITM Wireshark captures off the waiters’ Android handheld; none are ours, they’re how the POS already talks.
Our message is a sequence of bytes and, like a TCP or UDP packet, it has to follow a fixed layout so the receiver can parse it. Each block is one piece of that layout.
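To make "formatted bytes" concrete, here is a minimal sketch of packing fields into a frame and reading them back. The layout is entirely hypothetical and is not the vendor's wire format, which is not given here; only the field values (protocol v2, action 3, table 12, employee 19) come from the example.

```python
import struct

# Hypothetical layout, NOT the vendor's real wire format:
#   version:u8 | action:u8 | table:u16 | employee:u16 | token_len:u16 | token bytes
# ">" means big-endian with no padding, so the fixed header is 8 bytes.
HEADER = struct.Struct(">BBHHH")


def pack_message(version: int, action: int, table: int, employee: int, token: str) -> bytes:
    token_bytes = token.encode("ascii")
    header = HEADER.pack(version, action, table, employee, len(token_bytes))
    return header + token_bytes


def unpack_message(frame: bytes) -> tuple:
    version, action, table, employee, token_len = HEADER.unpack_from(frame)
    start = HEADER.size
    token = frame[start:start + token_len].decode("ascii")
    return version, action, table, employee, token


frame = pack_message(version=2, action=3, table=12, employee=19, token="demo-token")
print(len(frame))             # 18
print(unpack_message(frame))  # (2, 3, 12, 19, 'demo-token')
```

The length prefix before the token is what lets a receiver know where a variable-size field ends, which is the same job each block in a captured frame has to do.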
Sent by employee 19, with the POS’s session token, keyed off both an in-payload guid and a protocol-level message id — so a retry doesn’t double-print.
- Action: 3 · pre-bill
- Table: 12
- Employee: 19
- Captured at: 17:40 UTC
- Session: 7a3f…0b7c
- Message id: p12e…3e12
- Protocol: v2
- Class: PostActionMessage
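The retry-safety described above depends on resending the same identifiers, so the receiver can recognise a repeat. The sketch below illustrates that idea with a stand-in receiver; `PosStub`, `build_pre_bill` and the dictionary shape are hypothetical, and how the real POS deduplicates internally is an assumption.

```python
import uuid


class PosStub:
    """Stand-in for the POS: acts once per (guid, message_id) pair."""

    def __init__(self) -> None:
        self._seen = set()
        self.printed = []

    def receive(self, message: dict) -> str:
        key = (message["guid"], message["message_id"])
        if key in self._seen:
            return "duplicate"  # a retry: acknowledge, but do not print again
        self._seen.add(key)
        self.printed.append(message["table"])
        return "printed"


def build_pre_bill(table: int, employee: int) -> dict:
    # The ids are generated once, when the message is built, and never on resend.
    return {
        "action": 3,
        "table": table,
        "employee": employee,
        "guid": str(uuid.uuid4()),
        "message_id": uuid.uuid4().hex,
    }


pos = PosStub()
message = build_pre_bill(table=12, employee=19)
first = pos.receive(message)
retry = pos.receive(message)  # same object resent, e.g. after a timeout
print(first, retry, len(pos.printed))  # printed duplicate 1
```

The design point is on the sender's side: a retry must reuse the original message rather than rebuild it, otherwise fresh ids would make the duplicate look like a new pre-bill.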
One more thing
Where did the TOKEN come from?
Without it, every packet we built was a 401.
Captures gave us the shape of the message — not the credentials behind it. The POS trusts whatever TOKEN the handheld presents; the vendor cloud is the one that mints it. To impersonate the handheld, we had to find that minting call.
Could we read the code the Android device was actually running?
Yes — and here are three of the files that mattered.
pt/vp/vpmapi/networkutils/VPApi.java
The first file gives up the admin email, the admin password, and the client_id. The second shows how those fields are POSTed to mint the TOKEN. The third is the TCP message builder — where that TOKEN is spliced into every wire frame. Four credentials, three files. The only one missing: the client_secret.
Turns out it was an empty string.
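The shape of such a minting call can be sketched as a form-encoded POST that carries an empty `client_secret`. The endpoint URL, the `grant_type` value and the form field names below are assumptions modelled on the OAuth 2.0 password grant, not the vendor's actual API, and the request is only built here, never sent.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

TOKEN_URL = "https://vendor.example/oauth/token"  # hypothetical endpoint


def build_token_request(email: str, password: str, client_id: str,
                        client_secret: str = "") -> Request:
    """Build (but do not send) a form-encoded POST asking for a token."""
    body = urlencode({
        "grant_type": "password",        # assumption: OAuth 2.0 password grant
        "username": email,
        "password": password,
        "client_id": client_id,
        "client_secret": client_secret,  # an empty string is still sent as a field
    }).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values only; none of these are real credentials.
request = build_token_request("admin@example.com", "placeholder", "demo-client")
fields = parse_qs(request.data.decode("ascii"), keep_blank_values=True)
print(fields["client_secret"])  # ['']
```

An empty secret still appears in the body as `client_secret=`, which is easy to misread in a capture as a missing value rather than a deliberate one.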
03 / Demo
Our view of the shop, in your browser.
The same floor we walked into every shift — about forty tables across an inside salão and an outside terraço. A square lights up the moment a guest pays. Hover to scan the room; click to see the open orders and the action we’d run next.