14.4 million kirana owners trust the Soundbox for payments. Now it answers their real business questions, in Hinglish, hands-free.
The Soundbox already runs 10 to 14 hours a day on kirana counters. Merchants know it. They trust it. The only thing missing is intelligence.
Paytm has 30 days of transaction data for every merchant. But accessing it mid-sale requires unlocking a phone, opening an app, and navigating. That friction kills the insight.
Merchants already double-press the Soundbox button to hear their daily total. They do not want to open an app mid-transaction. Hands-free is the only UX that works.
"Customer bola payment nahi hua" is the number one merchant pain point. Resolving it currently means finding a phone, opening transaction history, and scrolling. It should take one sentence.
Merchants do not know their peak hours, slow days, or weekly trends unless they dig through the app. Decisions get made blind.
Paytm already has the behavioral data to say "you qualify for 80,000 rupees." Merchants never hear it unless they go looking.
A conversational AI layer added to the Soundbox the merchant already owns. No new hardware. No app download. No onboarding. Just ask.
Speak in Hinglish. The Soundbox listens, understands, and responds in under 3 seconds.
Web Speech API, hi-IN
Reasons over 30 days of real transaction data. Not generic advice, not a lookup. Actual reasoning.
Paytm Inference API
Responds the way merchants actually talk. Warm, direct, specific. Like a trusted munshi, not a chatbot.
llama-3.3-70b
Built on Paytm's own Inference API. P50 latency under 80 ms on Groq LPUs.
"Aaj kitna hua?" or "Loan milega?" or "Customer bola payment nahi hua." Any natural query works.
Web Speech API, hi-IN locale
The hi-IN locale handles Hinglish code-switching naturally. Typed input and quick-ask buttons serve as fallbacks for noisy environments.
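A minimal sketch of how the recognition step with its fallback could be wired up, assuming the browser exposes `SpeechRecognition` (or the prefixed `webkitSpeechRecognition`). The function name `createHinglishRecognizer`, the `onFallback` callback, and the injectable `scope` parameter are hypothetical and exist only for illustration; they are not taken from the actual demo code.

```typescript
// Local structural type for the parts of SpeechRecognition used here,
// so the sketch does not depend on DOM typings being available.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  maxAlternatives: number;
  onresult: ((event: any) => void) | null;
  onerror: ((event: any) => void) | null;
  start(): void;
}

// Returns a recognizer configured for hi-IN, or null when the Web Speech API
// is unavailable. In that case onFallback is called so the UI can switch to
// typed input or quick-ask buttons (hypothetical callback).
function createHinglishRecognizer(
  onQuery: (text: string) => void,
  onFallback: () => void,
  scope: any = globalThis, // injectable for testing; defaults to the page's global
): RecognitionLike | null {
  const Ctor = scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
  if (!Ctor) {
    onFallback();
    return null;
  }
  const recognizer: RecognitionLike = new Ctor();
  recognizer.lang = "hi-IN"; // locale named in the text above
  recognizer.interimResults = false; // only deliver the final transcript
  recognizer.maxAlternatives = 1;
  recognizer.onresult = (event) => {
    // First result, first alternative: the most likely transcript.
    onQuery(event.results[0][0].transcript);
  };
  // Treat any recognition error (noise, no speech, denied mic) as a fallback.
  recognizer.onerror = () => onFallback();
  return recognizer;
}
```

Calling `start()` on the returned recognizer from a button handler would begin listening. A null return means the page should show the typed-input path straight away.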
90%+ accuracy in quiet conditions
Pre-computed transaction summaries are injected into the system prompt. The model does not calculate; it formats exact figures into a Hinglish response. No hallucinations by design.
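The idea of pre-computing figures and injecting them into the system prompt can be sketched as follows. The `Txn` shape, the field names, and the prompt wording are all assumptions made for illustration, and timestamps are assumed to be ISO 8601 strings in the merchant's local time. The real transaction schema and prompt are not shown in the source.

```typescript
// Hypothetical transaction record; the real schema is not given in the source.
interface Txn {
  amountInr: number;
  timestamp: string; // assumed ISO 8601 in local time, e.g. "2025-01-15T10:05:00+05:30"
  status: "SUCCESS" | "FAILED";
}

interface DailySummary {
  date: string;
  totalInr: number;
  successCount: number;
  failedCount: number;
}

// All arithmetic happens here, in ordinary code, before the model is called.
function summarizeDay(txns: Txn[], date: string): DailySummary {
  const todays = txns.filter((t) => t.timestamp.startsWith(date));
  const succeeded = todays.filter((t) => t.status === "SUCCESS");
  return {
    date,
    totalInr: succeeded.reduce((sum, t) => sum + t.amountInr, 0),
    successCount: succeeded.length,
    failedCount: todays.length - succeeded.length,
  };
}

// The model only receives finished numbers and is told to phrase them,
// not to compute anything. The prompt wording is illustrative.
function buildSystemPrompt(summary: DailySummary): string {
  return [
    "You are a warm, direct assistant for a kirana merchant.",
    "Reply in Hinglish in 2 to 3 sentences.",
    "Use ONLY the figures below. Do not calculate or estimate new numbers.",
    `Date: ${summary.date}`,
    `Total received: ${summary.totalInr} INR`,
    `Successful payments: ${summary.successCount}`,
    `Failed payments: ${summary.failedCount}`,
  ].join("\n");
}
```

Keeping the sums in `summarizeDay` means any number the merchant hears can be traced back to a deterministic computation rather than to model output.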
Paytm Inference API, llama-3.3-70b-versatile
2 to 3 sentences. Specific numbers. Warm tone. Web Speech Synthesis reads the response aloud while the text appears on screen.
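The read-aloud step might look roughly like this, assuming the browser provides `speechSynthesis` and `SpeechSynthesisUtterance`. The function name `respond`, the `showText` callback, and the `scope` parameter are hypothetical. Using `hi-IN` as the voice locale is also an assumption carried over from the recognition step.

```typescript
// Shows the response text immediately and, if speech synthesis is available,
// reads it aloud. Returns true when speech was started, false otherwise.
function respond(
  text: string,
  showText: (text: string) => void, // hypothetical UI callback
  scope: any = globalThis, // injectable for testing; defaults to the page's global
): boolean {
  // The text appears on screen regardless of whether audio works.
  showText(text);

  const synth = scope.speechSynthesis;
  const Utterance = scope.SpeechSynthesisUtterance;
  if (!synth || !Utterance) {
    return false; // no Web Speech Synthesis here; text-only response
  }

  const utterance = new Utterance(text);
  utterance.lang = "hi-IN"; // assumed voice locale
  synth.cancel(); // stop any earlier response that is still being read
  synth.speak(utterance);
  return true;
}
```

Calling `showText` before `speak` keeps the screen and the audio in step, and the merchant still gets an answer on devices without a Hindi voice installed.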
Web Speech Synthesis API
Tap any query to see the kind of response the CFO gives.
No backend. No build step. One HTML file. Open in browser, demo instantly.
The Soundbox generates behavioral data every day. Soundbox CFO closes the loop by returning that intelligence to the merchant.
85% of Paytm's merchant loans go to Soundbox users. The device already generates the data for loan approval. Surfacing eligibility proactively through voice converts passive data into active revenue.
Runs on the Soundbox that is already deployed. The AI layer is software only.
At 10 queries per day, 150 tokens each, on Paytm Inference pricing. Well within subscription economics.
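The usage assumption above works out to a small monthly token volume. The sketch below does that arithmetic. The 30-day month and the price parameter are assumptions, since the source does not state Paytm Inference pricing, so the cost function takes the price as an input rather than supplying one.

```typescript
// Tokens consumed per merchant per month under the stated usage assumption.
function monthlyTokens(
  queriesPerDay: number,
  tokensPerQuery: number,
  daysPerMonth: number = 30, // assumed month length
): number {
  return queriesPerDay * tokensPerQuery * daysPerMonth;
}

// Cost for a given token volume. pricePerMillionTokens is a placeholder
// input in any currency; no real price is implied here.
function monthlyCost(tokens: number, pricePerMillionTokens: number): number {
  return (tokens * pricePerMillionTokens) / 1_000_000;
}

// 10 queries per day at 150 tokens each, as stated in the text.
const tokensPerMerchant = monthlyTokens(10, 150); // 45,000 tokens per month
```

At 45,000 tokens per merchant per month, the per-merchant cost is 4.5% of whatever the price per million tokens turns out to be.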
"The AI Soundbox will become a CFO, COO, and CMO for every merchant, giving them intelligence previously only available to large enterprises."