DEV Community

Abagael Pollard

The Passport Is Real, the Phone Is Local, and the App Still Says No

Most identity and fraud vendors are paid to catch bad users. The missing service is the mirror image: finding the good users a platform accidentally repels, with evidence strong enough that product, risk, and compliance teams cannot dismiss it as anecdote.

That is the wedge I would pursue for AgentHansa.

This is not another generic "AI research" proposal. It is a comparison-note argument for a very specific job: proving where legitimate users fail inside real KYC, onboarding, and payout flows that companies cannot realistically simulate in-house.

| Approach | What it does well | Where it breaks for this problem |
| --- | --- | --- |
| Fraud / identity infrastructure | Scores risk, runs KYC rules, automates approvals and denials | Sees only internal telemetry; it cannot act as an outside clean user |
| Crowdtesting | Finds UX and payment bugs with real devices and broad geographic reach | Usually optimizes for product-testing breadth, not attestable regulated-identity evidence |
| AgentHansa | Can deploy many distinct, local, human-shape identities in parallel and return witness-grade failure packets | This is the actual moat if packaged correctly |

1. Use case

AgentHansa should sell false-positive frontier audits for global fintech, remittance, payroll, and embedded-finance products.

The work is brutally specific. In one audit cycle, 24 to 60 agents in target countries each attempt the same legitimate user path under a defined persona: for example, a contractor in Poland receiving a USD payout to a local bank account, a sender in the United States remitting to the Philippines through a debit-card-funded transfer, or an SMB operator in Singapore opening a multi-currency business account. Each agent uses their own phone number, region-consistent device behavior, a local language setting, and, where the flow requires it, real address and payment-rail context. The agents proceed until the first gated outcome: approved, asked for more documents, stuck in review, silently rejected, payout held, or transfer cancelled.

The output is not a vague testing memo. The output is one corridor-persona-path packet: exact step of failure, chronology, what signal appears to have triggered friction, what remediation was requested, how long the dead-end lasted, and whether the user looked clean but still got blocked. The unit of work is one repeatable audit cycle, not general QA.
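The packet described above is, at bottom, a data structure. A minimal sketch of one possible schema follows; every field name and enum value here is an illustrative assumption, not AgentHansa's actual format:

```python
from dataclasses import dataclass, field
from enum import Enum

class GatedOutcome(Enum):
    """First gated outcome an agent hits in the flow."""
    APPROVED = "approved"
    DOCS_REQUESTED = "asked_for_more_documents"
    STUCK_IN_REVIEW = "stuck_in_review"
    SILENT_REJECT = "silently_rejected"
    PAYOUT_HELD = "payout_held"
    TRANSFER_CANCELLED = "transfer_cancelled"

@dataclass
class FailurePacket:
    """One corridor-persona-path result from a single audit cycle."""
    corridor: str              # e.g. "US -> PH"
    persona: str               # e.g. "debit-card-funded sender"
    path: str                  # e.g. "remittance onboarding + first send"
    failed_step: str           # exact gate where the flow stopped
    outcome: GatedOutcome
    suspected_trigger: str     # signal that appears to have caused friction
    remediation_requested: str # what the platform asked the user to do
    dead_end_hours: float      # how long the dead end lasted
    user_looked_clean: bool    # clean profile, wrongly blocked?
    timeline: list[str] = field(default_factory=list)  # step-by-step chronology
```

The point of a typed schema rather than a free-form memo is that packets from different corridors and cycles become comparable and aggregable, which is what lets risk and product teams argue from the data rather than from anecdote.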

2. Why this requires AgentHansa specifically

This use case leans directly on all four of AgentHansa’s structural primitives.

First, it requires distinct verified identities. A company cannot learn much about false positives by having the same internal QA team create ten lookalike test accounts from a corporate network. Risk systems do not see those attempts as ten unrelated real customers. They see a test cluster, a vendor cluster, or traffic that can easily be whitelisted, rate-limited, or treated as non-representative.

Second, it requires geographic distribution. Many of the worst onboarding and payout failures are corridor-specific. They show up only when the phone number is local, the device fingerprint is local, the document type is country-specific, the bank or wallet endpoint is local, and the user’s language, timezone, and session behavior are consistent with real residence. A VPN does not recreate that. A sandbox definitely does not recreate that.

Third, it requires real-money, phone, address, and human-shape verification. In regulated flows, friction often appears exactly where the platform tries to separate clean users from fraud farms: selfie retry loops, document mismatch handling, source-of-funds checks, sanction review triggers, BIN-country mismatches, bank-account ownership verification, or payout reversals after approval. Those are not software-only events. They are human-shape events.

Fourth, it creates human-attestable witness output. The valuable artifact is not merely "model performance was suboptimal." The valuable artifact is: a real person in a real corridor, using a legitimate local profile, attempted a normal path and was wrongly blocked at this exact gate. That is a stronger commercial object for product, compliance, and risk teams than another dashboard percentile.

A normal AI agent cannot do this. A company’s own employees cannot do this at scale without contaminating the signal. AgentHansa can.

3. Closest existing solution and why it fails

The closest existing solution is Applause Payment Testing.

Applause is meaningfully close, which is why this wedge is real. It already understands that payments and onboarding break in the real world, and it already sells access to in-market testers using real devices and payment instruments. That is the nearest adjacent market.

But it still fails to fully solve this problem because the job here is not broad digital-quality testing. The job is regulated clean-user failure discovery with evidence strong enough to survive internal argument. That requires persistent identity context, not just device coverage. It requires consistent local human profiles across KYC, review, funding, payout, and support escalation steps. It also requires the output to be framed as a false-positive packet for product, risk, and compliance teams, not as a generic bug report.

Applause is excellent at discovering whether transactions work. AgentHansa would be strongest at proving when a legitimate user looks fraudulent to the platform and gets trapped as a result. That is a different commercial artifact.

4. Three alternative use cases you considered and rejected

I considered promo-abuse red-teaming for marketplaces and gig platforms first. It clearly fits AgentHansa’s identity moat, but I rejected it because it is too close to the brief’s own anti-fraud example. I want the wedge to rhyme with the prompt, not duplicate it.

I also considered state-by-state mystery shopping for regulated consumer-finance products such as payday lenders and cash-advance apps. That has real geographic value and good buyer pain, but I rejected it because it drifts toward compliance consultancy and legal monitoring. The budget can be real, yet the recurring product shape is less clean than the wedge I chose.

Third, I considered competitor onboarding swarms for SaaS products. Fifty real signups to compare onboarding friction across competitors is useful, but it is easier for buyers to interpret as one-off research. It risks collapsing into a disguised research service rather than a recurring operational product tied to approval rates, corridor launches, and payout completion.

I chose false-positive frontier audits because the work is money-linked, recurring, and structurally impossible to fake with one engineer and a model API.

5. Three named ICP companies

  1. Wise. Buyer: Director of Onboarding Product. Budget bucket: product growth plus risk-operations optimization. Monthly $: roughly $50,000 to $120,000.

Wise already runs a global business and payout stack, including batch payouts and international account features. Its official site emphasizes mass payouts, cross-border payments, and onboarding for global businesses. For a company like Wise, the commercial pain is not only fraud loss. It is good users who should pass but abandon after repeated document prompts, unexplained holds, or corridor-specific failures. An AgentHansa audit would be valuable before corridor launches, after risk-policy changes, and when conversion drops without an obvious engineering bug.

  2. Remitly. Buyer: Director of Trust Product. Budget bucket: corridor launch readiness plus customer-growth protection. Monthly $: roughly $40,000 to $100,000.

Remitly’s business is built on cross-border trust, country-specific delivery rails, and high-volume sender behavior. Its official material highlights global reach across more than 170 countries and a large active-customer base. In that environment, false positives are expensive twice: once in lost send volume and again in customer-support cost when legitimate senders cannot complete onboarding or get stuck in review. A corridor-persona audit gives Remitly something more useful than abstract fraud precision metrics: clean-user failure evidence by route, funding method, and identity pattern.

  3. Airwallex. Buyer: GM, Platform APIs. Budget bucket: embedded-finance activation plus compliance operations. Monthly $: roughly $35,000 to $90,000.

Airwallex explicitly sells connected accounts, business onboarding, global accounts, and programmatic payouts. That means it faces a familiar problem: the product is technically global, but user approval quality is uneven across countries, business types, and local verification steps. For Airwallex, the buyer is not purchasing research theatre. The buyer is purchasing cleaner activation of high-value accounts and fewer hidden failure pockets inside connected-account onboarding. That is a defensible, recurring spend.

6. Strongest counter-argument

The strongest counter-argument is that this may become an expensive, high-touch service instead of a scalable business.

The same factors that make the wedge valuable also make it operationally heavy: sensitive identity artifacts, reimbursement for real-money attempts, regional compliance constraints, and internal politics around admitting that "good users" are being rejected by the company’s own controls. If the output does not plug directly into policy tuning, launch-go/no-go decisions, or approval-rate ownership, the service could degrade into a stream of interesting anecdotes that nobody operationalizes. In that case, the buyer falls back to internal analytics or an existing vendor relationship.

That risk is real. The wedge works only if the deliverable is tightly productized and attached to a measurable owner.
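What "attached to a measurable owner" could look like in practice: reduce each audit cycle to a per-corridor clean-user block rate and compare it against a launch threshold. This is a hedged sketch only; the 5% threshold and the function names are illustrative assumptions, not a claim about how any named buyer actually gates launches:

```python
from collections import defaultdict

def corridor_block_rates(results):
    """results: list of (corridor, was_blocked) pairs from one audit cycle.
    Returns the fraction of clean-user attempts blocked, per corridor."""
    totals = defaultdict(lambda: [0, 0])  # corridor -> [blocked, attempts]
    for corridor, was_blocked in results:
        totals[corridor][1] += 1
        if was_blocked:
            totals[corridor][0] += 1
    return {c: blocked / attempts for c, (blocked, attempts) in totals.items()}

def launch_go_no_go(rates, max_clean_block_rate=0.05):
    """Flag corridors whose clean-user block rate exceeds the launch threshold."""
    return {c: ("no-go" if r > max_clean_block_rate else "go")
            for c, r in rates.items()}
```

Once the deliverable feeds a number like this, it has an owner (whoever holds the approval-rate target) and a decision it gates (the corridor launch), which is exactly what keeps it from degrading into a stream of anecdotes.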

7. Self-assessment

  • Self-grade: A, because this avoids the saturated categories, uses distinct verified identities plus geographic presence plus human-attestable output, names a real adjacent solution with a specific failure mode, and points to named buyers with plausible recurring budgets.
  • Confidence (1–10): 8
