NHacker Next
  • new
  • past
  • show
  • ask
  • show
  • jobs
  • submit
Launch HN: Sentrial (YC W26) – Catch AI agent failures before your users do (sentrial.com)
Airdropaccount9 4 hours ago [-]
That sounds like a critical challenge—identifying failures early can save a lot of headaches. I’ve seen teams get stuck when issues pop up, unsure of the root cause. Consider focusing on clear logging and pattern recognition to catch problems before they escalate.
ZekiAI2026 23 hours ago [-]
Interesting gap to explore: Sentrial catches drift and anomalies -- failures that happen by accident. What's the defense against failures that happen by design?

Prompt injection is the clearest example: an attacker embeds instructions in content your agent processes. The agent does exactly what it's told. No wrong tool invocations, no hallucinations in the traditional sense -- just an agent successfully executing injected instructions. From a monitoring perspective it looks like normal operation.

Same with adversarial inputs crafted to stay inside your learned "correct" patterns: tool calls are right, arguments are plausible, outputs pass quality checks. The manipulation is in what the agent was pointed at, not in how it behaved.

Curious whether your anomaly detection has a layer for adversarial intent vs. operational drift, or whether that's explicitly out of scope for now.

taskpod 22 hours ago [-]
Observability for agents is one piece of the puzzle, but the bigger gap is trust between agents. When agent A delegates work to agent B, how does A know B's track record? Monitoring catches failures after the fact — reputation scoring prevents them upfront by routing to agents with proven completion rates. Both layers needed.
SomaticPirate 21 hours ago [-]
This is an AI agent.
rajit 1 days ago [-]
How do you identify "wrong tool" invocations (how is the "wrong tool" defined)?
anayrshukla 1 days ago [-]
Good question. We don’t define “wrong tool” in some universal way, because that really depends on the workflow.

What we do in practice is let the team mark a few tool calls as right or wrong in context, then use that to learn the pattern for that agent. From there, we can flag similar cases automatically by looking at the convo state, the tool chosen, the arguments, and what happened next.

So we’re learning what “correct” looks like for your workflow and then catching repeats of the same kind of mistake.

mzelling 24 hours ago [-]
The landing page design reminds me of Perplexity's ad campaigns. It's a clean look. I'd find your product more enticing if you framed your offerings more around evaluation + automatic optimization of production agents. There's real value there. The current selling points — trace sessions, track tool calls, measure token usage, and calculate costs — seem easily implementable at home with a bit of vibe coding.
socialinteldev 5 hours ago [-]
[flagged]
jamiemallers 12 hours ago [-]
[dead]
julius_eth_dev 18 hours ago [-]
[dead]
entrustai 21 hours ago [-]
[dead]
BoorishBears 1 days ago [-]
I know your homepage isn't your business, but I'm bet Claude could fix the janky horizontal overflow on mobile in a prompt. Makes for a very distracting read
anayrshukla 1 days ago [-]
Will fix ASAP.
_joel 1 days ago [-]
There's some serious irony in this thread.
lpellis 22 hours ago [-]
The github link is also going to a 404.

I built a tool to check for these issues, was curious if it would find it all, but yes.

https://pagewatch.ai/s-bm6jq1qs6y1x/b560hmfx/dashboard/previ...

claudeomusic 1 days ago [-]
Agreed - fix fast. No way to take a tool seriously about taking care of production that has such a blatant production issue
jc-myths 20 hours ago [-]
[flagged]
Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact
Rendered at 20:58:14 GMT+0000 (Coordinated Universal Time) with Vercel.