Metacognition in our brains is, of course, part of cognition, and it's constrained by the time needed for a response and your energy budget (e.g. it's lower when you're tired).
It seems humans subconsciously adjust these constraints constantly in real time too.
There’s just a ton of machinery before we get close to something we can call intelligence.
But in the meantime, we should be grounding a model’s responses by giving it context and letting it research with tools, and running secondary out-of-band processes that go over what the model emitted: enriching the UI with “warning: possibly untrue/nuanced” underlines, call-outs, and sources, and outright killing some responses before they ever reach a user. Constrained output could also let the model self-enrich with sources; Anthropic and OpenAI both have API modes and UIs that do this.
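A minimal sketch of such an out-of-band reviewer. Everything here is illustrative (the heuristics and field names are made up, not any vendor's API): it scans an emitted response, underlines spans that may need a warning, and can block a response outright.

```python
# Hypothetical out-of-band response reviewer. The heuristics below are
# deliberately naive stand-ins for a real fact-checking pass.
import re

def review(response: str) -> dict:
    """Annotate a model response before it reaches the user."""
    annotations = []
    # Naive heuristic: unhedged numeric claims get a warning underline.
    for m in re.finditer(r"\b\d[\d,.]*%?\b", response):
        annotations.append({
            "span": m.span(),
            "label": "warning: possibly untrue/nuanced",
        })
    # Kill switch: block responses that assert certainty with no sources.
    blocked = "definitely" in response.lower() and "http" not in response
    return {"blocked": blocked, "annotations": annotations}

print(review("Revenue definitely grew 40% last year."))
```

A real implementation would replace both heuristics with retrieval or a verifier model, but the shape is the same: annotate, then gate.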
This repository empirically proves computational semiotics.
The only “metacognitive” (2nd order) and metapragmatic (3rd order) model I’m aware of.
ryandvm 2 days ago [-]
Unproductive tangent: Why do we call it "hallucinating" instead of "bullshitting" when that is so clearly what it is?
If I'm talking to a guy that says, "I have a really fast metabolism, that's why I can eat whatever I want", he's not hallucinating - he's full of shit.
readthenotes1 2 days ago [-]
Avoiding crass language may be one reason.
Just think, one of the hullabaloos of today is because our ancestors were too Victorian to put "sex" on the birth certificate.
cyanydeez 2 days ago [-]
bullshitting is an intent; hallucination is a misrepresentation of something.
I agree that the way models are trained, their basic presentation is bullshitting, but models don't have intent. Nor, do I think, can they be trained to have intent. They have a k value that introduces non-determinism, and that leads to the bullshitting.
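The "k value" above presumably refers to top-k sampling. A toy sketch of it (illustrative only; the token probabilities are invented, and real models sample over a full vocabulary):

```python
# Illustrative top-k sampling. Sampling from the truncated, renormalized
# distribution is where the non-determinism comes in.
import random

def top_k_sample(token_probs: dict, k: int, rng: random.Random) -> str:
    """Keep the k most likely tokens, renormalize, and sample one."""
    top = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    tokens = [t for t, _ in top]
    weights = [p / total for _, p in top]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up next-token distribution for "The capital of France is ..."
probs = {"Paris": 0.6, "Lyon": 0.2, "Berlin": 0.15, "Rome": 0.05}
rng = random.Random(0)
# With k=2, only "Paris" and "Lyon" can ever be emitted, but which one
# varies from draw to draw.
samples = {top_k_sample(probs, k=2, rng=rng) for _ in range(50)}
```

So the randomness is bounded (nothing outside the top k can appear), but within that set the model can confidently emit a lower-probability token.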
But the intent part is what I don't think bullshitting properly describes. LLMs don't have intent.
I do think you could refer to their affect as narcissism. Every time I see them bullshit something, it's often because it thinks there's no way what _it's doing_ is wrong; like when it tries to run pgsql using default settings (despite being told the credentials), it reports it can't connect because the database isn't running, then it goes and tries to run it. Sometimes it gets it back up correctly, sometimes not.
But it's not bullshitting, because there's no intent; it's essentially just the kind of narcissism that assumes everything else is wrong.
holtkam2 2 days ago [-]
IDK if the author's 'metacognition' needs to be a feature of the LLM itself.
I could imagine a harness that 1) reads LLM output, 2) uses a research sub-agent to attempt to verify any factual claims, and 3) rephrases the main agent's output so it conveys uncertainty when a factual claim cannot be independently verified.
spacebacon 1 day ago [-]
It does need to be, and doing so is possible at zero cost to CE. The SRT does just that; it actually improves the CE.