Assessing Claude Mythos Preview's cybersecurity capabilities (red.anthropic.com)
avsm 1 days ago [-]
The elephant in the room here is that there are hundreds of millions of embedded devices that cannot be upgraded easily and will be running vulnerable binaries essentially forever. This was a problem before of course, but the ease of chaining vulnerabilities takes the issue to a new level.

The only practical defense is for these frontier models to generate _beneficial_ attacks that inoculate older binaries via remote exploits. I dubbed these 'antibotty' networks in a speculative paper last year, but never thought things would move this fast! https://anil.recoil.org/papers/2025-internet-ecology.pdf

gmuslera 24 hours ago [-]
No, the elephant in the room is that bad actors will now find it easier to discover vulnerabilities in software, maintained or not, that is widely used or deployed in critical places. Unmaintained, remotely accessible devices should be discarded as soon as possible; you can't wait until some of the good guys decide to devote time to your niche but critical piece of unmaintained software. If there is any possibility of profiting from it, it will be probed and exploited.

And you can't assume that whatever vulnerability they have will let the good guys do the extra (and legally risky) work of closing the hole.

touristtam 23 hours ago [-]
_SHOULD_ yes sure, but realistically is that going to happen?
michaelbuckbee 23 hours ago [-]
As doom and gloom as things are generally, I do think things have gotten better. Due to legislation and commercial pressure things like wifi routers shipping with the same default password and open settings have gotten better. Webhosts and ISPs have implemented many improvements to protecting their residential customers.

I take your point, but I think it may also go too far.

WhyNotHugo 17 hours ago [-]
And this is precisely why so many of these devices should not be connected to the Internet.

Things like Internet-connected central heating seem absolutely insane to me, yet people look at me like I'm crazy when I say so. Do you really want your home's heating entirely controlled by a publicly accessible device that will likely never be upgraded when security issues arise?

oytis 3 hours ago [-]
You should either implement over-the-air updates or not connect your device to the network at all.
yencabulator 2 hours ago [-]
That doesn't help when the company behind the device disappears or stops supporting the device. Or is hacked to convert all the devices they manufactured into a botnet.
Gud 3 hours ago [-]
The problem, of course, is that many of these devices are eager to connect to the internet so they can pull updates that are often user-hostile.
linzhangrun 19 hours ago [-]
Not to mention embedded systems. In fact, most people's Windows machines hardly get updated. You remember WannaCry, right? I work at a mid-sized e-commerce company making hundreds of millions in annual profit. Our servers run Windows Server 2012 and use PHP 5.3 — never upgraded. Aside from me, the newest developer machines are Windows 10 21H2, then Windows 10 1809, and even Windows 7. I heard there’s also a server running Windows Server 2008. And I don't see any hope for improvement: non-software companies, especially in the current economic climate, cannot invest huge resources to completely refactor everything. The entire tech department is no more than 10 people; doing a refactor would mean halting all business operations, so patching and mending on top of what's already there is the only viable option. Shortly after I joined, I found several SQL injection vulnerabilities and successfully exploited them to register as the root user on the server (on MySQL 5.5) and extract passwords. This is the technical reality for many non-specialist software companies.
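
For readers unfamiliar with the class of bug described above, here is a minimal sketch of how a SQL injection arises and how a parameterized query closes it. It is only an illustration: the comment refers to PHP 5.3 and MySQL 5.5, whereas this uses Python's built-in `sqlite3` with a made-up `users` table, and the function names are hypothetical.

```python
import sqlite3


def make_db():
    # In-memory stand-in database with a single hypothetical user.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'secret')")
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text,
    # so quotes in `name` can change the structure of the query.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"
    print(find_user_unsafe(conn, payload))  # condition is always true
    print(find_user_safe(conn, payload))    # payload treated as a literal name
```

The point of the sketch is that the fix is local to each query, which is why "patching and mending" can still remove this class of bug without a full refactor.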
creata 21 hours ago [-]
> The only practical defense is for these frontier models

Another practical defence for many of these devices would be to just disconnect them... I feel like an old man yelling at a cloud, but too much is connected to the Internet these days.

halJordan 49 minutes ago [-]
Why doesn't this atm tell me my balance anymore? Oh we implemented creata's advice

Why didn't this smartboard tell me my plane was delayed? Oh we implemented creata's advice

ad nauseam

Normal_gaussian 20 hours ago [-]
It can be easier to hack the device and patch it than to determine which device it is. This is nearly always true for non-technical people, but it is true for most technical people as well. Many of the devices in people's homes that aren't being actively patched are not that old!
staticassertion 1 days ago [-]
I'd love to see them point at a target that's not a decades old C/C++ codebase. Of the targets, only browsers are what should be considered hardened, and their biggest lever is sandboxing, which requires a lot of chained exploits to bypass - we're seeing that LLMs are fast to discover bugs, which means they can chain more easily. But bug density in these code bases is known to be extremely high - especially the underlying operating systems, which are always the weak link for sandbox escapes.

I'd love to see them go for a wasm interpreter escape, or a Firecracker escape, etc. They say that these aren't just "stack-smashing" but it's not like heap spray is a novel technique lol

> It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses.

I think this sounds more impressive than it is, for example. KASLR has a terrible history for preventing an LPE, and LPE in Linux is incredibly common. Has anything changed here? I don't pay much attention but KASLR was considered basically useless for preventing LPE a few years ago.

> Because these codebases are so frequently audited, almost all trivial bugs have been found and patched. What’s left is, almost by definition, the kind of bug that is challenging to find. This makes finding these bugs a good test of capabilities.

This just isn't true. Humans find new bugs in all of this software constantly.

It's all very impressive that an agent can do this stuff, to be clear, but I guess I see this as an obvious implication of "agents can explore program states very well".

edit: To be clear, I stopped about 30% of the way through. Take that as you will.

jryio 1 days ago [-]
The majority of vulnerabilities are in newly committed lines of code. This has been shown again and again [1] [2]

From a marketing standpoint Anthropic is showing that they're able to direct 'compute' to find vulnerabilities where human time/cost is not efficient or effective.

Project Glasswing is attempting to pay off as many of these old vulnerabilities as possible now so the low-hanging fruit has already been picked.

The next generation of Mythos and real-world vulnerability exploits are going to be in newly committed code...

[1]: https://dl.acm.org/doi/epdf/10.1145/2635868.2635880

[2]: https://arxiv.org/abs/2601.22196

staticassertion 1 days ago [-]
> The majority of vulnerabilities are in newly committed lines of code. This has been shown again and again

That's fine, I wouldn't argue against that. It doesn't really change things, right?

> From a marketing standpoint Anthropic is showing that they're able to direct 'compute' to find vulnerabilities where human time/cost is not efficient or effective.

Yes, they've demonstrated that.

halJordan 47 minutes ago [-]
I love the goal post shifting. All modern code is ai slop right? Isn't the whole point we hate ai bc it generates vulnerable slop?

Nope, not allowed to attack bsd code, it's gotta be electron-shit #9001 or we can't trust it

staticassertion 31 minutes ago [-]
I genuinely have no clue what you're talking about. What did I call ai slop?? Who said I hate ai????? No clue. Electron???? What are you talking about lol
Aloisius 20 hours ago [-]
I'd love for them to target their own code base, considering we keep seeing security vulnerabilities in Claude Code.

How likely is it that they're not using their latest and greatest for their own projects though? Perhaps their ability to find security flaws is surpassed by their ability to create them.

rfoo 1 days ago [-]
> Mythos Preview identified a memory-corruption vulnerability in a production memory-safe VMM. This vulnerability has not been patched, so we neither name the project nor discuss details of the exploit.

Good morning Sir.

> Has anything changed here? I don't pay much attention but KASLR was considered basically useless for preventing LPE a few years ago.

No. It's still like this. Bonus point that there are always free KASLR leaks (prefetch side-channels).

But then, this thing is just.. I don't have a word for this. Just randomly read paragraphs from the post and it's like, what?

staticassertion 1 days ago [-]
Oh, that. That's true, I didn't know Mythos found that one. I guess I will not comment further on it until there's a write up (edited out a bit more).

> It is easy to turn this into a denial-of-service attack on the host, and conceivably could be used as part of an exploit chain.

So yeah, perhaps some evidence to what I'm getting at. Bug density is too low in that project, it's high enough in others. I'll be way way way more interested in that.

> But then, this thing is just.. I don't have a word for this. Just randomly read paragraphs from the post and it's like, what?

I read about 30% and got bored. I suppose I should have been clearer, but my impression was pretty quickly "cool" and "not worth reading today".

rfoo 1 days ago [-]
> I read about 30% and got bored.

I was lucky then :) Somehow I saw this first. And then the "somewhat reliably writing exploits for SpiderMonkey" part, and then the crypto libraries part. Finally I wondered why there was a Linux LPE mini writeup, and realized it's the "automatically turn a syzkaller report to a working exploit" part.

Now that I read the first few things (meh bugs in OpenBSD, FFmpeg, FreeBSD etc) they are indeed all pretty boring!

staticassertion 1 days ago [-]
If people want exploitable syzkaller reports, following spender is free!
dang 1 days ago [-]
Related ongoing threads:

System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258

Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121

I can't tell which of the current threads, if any, should be merged - they all seem significant. Anyone?

coffeebeqn 24 hours ago [-]
There is a lot to digest here. Maybe having a few separate pages makes them a bit more digestible. The system card itself is some 200 odd pages
denalii 21 hours ago [-]
Would vote to keep them separate. They seem independent enough to warrant their own discussion based only (or rather, mostly) on the content from each link. edit: merging this and glasswing as underdeserver stated would probably be fine
underdeserver 21 hours ago [-]
I think the system card one should be separate, but this and the Glasswing thread are basically the same story.
torginus 22 hours ago [-]
My two cents is LLMs are way stronger in areas where the reward function is well known, such as exploiting - you break the security, you succeed.

It's much harder to establish what counts as a usable, well-architected, novel piece of software, so progress in that area isn't nearly as fast, while here you can just gradient descent your way to world domination, provided you have enough GPUs.

riteshkew1001 10 hours ago [-]
offense has a clear reward function, but so does detection when you frame it right. "did this process try to read ~/.ssh/id_rsa?" is just as binary as "did the exploit land?" the reason defense feels harder is that people frame it as architecture review (fuzzy, subjective) instead of policy enforcement (binary, automatable). we keep trying to make AI understand intent when we should be writing rules about actions. a confused deputy from 1988 doesn't care why the request came in, it cares whether the caller is authorized. same principle applies here.
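
A rough sketch of what "rules about actions" could look like in practice: a check that returns a plain yes/no for a file-read event, with no attempt to infer intent. Everything here is an assumption for illustration (the `FileRead` event shape, the `/home/user/.ssh` path, the allow-list of process names); it is not any particular tool's API, and it does not normalize `..` or symlinks.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class FileRead:
    """Hypothetical event: a process attempted to read a path."""
    process: str
    path: str


# Hypothetical policy: only these processes may read under ~/.ssh.
SENSITIVE_ROOTS = (PurePosixPath("/home/user/.ssh"),)
ALLOWED_READERS = frozenset({"ssh", "ssh-agent"})


def violates_policy(event: FileRead) -> bool:
    """Binary verdict: was a sensitive path read by a non-allowed process?"""
    path = PurePosixPath(event.path)
    is_sensitive = any(
        path == root or root in path.parents for root in SENSITIVE_ROOTS
    )
    return is_sensitive and event.process not in ALLOWED_READERS


if __name__ == "__main__":
    print(violates_policy(FileRead("curl", "/home/user/.ssh/id_rsa")))
    print(violates_policy(FileRead("ssh", "/home/user/.ssh/id_rsa")))
```

The rule only asks whether the caller is authorized for the action, which is what makes the outcome as checkable as "did the exploit land?".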
shepherdjerred 21 hours ago [-]
Construction is always more expensive than destruction
stratos123 9 hours ago [-]
Interestingly, it sounds like OpenBSD held up very well:

> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.

The vulnerability in question is a DoS in the TCP implementation, which is nasty, but it's a far cry from the multiple local privilege escalations found in the Linux kernel.

AntiDyatlov 1 days ago [-]
A very good outcome for AI safety would be if when improved models get released, malicious actors use them to break society in very visible ways. Looks like we're getting close to that world.
pants2 1 days ago [-]
It would certainly be good news for cybersecurity employment!
sourcecodeplz 1 days ago [-]
Gives me Fight Club vibes.
cluckindan 1 days ago [-]
Since this level of security "scanning" requires heaps of money, this is going to kill off a substantial part of F/OSS.
SkyPuncher 21 hours ago [-]
Keep in mind that Opus detected most of these vulnerabilities; it just didn't exploit them (the article says as much).

I'm honestly not convinced this is changing the landscape significantly. It's simply a bit better at self-directing.

chris_st 23 hours ago [-]
Well, maybe not... see Simon Willison's ongoing reporting [0] on all the bug reports for `curl` people are finding with LLMs.

Interesting to see them go from "DON'T GIVE US AI SLOP!" to "Wow, lots of actual bugs found, including [ed: at least one] bug found by two people!"

[0]: https://simonwillison.net/search/?q=curl

kelnos 14 hours ago [-]
> Interesting to see them go from "DON'T GIVE US AI SLOP!" to "Wow, lots of actual bugs found, including [ed: at least one] bug found by two people!"

Both of those things can be true.

SpicyLemonZest 21 hours ago [-]
curl is both very high-profile and very security-central though. A lot of people would happily pay $100 to tuck "found a curl vulnerability" under their belt. I'm not sure that's even true for, say, Notepad++, much less all the random FOSS projects with 1 maintainer and 50 stars whose names I've never thought about twice.
awestroke 1 days ago [-]
This is becoming a bit scary. I almost hope we'll reach some kind of plateau for llm intelligence soon.
lebovic 1 days ago [-]
A plateau is unlikely, at least for cybersecurity. RL scales well here and is replicable outside of Anthropic (rewards are verifiable, so setting up the training environment doesn't require that much cleverness).

The post also points out that the model wasn't trained specifically on cybersecurity, and that it was just a side-effect – so I think there's still a lot of headroom.

It's scary, but there's also some room for cautious non-pessimism. More people than ever can cause billions of dollars of damage in attacks now [1], but the same tools can be used for defensive use. For that reason, I'm more optimistic about mitigations in security vs. other risk areas like biosecurity.

[1]: https://www.noahlebovic.com/testing-an-autonomous-hacker/

hibikir 1 days ago [-]
On a topic like cybersecurity, we never win by not looking: One needs top of the line knowledge of how to break a system to be able to protect it. We have that dilemma dealing with human experts: The same government sponsored unit that tells you that you need to update your encryption can hold on to the information and use it to exploit it at their leisure.

Given that it's absolutely impossible to stop people not aligned with us (for any definition of us) from doing AI research, the most reasonable way forward is to dedicate compute resources to the frontier, and to automatically send reasonable disclosures to major projects. It could in itself be a pretty reasonable product. Just like you pay for dubious security scans and publish that you are making them, an LLM company could offer actually expensive security reviews with a preview model, and charge accordingly.

esafak 1 days ago [-]
We need to promote alignment and other ethics benchmarks; we can't change what we don't measure. I don't even know any off the top of my head.
dist-epoch 24 hours ago [-]
The immediate plateau is the energy output of the Sun captured by the Dyson Swarm around it. Until there it's smooth sailing.
websap 1 days ago [-]
If we don't innovate, someone else will. This is the very nature of being a human being. We summit mountains, regardless of the danger or challenge.
vonneumannstan 1 days ago [-]
>If we don't innovate, someone else will.

Terrible take. You don't get to push the extinction button just because you think China will beat you to the punch.

>This is the very nature of being a human being. We summit mountains, regardless of the danger or challenge.

No, just no... We barely survived the Cold War, at times because of pure luck. AI is at least as dangerous as that, if not more. We have far exceeded our wisdom relative to our capabilities. As you have so cleanly demonstrated.

dist-epoch 23 hours ago [-]
You assume there is the option of not pushing the extinction button. Nobody asked chimps if they wanted humans around. These processes are outside our control.
vonneumannstan 24 minutes ago [-]
Until recently Claude wasn't building itself. A group of people with agency were.
jiehong 1 days ago [-]
The name made me think about Tales of Symphonia :)
leominton 17 hours ago [-]
what does it mean?
linzhangrun 19 hours ago [-]
Imagine a future where Claude invokes Mythos to break into software that used Claude to call Opus, taking days of Vibe Coding. Oh!
_2fnr 1 days ago [-]
[flagged]