TurboQuant: Redefining AI efficiency with extreme compression (research.google)

14 points by davidbarker 1 day ago | 1 comment

Reubend 1 day ago

This looks great, but I'm wondering how effective this would be for full model weights rather than just the KV cache. Their paper only gives results for the KV cache use case, which strikes me as strange since the algos are claimed to be near optimal.
