Managers (image, media.piefed.zip)
Posted by inari@piefed.zip to People Twitter@sh.itjust.works · English · 14 days ago · 179 comments

Kaligalis@lemmy.world · 14 days ago
It might not be as impossible as it sounds. Some of the “open” models are rumored to be able to code. The real problem is that you likely need something with 128 GiB VRAM to run them with a reasonably large context window.

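To make the VRAM point concrete, here is a rough back-of-the-envelope sketch of where the memory goes: the model weights plus the key/value cache, which grows with the context window. Every number in it (27 billion parameters, 16-bit values, 64 layers, 8 KV heads, head size 128, a 131072-token context) is a made-up assumption for illustration, not the configuration of any real model, and it ignores activations and runtime overhead.

```python
# Rough VRAM estimate: weights + KV cache. All model numbers are hypothetical.
GIB = 2**30


def weights_gib(n_params, bytes_per_param):
    """Memory needed just to hold the model weights, in GiB."""
    return n_params * bytes_per_param / GIB


def kv_cache_gib(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value):
    """Memory for the key/value cache of one sequence, in GiB.

    The factor 2 counts one key and one value vector per token,
    per layer, per KV head.
    """
    values = 2 * n_layers * n_kv_heads * head_dim * context_tokens
    return values * bytes_per_value / GIB


if __name__ == "__main__":
    # Hypothetical 27B-parameter model stored in 16-bit precision.
    w = weights_gib(27e9, 2)
    # Hypothetical architecture and a 131072-token context window.
    kv = kv_cache_gib(64, 8, 128, 131072, 2)
    print(f"weights:  {w:.1f} GiB")
    print(f"KV cache: {kv:.1f} GiB")
    print(f"total:    {w + kv:.1f} GiB (before activations and overhead)")
```

Under these assumptions the cache for a single long context is a sizeable fraction of the weights, which is why the context window and not just the parameter count drives the requirement. Quantizing the weights or the cache to fewer bytes per value shrinks both terms proportionally.
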
IratePirate@feddit.org · 14 days ago
An Nvidia B200 (192 GB of RAM) sells for somewhere between 30-50k a pop. That’s feasible for a company.

Kazumara@discuss.tchncs.de · 13 days ago
And then you can serve one inference at a time. Hopefully your devs are well distributed over timezones :-)

Diurnambule@jlai.lu · 13 days ago
Wonderful idea, maybe they can all connect to the same PC, and we can call it a mainframe or something. xD

baronofclubs@lemmy.world · 13 days ago
I don’t see why it wouldn’t be feasible to rent someone else’s computer to use for something like this, seeing how it could amortize costs over time.

mindbleach@sh.itjust.works · 14 days ago
Qwen’s 27B model from April outperforms its 397B model from February. Local and small were always going to win.

Diurnambule@jlai.lu · 13 days ago
Qwen 3.6? It is unstable, though. It goes awry more often than the 3.5 of the same size.