I’ve been using Qwen 2.6 locally to good effect still. What are you on now? 2.7 seems good too but I’m very cagey about funding AI companies and so have had limited access. Same deal with Deepseek4 which seems alright.
Recently? Gemma4 locally. Just got around to playing with the 12B Quantization-Aware Training model a little today. Surprisingly powerful for only 12 billion parameters. What surprises me most is the efficiency stuff that’s packed into it. I’ve only got 8 GB of VRAM, yet I can get 64k of context with faster-than-I-can-read output. Before that I was using the 26B MoE model. Same context but faster, at 20 tps.
We are starting to get to the era where people can run their own models that are more than capable for the average user on pretty standard hardware.
I run one on a laptop, something like a 3B, and it’s slow. I’m thinking of aiming for something that can run a 12B. I think I can steal the setup from work and rebuild it locally for myself. What is your setup for running a 12B? Is VRAM the bottleneck? Can I budget down everything else and just get a good GPU? Also, are there any recommendations for a truly jailbroken AI that I can tweak myself, without built-in censoring or safety guards?
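For the VRAM question, a rough back-of-envelope sketch may help. It estimates memory for the model weights only, as parameters times bits per weight. The function name `weight_memory_gib` and the bit-widths are illustrative assumptions, not the specific quantization discussed above, and the estimate ignores the KV cache for context and any runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB.

    Hypothetical helper for illustration: ignores KV cache,
    activations, and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Illustrative bit-widths: full half-precision, 8-bit, and 4-bit quantization.
    for bits in (16, 8, 4):
        print(f"12B at {bits}-bit: ~{weight_memory_gib(12, bits):.1f} GiB")
```

By this estimate the weights alone shrink roughly in proportion to the bit-width, which is why quantization matters so much on a small GPU. Whatever does not fit in VRAM has to live in system RAM, which is typically slower.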
Qwen was the gold standard for small local models for a minute, too. Gemma4 doing some good stuff with optimization.