Today I'm playing with my Raspberry Pi 5 16GB and the new-fangled "AI Hat+ 2". I don't really like Docker, but I'll play along, following the documentation provided, to get a nice "Open WebUI" web interface for the chats:
https://www.raspberrypi.com/documentation/computers/ai.html#step1-llm
I'm curious if any of the models they provide are any good:
- "deepseek_r1_distill_qwen:1.5b"
- "llama3.2:3b"
- "qwen2.5-coder:1.5b"
- "qwen2.5-instruct:1.5b"
- "qwen2:1.5b"

Can anyone vouch for these models?
-
@d1 they are basically toy models
-
@slyecho woe is me, who spent too much on this toy
-
@d1 well, nothing wrong with toys. I also have a lot of SBCs and Raspberry Pi stuff to play with. But yeah, these models can probably run on the CPU too
-
@slyecho I was hoping that the "whisper" audio-to-text LLMs would gain hardware acceleration for that Hailo NPU (on that AI Hat+ 2), but alas, there's no mention of Hailo/Raspberry Pi hardware yet, here:
-
@d1 tiny models are fun to play with and can do basic things, but I'd hardly qualify a 1.5-3b parameter model as a "large" language model. It starts getting interesting at around 30-70b+, or 400b+ if you have the VRAM for it. Stuff like ChatGPT 5.2 and Gemini 3 Pro and such are trillions and trillions of parameters. Gemini 3 Flash is ~30b ish if I recall correctly.
That said, you should definitely try playing with them; they can still do basic things in small contexts.
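(A rough back-of-envelope on why those sizes matter, with my own numbers rather than anything from the Pi docs: a model's weight footprint is roughly parameter count times bytes per parameter, ignoring KV cache and runtime overhead, which is why 1.5-3b models fit comfortably in a Pi 5's 16GB while 70b-class models don't.)

```python
# Rough weight-memory estimate: params * bytes_per_param.
# Ignores KV cache and runtime overhead, so real usage is higher.

def weight_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate weight footprint in decimal GB for a given precision."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

for name, params in [("qwen2:1.5b", 1.5), ("llama3.2:3b", 3.0), ("70b-class", 70.0)]:
    for bits in (16, 4):  # fp16 vs. 4-bit quantized
        print(f"{name:12s} @ {bits:2d}-bit ~ {weight_gb(params, bits):6.1f} GB")
```

So even at fp16 a 1.5b model is ~3 GB of weights, while a 70b model needs ~35 GB even at 4-bit, well past what a Pi 5 can hold.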
-
@anthropy thanks for the consolation
-
@d1 IIRC people use these kinds of models for things like Home Assistant automation, so you can tell a voice assistant to turn lights on/off and such. In tiny contexts like that they seem to work fine, although I personally haven't tried that yet.