• brucethemoose@lemmy.world
    18 hours ago

    Dense models that would fit in 100-ish GB, like Mistral Large, would be really slow on that box, and there isn't a SOTA MoE for that size yet.

    So, unless you need tons of batching/parallel requests, it's… kinda neither here nor there?

    As someone else said, the calculus changes with cheaper Strix Halo boxes (assuming those mini PCs are under $3K).
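
    To put rough numbers on the dense-vs-MoE point above, here's a back-of-the-envelope sketch. It assumes single-stream decoding is memory-bandwidth bound, so every generated token has to read all the active weights once. The bandwidth and model sizes below are made-up placeholders, not specs for any real box or model, and `decode_tokens_per_sec` is just a name I picked for illustration.

```python
def decode_tokens_per_sec(active_weight_gb: float, mem_bandwidth_gb_s: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes decoding is memory-bandwidth bound: each token reads every
    active weight once, so speed ~= bandwidth / bytes of active weights.
    """
    return mem_bandwidth_gb_s / active_weight_gb


# Placeholder numbers (assumptions, not measurements or real specs).
BANDWIDTH_GB_S = 250.0       # hypothetical memory bandwidth of the box
DENSE_ACTIVE_GB = 100.0      # dense model: all ~100 GB of weights are active per token
MOE_ACTIVE_GB = 10.0         # hypothetical MoE of similar total size, ~10 GB active per token

dense_tps = decode_tokens_per_sec(DENSE_ACTIVE_GB, BANDWIDTH_GB_S)
moe_tps = decode_tokens_per_sec(MOE_ACTIVE_GB, BANDWIDTH_GB_S)

print(f"dense: ~{dense_tps:.1f} tok/s, MoE: ~{moe_tps:.1f} tok/s")
```

    This is also why batching changes the picture: with many parallel requests, one pass over the weights serves several sequences at once, so the per-token cost of reading a big dense model gets amortized.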