I’ve recently played with the idea of self-hosting an LLM. I am aware that it will not reach GPT-4 levels, but being able to prompt with confidential data without restraint is a very nice tool for me to have.
Has anyone got experience with this? Any recommendations? I have downloaded the full Reddit dataset so I could retrain the model on it, since selected communities provide immense value and knowledge (hehe, this is exactly what Reddit, Twitter, etc. are trying to avoid…)
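For reference, here’s a minimal sketch of what local inference looks like with Hugging Face transformers. The checkpoint name is just an example; any open model (like the ones suggested in the replies) loads the same way, given enough RAM:

```python
# Minimal local-inference sketch with Hugging Face transformers.
# The model ID below is one example of an open checkpoint; swap in
# whatever model you actually want to self-host.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openlm-research/open_llama_7b"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Everything stays on your own machine, confidential data included.
inputs = tokenizer("Summarize this internal memo:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```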
Note that when using LLaMA-derived models, such as Vicuna, you are bound by their license to use them for “research” purposes only.
If you want an unrestricted version, go for OpenLLaMA or RedPajama.
Falcon is less restrictive and only wants a cut of profits if they exceed 1 million dollars, but I’d wager that fully unrestricted is the way to go.
Falcon has switched to Apache 2.0 and removed the commercial limit.
Sorry, I must’ve missed that somehow; then my comment only applies to LLaMA and its direct derivatives.
How do you know how much RAM the model needs?
The model creator usually mentions it in the readme.
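If the readme doesn’t say, a decent rule of thumb is parameter count × bytes per parameter, plus some headroom for activations and the KV cache. A back-of-envelope sketch (the 20% overhead factor is my own assumption, not a hard rule):

```python
def estimate_ram_gb(params_billions: float, bytes_per_param: float,
                    overhead: float = 1.2) -> float:
    """Rough RAM estimate: weight size plus ~20% overhead for
    activations and KV cache (the overhead factor is an assumption)."""
    return params_billions * bytes_per_param * overhead

# fp16 weights: 2 bytes/param; 4-bit quantized: ~0.5 bytes/param
print(estimate_ram_gb(7, 2))    # ~16.8 GB for a 7B model in fp16
print(estimate_ram_gb(7, 0.5))  # ~4.2 GB for a 7B model at 4-bit
```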
Usually the models support CPU inference. Tremendously slow, but it works in a pinch.
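For example, a minimal CPU-only sketch with llama-cpp-python (assuming it’s installed; the GGUF filename is a placeholder for whatever quantized model you download):

```python
# CPU-only inference sketch via llama-cpp-python
# (pip install llama-cpp-python). Runs entirely on the CPU by default.
from llama_cpp import Llama

# Placeholder path: point it at any quantized GGUF file you have locally.
llm = Llama(model_path="./open_llama_7b.Q4_K_M.gguf", n_ctx=2048)

out = llm("Q: Why self-host an LLM? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

Quantized 4-bit models are the usual choice here, since they cut RAM use to roughly a quarter of fp16 at a modest quality cost.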