• 0 Posts
  • 1 Comment
Joined 2 years ago
cake
Cake day: June 19th, 2023

help-circle
  • 8B parameter tag is the distilled llama 3.1 model, which should be great for general writing. 7B is distilled qwen 2.5 math, and 14B is distilled qwen 2.5 (general purpose but good at coding). They have the entire table called out on their huggingface page, which is handy to know which one to use for specific purposes.

    The full model is 671B and unfortunately not going to work on most consumer hardwares, so it is still tethered to the cloud for most people.

    Also, it being a made in China model, there are some degree of censorship mandated. So depending on use case, this may be a point of consideration, too.

    Overall, it’s super cool to see something at this level to be generally available, especially with all the technical details out in the open. Hopefully we’ll see more models with this level of capability become available so there are even more choices and competition.