Llama 2: ChatGPT Rival Free for Commercial Use, With a Catch
Published on July 18, 2023.
Several hours ago, Meta AI released Llama 2. Trained on 40% more data than its predecessor and with twice the context size, Llama 2 looks like the most promising open-source model available. In fact, it can even be used commercially, but with a catch.
Changes from First Version
Meta (formerly Facebook) has previously released several large language models (LLMs), of which OPT and LLaMA (V1) are the best known.
Ready-Made Chat Model
Llama 2 ships with an official chat model (Llama 2-Chat), fine-tuned by Meta for dialogue. Previously, universities, companies, and individuals had to fine-tune the first version of LLaMA (V1) on chat datasets themselves, usually conversations generated by ChatGPT and shared on a conversation-sharing platform called ShareGPT.
The official chat model lets individuals chat with the model immediately, skipping the difficult, expensive, and resource-intensive process of fine-tuning one themselves, which can take anywhere from several hours to months.
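As a minimal sketch of what "instantly chatting" looks like, here is one way to query the chat model through the Hugging Face transformers library. The gated repo name meta-llama/Llama-2-7b-chat-hf and the [INST] prompt format come from Meta's Hugging Face release; running this assumes your download application was approved, you are logged in via huggingface-cli login, and the accelerate package is installed.

```python
# Minimal chat sketch, assuming approved access to the gated
# meta-llama/Llama-2-7b-chat-hf repo and accelerate installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama 2 chat models expect user turns wrapped in [INST] ... [/INST].
prompt = "[INST] What should I name my pet llama? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```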
Different Name Capitalization
LLaMA (V1)'s unusual capitalization caused widespread confusion in the open-source community. Llama 2 drops it, capitalizing only the first letter of the name.
Larger Training Dataset
Llama 2 has a 40% larger training dataset, comprising 2 trillion tokens. This helps the model score much higher on many evaluations.
Larger Context Length
Llama 2's context window is twice as large as the first version's. At 4,096 tokens (a word is usually 1-2 tokens), Llama 2 can remember far more of what the user previously mentioned in a conversation.
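To make that token budget concrete, here is a small sketch that counts how many of the 4,096 context tokens a conversation consumes, using the model's own tokenizer. The gated meta-llama/Llama-2-7b-hf repo name is an assumption; any Llama 2 tokenizer would give the same counts.

```python
# Count how much of Llama 2's 4,096-token context a conversation uses.
# Assumes approved access to the gated meta-llama/Llama-2-7b-hf repo.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

history = (
    "User: Hi! My name is Alex.\n"
    "Assistant: Nice to meet you, Alex!\n"
    "User: What's my name?"
)
n_tokens = len(tokenizer.encode(history))
print(f"{n_tokens} of 4096 context tokens used")
```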
Higher Quality
The smallest LLaMA (V1) model, at 7 billion parameters, scored an average of 49.7 points across the ARC, HellaSwag, MMLU, and TruthfulQA evaluations (according to the Open LLM Leaderboard), whereas the 7-billion-parameter Llama 2 scored an average of 54.4. In comparison, ChatGPT scores around 85.
Different Model Sizes
Previously, LLaMA (V1) came in 4 parameter sizes (a larger parameter count generally means higher quality): 7 billion, 13 billion, 33 billion, and 65 billion. At full precision, even the 7-billion-parameter model can barely run on the most expensive MacBook, which puts the models out of reach for individuals without expensive equipment. Even with advanced optimizations such as 4-bit quantization (see the sketch below), the models still run extremely slowly.
Llama 2 offers only 3 parameter sizes: 7 billion, 13 billion, and 70 billion. The leap from 13 to 70 billion is dramatic, shutting out many who could run a 33-billion-parameter model but not a 70-billion one.
In comparison, GPT-3, the model family behind ChatGPT, has 175 billion parameters.
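For reference, here is a hedged sketch of the 4-bit quantization mentioned above, using the bitsandbytes integration in the transformers library. The load_in_4bit flag requires the bitsandbytes and accelerate packages plus a CUDA GPU, and the gated repo name again assumes an approved application.

```python
# Load the 13B model with weights quantized to 4 bits, cutting memory
# use to roughly a quarter of fp16. Requires bitsandbytes, accelerate,
# a CUDA GPU, and approved access to the gated repo.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",
    load_in_4bit=True,
    device_map="auto",
)
```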
Faster Application Acceptance
Meta was notoriously slow to approve applications to download LLaMA (V1). For example, I signed up several months ago and still haven't heard back. The models were eventually leaked through BitTorrent and GitHub, making them available to the general public.
Llama 2 applications are approved much faster: I applied several hours ago, and my request has already been accepted.
Already Embraced by Community
Llama 2 has already been converted to a multitude of formats by the open-source community, including the GGML format used by llama.cpp, which allows it to run on consumer hardware.
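As a sketch of what running such a conversion looks like in practice, the llama-cpp-python bindings can load one of those community GGML files on an ordinary CPU. The file name below follows the community's naming convention and is an assumption; substitute whichever conversion you downloaded.

```python
# Run a community 4-bit GGML conversion of the 7B chat model on a CPU.
# The file name is an assumption; use whatever GGML file you downloaded.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-7b-chat.ggmlv3.q4_0.bin")
result = llm("[INST] Suggest a name for a pet llama. [/INST]", max_tokens=64)
print(result["choices"][0]["text"])
```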
Available for Commercial Use… With a Catch
Llama 2 is free for commercial use. However, the fine print of its license states:
If [...] the monthly active users [...] is greater than 700 million monthly active users [per month,] you must request a license from Meta...
I am not a lawyer, but as I read the fine print, services with over 700 million monthly active users cannot use the model without first requesting a separate license from Meta.
Download Llama 2 Yourself
You can apply to download Llama 2 from the official Meta website.