Decoding by Contrasting Layers (DoLa) is a technique that suggests a different way of computing next-token probabilities in a transformer; it is described in this paper. What is interesting is that, without any changes to the model weights, a code change in the decoding step alone can give a noticeable boost in factuality and fewer hallucinations.
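In rough terms, DoLa compares the next-token distribution read off the final ("mature") layer with one read off an earlier ("premature") layer, and boosts the tokens whose probability grows as information flows up the network. Here is a minimal sketch of just that contrast step (my own simplification: the actual method also selects the premature layer dynamically by divergence, and the function below is made up for illustration):

```python
import torch
import torch.nn.functional as F

def dola_contrast(final_logits: torch.Tensor, early_logits: torch.Tensor,
                  alpha: float = 0.1) -> torch.Tensor:
    """Contrast the final layer's next-token distribution against an early layer's.

    Simplified sketch: real DoLa also picks the early ("premature") layer
    dynamically from a set of candidate layers.
    """
    log_p_final = F.log_softmax(final_logits, dim=-1)
    log_p_early = F.log_softmax(early_logits, dim=-1)

    # Plausibility constraint: keep only tokens to which the final layer
    # already assigns at least alpha * (its max probability).
    threshold = log_p_final.max(dim=-1, keepdim=True).values + torch.log(torch.tensor(alpha))
    plausible = log_p_final >= threshold

    # Tokens whose probability grows between the early and the final layer
    # get boosted; everything else is suppressed.
    contrast = log_p_final - log_p_early
    return contrast.masked_fill(~plausible, float("-inf"))

# Greedy pick over the contrasted scores (both logits would come from the
# respective layers' hidden states projected through the LM head):
# next_token = dola_contrast(final_logits, early_logits).argmax(dim=-1)
```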
A few days ago a PR was merged into the Hugging Face Transformers library implementing this trick.
It happened that I had MT-Bench set up while tinkering with a 1.6B model and running evals. The LLM Judge relies on HF Transformers, so it was easy to do a quick trial of DoLa and see whether it improves the chatbot's overall performance (reasoning, coding, writing, etc.).
I installed Transformers from source (the new feature is not available on PyPI yet):
pip install git+https://github.com/huggingface/transformers
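Just to make sure the dev build actually exposes the new parameter before kicking off the benchmark (this quick check is my own addition, not part of the FastChat setup):

```python
import transformers
from transformers import GenerationConfig

print(transformers.__version__)  # should be a .dev0 build installed from git
# dola_layers was added to GenerationConfig by the PR; None means disabled
print(getattr(GenerationConfig(), "dola_layers", "not available"))
```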
Made a change to `gen_model_answer.py`, adding the `dola_layers` param:
output_ids = model.generate(
    torch.as_tensor(input_ids).cuda(),
    do_sample=do_sample,
    temperature=temperature,
    max_new_tokens=max_new_token,
    dola_layers='low'
)
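For reference, here is what the same call looks like outside of FastChat. This is only an illustrative sketch (the model path is a placeholder, not my actual checkpoint); per the Transformers docs, `dola_layers` also accepts `'high'` or an explicit list of layer indices, and a small repetition penalty is suggested alongside DoLa.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path-or-hub-id-of-your-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("The capital of Australia is", return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
    dola_layers="low",       # or "high", or a list of layer indices
    repetition_penalty=1.2,  # suggested in the docs to curb repetition with DoLa
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```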
Then I ran MT-Bench three times: with the param commented out, set to `low`, and set to `high`. Here are the results:
Mode: single
Input file: data/mt_bench/model_judgment/gpt-4_single.jsonl

| model | first turn | second turn | average |
|---|---|---|---|
| stablelm-2-brief-1_6b_r57_no_dola | 4.8375 | 3.475 | 4.15625 |
| stablelm-2-brief-1_6b_r57_dola_low | 4.6125 | 3.700 | 4.15625 |
| stablelm-2-brief-1_6b_r57_dola_high | 3.9500 | 2.825 | 3.38750 |
As you can see, while the first-turn score went down with DoLa, the second-turn score actually improved for the `low` setting. The results shouldn't be taken as representative, though: in my experience MT-Bench scores can vary by about 10% between runs. Overall, if there is any effect here, it is marginal.