Explanations
placeholders and instructions
Negative Logits
᧐       0.46
(‘      0.40
ߋ       0.37
(“      0.37
(’      0.36
("      0.36
('      0.35
():     0.35
(`      0.33
ုန်     0.33
Positive Logits
here    0.59
aqui    0.58
Aqui    0.55
ListOf  0.55
εδώ     0.55
هنا     0.54
}=(     0.54
here    0.54
இங்கு   0.54
tutaj   0.54
Activations Density 0.109%