Explanations
front lines and frontrunners
Negative Logits
in      1.20
ן       1.16
ע       1.16
of      1.14
is      1.13
л       1.08
и       1.04
v       1.03
ים      1.02
]       0.92
Positive Logits
Авто    0.96
Ро      0.95
Три     0.95
Ни      0.95
Пер     0.94
ن       0.94
Сер     0.91
На      0.90
Ра      0.90
Ин      0.90
Activations Density 0.004%