INDEX
Explanations
No Explanations Found
Negative Logits
И    1.34
С    1.10
and    1.08
У    1.02
和    0.98
Ба    0.96
Я    0.94
Га    0.93
פ    0.93
Ш    0.93
Positive Logits
zelfde    1.29
ında    1.22
ěji    1.20
ía    1.20
że    1.13
is    1.09
ే    1.09
jenigen    1.08
ী    1.07
はもちろん    1.06
Activations Density 3.996%