INDEX
Explanations
phrases related to positioning or locations
Negative Logits
Chel      -0.65
씀        -0.63
מוצ       -0.61
dib       -0.61
تك        -0.59
GOB       -0.59
Bask      -0.59
ity       -0.58
Ott       -0.58
переди    -0.58
Positive Logits
на        1.38
بوابة     1.21
На        1.07
На        1.06
na        1.03
auf       1.01
På        1.00
Na        1.00
Na        0.99
Auf       0.99
Activations Density: 0.045%