INDEX
Explanations
comparative measurements and ratios in various contexts
Negative Logits
inus
-0.15
zell
-0.14
mando
-0.14
raud
-0.14
zde
-0.13
reau
-0.13
gia
-0.13
íny
-0.13
sj
-0.13
ści
-0.13
Positive Logits
pit
0.16
hang
0.15
irk
0.15
hen
0.14
รณ
0.13
correct
0.13
monds
0.13
uries
0.13
غن
0.13
marker
0.13
Activations Density 0.040%