INDEX
NEGATIVE LOGITS
ruary       -0.62
asus        -0.59
ushima      -0.57
bda         -0.56
unts        -0.55
UNCH        -0.55
ÅŁ          -0.54
ffee        -0.54
steen       -0.54
asper       -0.53
POSITIVE LOGITS
worldly      1.17
wise         0.90
itarian      0.87
ities        0.70
soever       0.62
kin          0.61
swer         0.61
yne          0.60
Languages    0.60
mis          0.59
Activations Density 0.492%