INDEX
Explanations
words related to actions, experiences, and organizations
business or company descriptions
Negative Logits
Token             Logit
autorytatywna     -0.39
Widerspruch       -0.30
Einfluß           -0.30
Zugang            -0.29
twij              -0.28
ușor              -0.27
bouteilles        -0.26
Voraussetzungen   -0.26
rodillas          -0.26
voorbeeld         -0.26
Positive Logits
Token        Logit
فريبيس       0.76
niſſe        0.69
catering     0.63
<unused14>   0.63
<unused28>   0.63
<unused23>   0.63
<unused41>   0.63
<unused74>   0.63
<pad>        0.63
<unused3>    0.63
Activations Density 0.043%