INDEX
Explanations
negative qualifiers related to actions or behaviors
Negative Logits
ONTAL    -0.16
etti     -0.15
ova      -0.15
krom     -0.15
ois      -0.15
es       -0.14
ansk     -0.14
lobby    -0.14
onas     -0.14
сен      -0.14
Positive Logits
isnan        0.18
necessarily  0.16
itlement     0.15
riger        0.15
aken         0.14
door         0.14
Doch         0.14
ori          0.14
qui          0.13
će           0.13
Activations Density 0.032%