Explanations
negative constructions and expressions related to opposition or refusal
Negative Logits
anel     -0.16
ibold    -0.16
ican     -0.15
tep      -0.15
ndon     -0.15
Müş      -0.14
lili     -0.14
áy       -0.14
æk       -0.14
onis     -0.14
Positive Logits
necessarily   0.23
sugar         0.17
rein          0.17
judgment      0.17
ever          0.17
judgement     0.16
conforms      0.15
conform       0.15
Limits        0.15
limit         0.15
Activations Density: 0.196%