INDEX
Explanations
instances of the word "isn't" and its variations, indicating negation or contradiction
Negative Logits
Token     Logit
زاÙĨ    -0.14
spared    -0.14
lys       -0.14
wid       -0.14
ownik     -0.13
znam      -0.13
è͵       -0.13
Wid       -0.13
âľĶ       -0.13
ãĢIJ       -0.13
Positive Logits
Token     Logit
oad       0.16
DISCLAIM  0.16
Labels    0.16
aaaaaaaa  0.16
keh       0.15
âĨIJ       0.15
acci      0.15
igin      0.14
oha       0.14
undry     0.14
Activations Density: 0.137%