INDEX
Explanations
instances where something is not happening or not being done
negation or denial statements
Negative Logits
tein          -0.78
unks          -0.69
aley          -0.66
iesel         -0.66
hiba          -0.64
Line          -0.64
ogi           -0.63
company       -0.62
acker         -0.62
ãĤ·ãĥ£        -0.62
Positive Logits
necessarily   1.54
anymore       1.02
etheless      1.02
icably        1.00
withstanding  0.97
icable        0.96
enough        0.96
eworthy       0.93
yet           0.87
epad          0.86
Activations Density: 0.062%