INDEX
Explanations
instances of the word "be" with a high importance placed on the number 9
phrases indicating negation or the inability to perform an action
Negative Logits
brace        -0.69
traverse     -0.69
Might        -0.66
srfAttach    -0.66
compose      -0.65
showc        -0.63
nod          -0.62
strous       -0.62
iop          -0.62
defy         -0.62
Positive Logits
anymore      1.14
able         1.09
bothered     1.02
necessarily  0.96
counted      0.93
forgiven     0.93
harmed       0.90
anywhere     0.88
yet          0.85
entirely     0.84
Activations Density 0.102%