INDEX
Explanations
phrases related to criminal activities or legal matters
references to actions involving handling or transferring items
Negative Logits
Token     Logit
_.        -0.86
etc       -0.74
?".       -0.69
.*        -0.69
*.        -0.68
+.        -0.66
/,        -0.65
%.        -0.64
.?        -0.64
itself    -0.63
Positive Logits
Token              Logit
applause           0.83
stret              0.74
congratulations    0.74
composure          0.69
unbeaten           0.67
fumble             0.66
touchdown          0.66
arra               0.64
stitches           0.64
cheers             0.63
Activations Density: 0.748%