INDEX
Explanations
references to physical actions or states of being
references to crime and legal consequences
Negative Logits
assorted       -0.51
hopefully      -0.49
asionally      -0.47
doubtless      -0.47
unforgettable  -0.47
undoubtedly    -0.47
itton          -0.46
asso           -0.45
atari          -0.45
catentry       -0.44
Positive Logits
anymore     1.73
nor         1.53
whatsoever  1.15
anywhere    1.07
anything    1.05
any         1.01
slightest   1.01
anybody     0.99
yet         0.90
nor         0.89
Activations Density: 1.707%