INDEX
Explanations
References to individuals being arrested
Instances of the word "arrested" in various contexts
Negative Logits
enthus        -0.77
vironment     -0.75
emonic        -0.73
morrow        -0.70
clerosis      -0.68
Remastered    -0.68
lag           -0.67
WE            -0.65
urgical       -0.65
preference    -0.64
Positive Logits
arrested      0.99
arrest        0.93
arrests       0.85
apprehended   0.83
wiret         0.76
nab           0.75
handcuffed    0.74
detained      0.72
Arrest        0.71
decriminal    0.69
Activations Density 0.018%