Explanations
phrases related to criminal activity, investigation, and law enforcement
Negative Logits
leon    -0.85
ando    -0.85
mone    -0.84
arde    -0.80
Thom    -0.79
eral    -0.77
unal    -0.75
opsy    -0.75
andr    -0.74
xon     -0.74
Positive Logits
prevention      0.75
hots            0.75
concentration   0.71
activity        0.68
stalking        0.68
enthusiasts     0.67
susceptibility  0.66
density         0.66
detection       0.65
skeptics        0.65
Activations Density 10.926%