INDEX
Explanations
phrases related to law enforcement officers
mentions of the word "cop"
Negative Logits
sbm           -1.01
FORE          -0.88
schild        -0.76
Downloadha    -0.76
veyard        -0.71
*/(           -0.70
xual          -0.67
hower         -0.65
WAY           -0.65
pport         -0.64
Positive Logits
yrights       1.27
rodu          1.01
yright        0.99
ious          0.92
enhagen       0.91
yp            0.91
cop           0.89
icker         0.85
eland         0.79
rol           0.79
Activations Density: 0.003%