INDEX
Explanations
references to police or law enforcement officers
references to the term "Cop" in various contexts
Negative Logits
anwhile     -0.76
Downloadha  -0.75
WAYS        -0.73
Ń·          -0.72
WAY         -0.66
pport       -0.66
itably      -0.64
AAAA        -0.64
AAAAAAAA    -0.64
çĦ          -0.64
Positive Logits
yrights     1.21
eland       1.03
tic         1.00
rodu        0.94
Cop         0.92
ulative     0.87
yright      0.86
Cop         0.83
aque        0.82
ilon        0.81
Activations Density: 0.007%