INDEX
Explanations
references to being pulled over by police in various contexts
Head Attr Weights
0:0.01
1:0.00
2:0.05
3:0.04
4:0.09
5:0.03
6:0.03
7:0.34
8:0.02
9:0.04
10:0.23
11:0.05
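
The head attribution weights above can be summarized with a few lines of Python. This is only an illustrative sketch: the dictionary is transcribed from the listing, and the names `head_weights` and `top_heads` are hypothetical helpers, not part of any dashboard API. It ranks heads by weight and reports what share of the listed total the top two account for.

```python
# Head attribution weights transcribed from the listing (head index -> weight).
head_weights = {
    0: 0.01, 1: 0.00, 2: 0.05, 3: 0.04, 4: 0.09, 5: 0.03,
    6: 0.03, 7: 0.34, 8: 0.02, 9: 0.04, 10: 0.23, 11: 0.05,
}


def top_heads(weights, k=2):
    """Return the k (head, weight) pairs with the largest weight, largest first."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


top = top_heads(head_weights)
total = sum(head_weights.values())            # sum of the listed weights
top_share = sum(w for _, w in top) / total    # fraction carried by the top heads

print(top)                   # [(7, 0.34), (10, 0.23)]
print(round(total, 2))       # 0.93
print(round(top_share, 2))   # 0.61
```

Heads 7 and 10 carry roughly 61% of the listed attribution. The listed weights sum to about 0.93 rather than 1; the listing does not say why.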
Negative Logits
bern     -1.78
inness   -1.78
izons    -1.76
roud     -1.73
yss      -1.72
uble     -1.69
vana     -1.67
brim     -1.65
vor      -1.63
rss      -1.62
Positive Logits
revocation   1.84
roadside     1.78
unethical    1.74
Illegal      1.64
violating    1.63
negligent    1.62
violates     1.60
Spy          1.60
Craigslist   1.59
expired      1.56
Activations Density 0.002%