Explanations
city-level or state-level directives or policies
Negative Logits
Raz       -0.66
Ban       -0.64
unin      -0.63
OC        -0.63
tum       -0.63
resc      -0.61
uber      -0.61
surgical  -0.61
Trin      -0.60
Pyro      -0.59
Positive Logits
;     1.44
.;    1.38
;"    1.29
;}    1.21
%;    1.15
';    1.08
;     1.06
•     1.06
'';   1.04
;;    1.03
Activations Density 0.040%