INDEX
Explanations
mentions of political and governmental policies
references to governmental policies
Negative Logits
CLASSIFIED    -0.76
lihood        -0.75
ITNESS        -0.75
Flavoring     -0.70
Vel           -0.70
issan         -0.69
parts         -0.68
ISH           -0.68
athan         -0.68
brother       -0.67
Positive Logits
enacted       1.02
governing     0.93
implemented   0.88
affecting     0.87
imposed       0.85
restricting   0.83
instituted    0.83
policies      0.83
regulating    0.81
hops          0.80
Activations Density: 0.040%