INDEX
Explanations
topics related to government policies and laws affecting society
Head Attr Weights
0:0.17
1:0.01
2:0.13
3:0.08
4:0.07
5:0.04
6:0.04
7:0.02
8:0.09
9:0.07
10:0.11
11:0.11
Negative Logits
mediately    -1.52
ogle         -1.47
approx       -1.43
ipeg         -1.37
requently    -1.34
undred       -1.30
roximately   -1.28
trough       -1.28
itely        -1.28
ociated      -1.27
Positive Logits
etc           2.21
etc           1.69
politics      1.49
iferation     1.47
blah          1.46
olitics       1.44
Achievement   1.43
anship        1.39
opathy        1.37
�             1.36
Activations Density 0.165%