INDEX
Explanations
topics related to social or economic issues
Head Attr Weights
0:0.21
1:0.03
2:0.10
3:0.03
4:0.10
5:0.07
6:0.04
7:0.05
8:0.20
9:0.03
10:0.04
11:0.07
Negative Logits
hesis       -1.41
686         -1.33
acion       -1.31
":["        -1.31
Malays      -1.27
Liberation  -1.25
arten       -1.25
Nobel       -1.25
Assembly    -1.24
Bian        -1.22
Positive Logits
whichever   1.71
UCHIJ       1.57
overrun     1.44
defy        1.37
into        1.36
arin        1.35
Unloaded    1.34
oward       1.33
cens        1.31
Suddenly    1.31
Activations Density 0.022%