INDEX
Explanations
proper nouns related to individuals in a political context
Head Attr Weights
0:0.09
1:0.06
2:0.09
3:0.09
4:0.07
5:0.08
6:0.07
7:0.07
8:0.09
9:0.07
10:0.09
11:0.08
Negative Logits
tradem                 -1.76
rings                  -1.73
emia                   -1.72
etry                   -1.63
belonged               -1.61
ipment                 -1.60
iture                  -1.58
orable                 -1.57
interacted             -1.56
theaters               -1.54
Positive Logits
Surviv                  1.90
lda                     1.76
usterity                1.73
hindsight               1.68
GOODMAN                 1.68
upbeat                  1.66
hander                  1.63
--------------------    1.58
fuss                    1.57
dearly                  1.57
Activations Density 0.000%