Explanations
topics related to political figures and their actions
Head Attr Weights
0:0.02
1:0.02
2:0.21
3:0.17
4:0.06
5:0.05
6:0.05
7:0.04
8:0.07
9:0.10
10:0.09
11:0.07
Negative Logits
athlet        -1.51
dwar          -1.30
debilitating  -1.29
permitting    -1.28
culminating   -1.28
ukemia        -1.25
sponsoring    -1.25
nefarious     -1.24
owship        -1.22
weaving       -1.21
Positive Logits
inventoryQuantity  2.18
エル               1.60
?]                 1.48
の�                1.48
将                 1.44
ヘラ               1.42
ーク               1.41
Adds               1.39
ーン               1.39
=(                 1.38
Activations Density 0.016%