INDEX
Explanations
references to political actions, decisions, and evaluations involving government and legislation
Head Attr Weights
0:0.02
1:0.02
2:0.12
3:0.34
4:0.07
5:0.04
6:0.04
7:0.05
8:0.04
9:0.07
10:0.08
11:0.06
Negative Logits
allegedly    -1.85
reportedly   -1.71
purportedly  -1.70
quished      -1.62
ドラ         -1.56
ostensibly   -1.46
claimed      -1.45
supposedly   -1.45
atars        -1.45
opolis       -1.41
Positive Logits
misunderstanding  1.99
soType            1.82
misconception     1.81
horm              1.78
misunder          1.72
maybe             1.70
),"               1.69
laughs            1.68
miscon            1.68
)"                1.67
Activations Density 0.313%