INDEX
Explanations
numbers and technical terms or jargon related to defense, health, politics, and other complex subjects
references to government actions and policies
Head Attr Weights
0:0.07
1:0.03
2:0.06
3:0.04
4:0.04
5:0.11
6:0.09
7:0.13
8:0.04
9:0.08
10:0.18
11:0.08
Negative Logits
-0.95
).[
-0.95
)].
-0.95
."[
-0.94
.[
-0.92
".[
-0.90
,[
-0.89
theirs
-0.85
"[
-0.85
respectively
-0.80
Positive Logits
yip
1.25
maxwell
1.05
xual
1.04
DragonMagazine
1.00
��
0.98
arte
0.93
quished
0.93
ギ
0.92
ディ
0.90
PDATE
0.88
Activations Density 0.199%