INDEX
Explanations
key phrases related to significant concepts and themes
Head Attr Weights
0:0.04
1:0.02
2:0.09
3:0.13
4:0.25
5:0.02
6:0.17
7:0.04
8:0.05
9:0.03
10:0.05
11:0.05
Negative Logits
ctors        -1.67
ophers       -1.40
trolls       -1.38
efficients   -1.37
guards       -1.29
ukemia       -1.27
iatrics      -1.25
appers       -1.25
gdala        -1.24
chers        -1.23
Positive Logits
ディ          1.51
��           1.50
��           1.46
ゼ           1.37
marked       1.34
intended     1.31
��           1.30
lasting      1.30
ヴァ          1.27
Present      1.26
Activations Density 0.074%