INDEX
Explanations
relationships between characters, entities, and events
Head Attr Weights
0:0.02
1:0.03
2:0.22
3:0.28
4:0.07
5:0.05
6:0.03
7:0.03
8:0.04
9:0.08
10:0.07
11:0.04
Negative Logits
."": -1.68
respective: -1.51
iate: -1.46
acebook: -1.40
cients: -1.40
estine: -1.39
iations: -1.39
bearer: -1.36
raint: -1.36
anat: -1.35
Positive Logits
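[undecodable bytes]: 2.16
[undecodable bytes]極: 2.00
[undecodable bytes]: 1.91
[undecodable bytes]: 1.63
ァ: 1.58
zinski: 1.52
[undecodable bytes]: 1.46
[undecodable bytes]: 1.45
rique: 1.43
ガ: 1.41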
Activations Density 0.004%