INDEX
Explanations
instances of contrast or opposing statements within the text
Head Attr Weights
0:0.02
1:0.01
2:0.18
3:0.09
4:0.13
5:0.02
6:0.03
7:0.11
8:0.07
9:0.04
10:0.10
11:0.14
Negative Logits
hovah         -1.45
Impro         -1.40
hei           -1.38
successfully  -1.29
lessly        -1.25
vered         -1.22
exclusive     -1.22
Ashes         -1.20
fired         -1.18
cum           -1.18
Positive Logits
ircraft       1.33
Lank          1.30
ichita        1.29
Zup           1.28
debian        1.24
Jav           1.24
Tiff          1.23
Cort          1.21
Flavoring     1.21
elsh          1.20
Activations Density 0.010%