INDEX
Explanations
scientific references and DOIs
references to academic articles or publications
Head Attr Weights
0:0.09
1:0.03
2:0.12
3:0.13
4:0.10
5:0.07
6:0.07
7:0.03
8:0.12
9:0.08
10:0.07
11:0.03
Negative Logits
ographers:-1.36
tein:-1.33
alde:-1.30
Liber:-1.26
Lies:-1.26
Scores:-1.24
paio:-1.22
hower:-1.21
Editorial:-1.21
Literary:-1.21
Positive Logits
env:1.39
gam:1.33
bring:1.17
ffect:1.16
pint:1.15
bd:1.15
7601:1.14
headset:1.14
upt:1.14
bro:1.13
Activations Density: 0.001%