INDEX
Explanations
the presence of parentheses in the text
Head Attr Weights
0:0.07
1:0.03
2:0.10
3:0.08
4:0.06
5:0.07
6:0.10
7:0.04
8:0.08
9:0.26
10:0.03
11:0.04
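
The listing above pairs each attention head index with its attribution weight. As a minimal sketch of how it could be read programmatically, the snippet below parses the `head:weight` lines and reports the head with the largest weight. The function name `parse_weights` and the variable names are hypothetical, chosen for illustration; they are not part of any dashboard API.

```python
# Head attribution weights copied verbatim from the listing above
# (format: head_index:weight, one per line).
RAW_WEIGHTS = """0:0.07
1:0.03
2:0.10
3:0.08
4:0.06
5:0.07
6:0.10
7:0.04
8:0.08
9:0.26
10:0.03
11:0.04"""


def parse_weights(text):
    """Parse 'index:weight' lines into a dict of int -> float (hypothetical helper)."""
    weights = {}
    for line in text.splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_weights(RAW_WEIGHTS)

# Head with the largest attribution weight.
top_head = max(weights, key=weights.get)

# Sum of the displayed weights; these are rounded to two decimals,
# so the total is not expected to be exactly 1.
total = sum(weights.values())

print(f"top head: {top_head} (weight {weights[top_head]:.2f})")
print(f"sum of displayed weights: {total:.2f}")
```

On these values the largest weight belongs to head 9 (0.26), more than twice that of the next heads (2 and 6, at 0.10 each). The displayed weights sum to about 0.96 rather than 1, which is presumably an effect of rounding to two decimals.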
Negative Logits
rash       -3.23
dt         -2.96
Ry         -2.91
crates     -2.88
rack       -2.88
salts      -2.86
Volt       -2.83
Ry         -2.75
aurus      -2.73
frequency  -2.71
Positive Logits
Perez      3.21
Atkinson   3.20
ogi        3.16
Hamm       3.02
iman       3.00
én         2.98
iframe     2.98
Pan        2.94
emb        2.90
hanging    2.88
Activations Density 0.000%