Explanations
punctuation and certain grammatical structures in the text
Head Attr Weights
0:0.10
1:0.02
2:0.05
3:0.04
4:0.03
5:0.03
6:0.22
7:0.04
8:0.06
9:0.27
10:0.04
11:0.04
Negative Logits
Carm      -4.23
Mellon    -3.85
gem       -3.80
Lloyd     -3.66
Oliv      -3.56
peach     -3.46
Elliott   -3.37
icker     -3.24
angel     -3.22
kl        -3.21
Positive Logits
Sur       8.57
Sur       6.36
sur       6.21
SUR       5.92
sur       5.51
SAR       5.36
Surf      4.91
surfing   4.84
surf      4.49
Lur       4.02
Activations Density 0.001%