INDEX
Explanations
the letter 's' in different contexts within the text
Head Attr Weights
0:0.03
1:0.09
2:0.17
3:0.13
4:0.01
5:0.03
6:0.11
7:0.10
8:0.10
9:0.08
10:0.06
11:0.04
Negative Logits
Shock:-1.16
claw:-1.14
Edit:-1.12
\/\/:-1.11
Cascade:-1.10
enlarge:-1.10
cation:-1.07
woods:-1.07
fuck:-1.05
attracts:-1.05
Positive Logits
externalActionCode:1.30
ochond:1.18
ighed:1.17
llah:1.16
inki:1.14
Faul:1.12
Parables:1.12
arate:1.12
hett:1.10
iesel:1.09
Activations Density 0.019%