Explanations
sections of text marked by slashes
Head Attr Weights
Head  Weight
0     0.14
1     0.14
2     0.06
3     0.07
4     0.05
5     0.08
6     0.06
7     0.04
8     0.12
9     0.07
10    0.06
11    0.04
Negative Logits
Token    Logit
Sally    -1.74
Sno      -1.70
Chong    -1.70
Oprah    -1.65
Bigfoot  -1.65
Playboy  -1.65
Pepsi    -1.60
Frankie  -1.60
Ray      -1.56
fr       -1.53
Positive Logits
Token    Logit
esm      2.38
arten    2.19
thora    2.19
ommel    2.11
lement   2.02
erker    2.01
erd      2.00
alion    1.96
�        1.94
adian    1.93
Activations Density 0.000%