INDEX
Explanations
names or terms related to character identity or existence
Head Attr Weights
0:0.10
1:0.02
2:0.28
3:0.09
4:0.14
5:0.06
6:0.02
7:0.02
8:0.06
9:0.09
10:0.05
11:0.02
Negative Logits
etheless    -1.43
anwhile     -1.33
sylv        -1.20
eric        -1.19
outp        -1.19
yrinth      -1.17
thora       -1.14
ollywood    -1.14
illum       -1.12
epad        -1.12
Positive Logits
interstitial    1.40
lein            1.38
akis            1.32
gaard           1.30
inger           1.29
Redditor        1.23
cki             1.22
nails           1.21
Bender          1.20
fold            1.20
Activations Density 0.001%