Explanations
references to the concept of "soul."
Negative Logits
oons      -0.17
fsp       -0.15
ieties    -0.15
etro      -0.14
ungeon    -0.14
aldi      -0.14
olson     -0.14
antino    -0.14
rone      -0.14
oint      -0.14
Positive Logits
stice     0.24
ution     0.21
utions    0.21
mate      0.21
ful       0.21
UTION     0.20
fulness   0.18
ard       0.18
mates     0.18
FUL       0.18
Activations Density: 0.012%