Explanations
themes of self-reflection and humility in personal identity
Negative Logits
Token    Logit
↵↵       -0.18
ç´       -0.17
ÑģеÑĢ     -0.17
andum    -0.17
emoc     -0.16
etus     -0.15
ugin     -0.14
åī£      -0.14
zer      -0.14
hape     -0.14
Positive Logits
Token     Logit
ego       0.36
self      0.35
Self      0.32
narciss   0.32
eg        0.31
ego       0.27
humility  0.27
self      0.26
Self      0.25
-self     0.25
Activations Density 0.194%