Explanations
concepts related to collective memory and progress in historical narratives
Negative Logits
fake        -0.15
à¹Ĥà¸ķ      -0.15
iene        -0.15
tostring    -0.14
fake        -0.14
aug         -0.14
su          -0.14
zung        -0.14
pty         -0.13
illusion    -0.13
Positive Logits
sub             0.53
uncon           0.43
subconscious    0.41
unconscious     0.40
Sub             0.35
sub             0.35
.sub            0.31
Sub             0.30
_sub            0.30
(sub            0.28
Activations Density: 0.482%