INDEX
Explanations
phrases indicating persistence or reminders
Negative Logits
Token      Logit
gest       -0.17
SRC        -0.16
IG         -0.15
synchron   -0.14
_nl        -0.14
ayer       -0.14
bor        -0.14
pch        -0.14
bob        -0.13
рина       -0.13
Positive Logits
Token      Logit
-scroll    0.15
remen      0.15
dale       0.14
_scal      0.14
remember   0.14
reading    0.13
thy        0.13
anes       0.13
ollect     0.13
цеп        0.13
Activations Density: 0.025%