INDEX
Explanations
phrases related to memories or instructions to remember something
instances of the word "forget."
Negative Logits
Token        Logit
ccording     -0.70
uana         -0.67
ioxide       -0.66
uilt         -0.64
purported    -0.63
Purs         -0.62
inction      -0.62
ivil         -0.62
FO           -0.62
Ale          -0.61
Positive Logits
Token        Logit
fulness      1.10
ful          1.04
forget       1.03
fully        0.95
forgot       0.90
theless      0.84
ãĤ¤ãĥĪ       0.84
forgetting   0.83
bryce        0.81
bolt         0.78
Activations Density 0.009%