INDEX
Explanations
occurrences of the word "forget" and its variations
Negative Logits
mage      -0.14
anova     -0.14
549       -0.14
Ậ         -0.14
oog       -0.14
leness    -0.14
rup       -0.14
Bare      -0.14
kem       -0.14
wright    -0.14
Positive Logits
fulness   0.17
amus      0.15
:async    0.15
bid       0.14
ting      0.14
obil      0.14
tings     0.14
pit       0.14
/not      0.14
/set      0.13
Activations Density: 0.017%