INDEX
Explanations
past-tense action verbs
Negative Logits
currently    -0.75
described    -0.66
presently    -0.64
enei         -0.63
mania        -0.63
Bah          -0.62
arel         -0.61
currently    -0.61
continued    -0.60
linked       -0.59
Positive Logits
beforehand   0.76
イト         0.76
LAST         0.74
yesterday    0.70
Doct         0.67
wrong        0.67
BEFORE       0.67
nesday       0.66
last         0.66
wolf         0.62
Activations Density 0.361%