INDEX
Explanations
phrases indicating a lack of something or a negative state
Negative Logits
ansom           -0.16
ovich           -0.15
ên              -0.14
Äļ              -0.14
finity          -0.14
anny            -0.14
Opport          -0.13
graduate        -0.13
Opportunities   -0.13
ื               -0.13
Positive Logits
longer          0.31
accident        0.26
different       0.25
secret          0.25
doubt           0.25
Longer          0.23
Buen            0.23
wonder          0.23
match           0.22
laughing        0.22
Activations Density: 0.017%