Explanations
phrases that express subjective experiences and personal reflections
Negative Logits
Token        Logit
plenty       -0.18
indeed       -0.17
tomorrow     -0.17
may          -0.16
might        -0.15
Plenty       -0.15
hopefully    -0.15
alar         -0.15
Indeed       -0.15
already      -0.14
Positive Logits
Token        Logit
somehow      0.41
Somehow      0.27
seem         0.23
sanki        0.22
somew        0.22
seemed       0.21
seems        0.21
inexp        0.21
seeming      0.20
myster       0.20
Activations Density 0.144%
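
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the share of sampled token positions on which the feature's activation is above zero; that definition, the function name `activation_density`, the `threshold` parameter, and the sample values are all illustrative assumptions, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is read here as (positions where the feature fires)
    divided by (all sampled positions). The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 token positions, the feature fires on 3 of them.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.144% would mean the feature is active on roughly 1.4 of every 1000 sampled tokens.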