INDEX
Explanations
expressions of personal feelings or states of being
Negative Logits
never     -0.16
sort      -0.15
etten     -0.15
abin      -0.15
leg       -0.14
aram      -0.14
hope      -0.14
áhl       -0.14
ozilla    -0.14
くる      -0.14
Positive Logits
currently    0.29
currently    0.28
aktu         0.26
presently    0.25
Currently    0.25
tonight      0.24
Currently    0.24
current      0.24
当前         0.24
현재         0.23
Activations Density 0.202%