Explanations
emotional expressions and reactions in characters
Negative Logits
amen     -0.16
uede     -0.15
itas     -0.15
jed      -0.14
pos      -0.14
æĤ       -0.14
igi      -0.14
pos      -0.14
beck     -0.14
est      -0.13
Positive Logits
вещ      0.15
oppel    0.15
'field   0.14
мо       0.14
ーニ     0.14
omite    0.14
rowsers  0.14
aille    0.13
gains    0.13
Roose    0.13
Activations Density 0.318%
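The density figure above is a percentage. As a minimal sketch, assuming it means the share of sampled token positions on which the feature's activation is nonzero, it could be computed as shown here. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of token positions where the
    feature fires at all, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 1,000 token positions, 3 of them active.
sample = [0.0] * 997 + [0.8, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3f}%")
```

A density well below 1%, as reported here, would indicate a sparse feature that fires on only a small fraction of tokens.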