Explanations
expressions of surprise or emotional reactions
Negative Logits
alth        -0.17
ammen       -0.17
emode       -0.16
stellar     -0.15
ein         -0.15
opoulos     -0.15
eous        -0.15
ÑĨип        -0.15
ernen       -0.15
iveness     -0.14
Positive Logits
sen         0.18
atz         0.15
ridge       0.14
.Sdk        0.14
980         0.14
red         0.14
388         0.14
sm          0.14
ata         0.14
ļĮ          0.13
Activations Density: 0.017%