INDEX
Explanations
expressions related to achievement or completion
expressions of excitement or intensity related to enjoyment and satisfaction
Negative Logits
foreseen      -0.73
inese         -0.73
emouth        -0.65
inement       -0.64
uese          -0.64
objects       -0.62
working       -0.62
intimate      -0.61
itamin        -0.61
unfamiliar    -0.60
Positive Logits
prest         1.01
vo            0.69
Bye           0.68
Rut           0.68
bye           0.68
opter         0.66
vanquished    0.66
ILA           0.65
prevailed     0.64
Ô             0.64
Activations Density 0.612%
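
The density figure above can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under a stated assumption: that "density" means the fraction of sampled tokens whose activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is defined as (number of firing tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1000 token positions, 6 of which fire.
toy = [0.0] * 994 + [0.3, 1.2, 0.7, 2.1, 0.05, 0.9]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.612% would correspond to the feature firing on roughly 6 of every 1000 sampled tokens.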