Explanations
actions and emotional responses within narrative contexts
Negative Logits
Token     Logit
ardy      -0.17
lain      -0.15
vinc      -0.15
ÏĢη       -0.15
candid    -0.15
rahim     -0.14
ÙĨج       -0.14
ved       -0.14
cors      -0.14
canf      -0.14
Positive Logits
Token     Logit
ä¿        0.15
hetto     0.15
èħ¦       0.14
é¾Ħ       0.14
ottle     0.14
okane     0.14
ctime     0.14
ecut      0.13
Voll      0.13
ÚĨار      0.13
Activations Density: 0.175%