INDEX
Explanations
statements related to impactful or significant events or incidents
expressions related to being emotionally or physically disturbed
Negative Logits
hib        -0.86
ouf        -0.83
endi       -0.83
ators      -0.72
elaide     -0.69
emis       -0.69
osal       -0.68
á          -0.68
iler       -0.68
missible   -0.67
Positive Logits
DAQ        0.85
rocked     0.77
tremend    0.74
wave       0.71
silence    0.70
rocking    0.70
apart      0.66
DOWN       0.65
waves      0.64
shaken     0.64
Activations Density 0.014%