INDEX
Explanations
varying levels of emotional responses and punctuation in the text
Negative Logits
925      -0.16
íħ       -0.15
nel      -0.14
719      -0.14
730      -0.14
елÑı     -0.14
601      -0.14
ux       -0.14
Levin    -0.14
anny     -0.14
Positive Logits
illac      0.15
ebi        0.15
arma       0.14
orado      0.14
dings      0.14
verture    0.14
BASH       0.13
ritch      0.13
echn       0.13
hardt      0.13
Activations Density: 0.046%