INDEX
Explanations
descriptions of personal experiences and reactions
Negative Logits
itect       -0.74
inals       -0.73
uay         -0.71
paio        -0.64
ival        -0.63
packages    -0.62
uckles      -0.62
enes        -0.62
idation     -0.62
brid        -0.61
Positive Logits
Hebdo        0.77
noon         0.73
someone      0.66
Ͻ            0.63
misfortune   0.63
fame         0.62
someone      0.62
DERR         0.61
wrongdoing   0.61
rumours      0.60
Activations Density: 0.150%