INDEX
Explanations
negative sentiments and expressions of disappointment regarding situations and events
Negative Logits
unkt     -0.15
infra    -0.15
sta      -0.14
ardon    -0.14
turb     -0.13
mole     -0.13
632      -0.13
âĵĺ      -0.13
argin    -0.13
926      -0.13
Positive Logits
ìĥĿ      0.15
bare     0.15
bare     0.15
nop      0.15
wald     0.15
é»       0.15
Nuggets  0.15
ÑĢез     0.15
aise     0.14
ecycle   0.14
Activations Density 0.280%