Explanations
phrases related to negative emotions or situations, particularly sadness
expressions of sadness or melancholy
Negative Logits
Token       Logit
ouver       -0.76
vernment    -0.75
authorized  -0.69
UID         -0.66
entials     -0.65
FANT        -0.64
gemony      -0.63
eers        -0.61
helic       -0.60
RAFT        -0.60
Positive Logits
Token       Logit
omas        1.40
der         1.31
istic       1.25
istically   1.22
die         0.87
istical     0.87
hus         0.83
stal        0.82
faced       0.82
Sad         0.82
Activations Density: 0.029%