INDEX
Explanations
words related to criticism or complaints
verbs related to negative actions or criticism
Negative Logits
gloss        -0.69
ortium       -0.62
Bonds        -0.62
auga         -0.61
Nir          -0.60
nown         -0.60
Tsukuyomi    -0.60
glim         -0.60
conspicuous  -0.59
ERSON        -0.59
Positive Logits
restling     0.96
kefeller     0.79
backer       0.79
Against      0.68
igious       0.67
boat         0.67
hester       0.65
Edited       0.65
theless      0.64
apart        0.64
Activations Density: 0.064%