INDEX
Explanations
words related to criticism or negative feedback
instances of criticism directed towards individuals or entities
Negative Logits
Token         Logit
OTE           -0.69
bourne        -0.69
••••          -0.68
seed          -0.68
nown          -0.65
cre           -0.63
ammy          -0.63
aho           -0.63
ipeg          -0.61
ther          -0.61
Positive Logits
Token         Logit
harshly       0.78
Cosponsors    0.78
critiques     0.72
imaru         0.72
criticisms    0.69
criticized    0.68
criticizing   0.67
comments      0.67
harsh         0.65
criticism     0.65
Activations Density: 0.035%