Explanations
the word "criticism."
criticisms and negative feedback in texts
mentions of the term "criticism."
Negative Logits
cise      -0.71
pared     -0.67
cop       -0.64
tre       -0.64
ibur      -0.64
frey      -0.63
change    -0.63
seed      -0.63
ramid     -0.63
tein      -0.63
Positive Logits
criticism      1.00
代             0.96
critic         0.93
criticisms     0.86
critics        0.86
Crit           0.83
criticizing    0.79
reviewers      0.78
imaru          0.78
leveled        0.76
Activations Density: 0.021%