Explanations
terms related to criticism and critiques of actions or policies
Negative Logits
Token                   Logit
(token not captured)    -0.75
P                       -0.63
fun                     -0.63
C                       -0.61
V                       -0.60
S                       -0.60
A                       -0.59
cu                      -0.57
up                      -0.57
I                       -0.56
Positive Logits
Token          Logit
criticism      1.91
Criticism      1.88
criticise      1.85
criticize      1.79
criticisms     1.77
criticised     1.77
criticized     1.75
criticizing    1.70
Criticism      1.69
critici        1.67
Activations Density 0.137%
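
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature's activation is non-zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 14 of which activate the feature.
sample = [0.0] * 9986 + [1.5] * 14
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.137% would mean the feature is active on roughly 1 token in 730.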