INDEX
Explanations
adjectives and verbs related to criticism and scrutiny
sentiments related to concern or seriousness regarding various social issues
Negative Logits
zzo      -0.75
erva     -0.70
ivalry   -0.69
oret     -0.68
racuse   -0.67
chan     -0.66
zl       -0.66
CTR      -0.65
LLOW     -0.64
heses    -0.64
Positive Logits
shove              0.72
unintention        0.66
graded             0.64
priority           0.63
jeopardy           0.63
qqa                0.59
Outer              0.58
contrad            0.58
distinctly         0.57
inventoryQuantity  0.57
Activations Density: 0.636%