INDEX
Explanations
words related to negative connotations or consequences
instances of negative sentiment or connotation
Negative Logits
conservancy  -0.84
dropping     -0.82
hower        -0.80
DOM          -0.80
abiding      -0.80
adr          -0.79
iterator     -0.79
arers        -0.79
raltar       -0.78
cffff        -0.77
Positive Logits
reinforcement  1.08
impact         0.94
spiral         0.92
publicity      0.92
consequences   0.88
feedback       0.88
effects        0.87
gearing        0.87
consequence    0.86
karma          0.86
Activations Density: 0.025%