Explanations
phrases related to criticism or negative attention towards a person or entity
phrases related to criticism or controversy surrounding individuals or entities
Negative Logits
Token         Logit
rave          -0.71
Nope          -0.66
olute         -0.63
mu            -0.62
Norn          -0.60
hibition      -0.59
cures         -0.58
vantage       -0.57
Yep           -0.57
eworks        -0.55
Positive Logits
Token              Logit
lately             0.90
recently           0.82
for                0.81
stemming           0.78
domestically       0.77
nationally         0.74
internationally    0.73
levied             0.72
due                0.72
owing              0.72
Activations Density: 0.168%