INDEX
Explanations
negative sentiment or criticism in context
NEGATIVE LOGITS
Token      Logit
996        -0.17
533        -0.17
534        -0.16
YTE        -0.15
573        -0.15
abo        -0.15
673        -0.15
Philips    -0.14
vers       -0.14
ason       -0.14
POSITIVE LOGITS
Token      Logit
oner       0.15
Pf         0.14
Royal      0.14
efon       0.14
ADOR       0.13
Lal        0.13
なる       0.13
guilty     0.13
ated       0.13
Doctrine   0.13
Activations Density 0.021%
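
To put the density figure in perspective: assuming "activations density" means the fraction of tokens on which the feature activates above zero (the dashboard does not state its definition), 0.021% corresponds to roughly 21 active tokens per 100,000. A minimal sketch under that assumption follows; the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 21 active tokens out of 100,000.
sample = [1.0] * 21 + [0.0] * (100_000 - 21)
density = activation_density(sample)
print(f"{density:.3%}")
```

A feature this sparse fires on very few tokens, so its top logits are best read alongside its top-activating examples rather than on their own.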