INDEX
Explanations
blame or accusations being made against someone or some entity
instances of accusations against individuals or organizations
Negative Logits
kos         -0.69
~/          -0.68
ominated    -0.64
die         -0.64
pleted      -0.63
survives    -0.63
»Ĵ          -0.61
gat         -0.61
LIFE        -0.61
daq         -0.61
Positive Logits
unfairly           0.99
inappropriately    0.90
hypocritical       0.81
misleading         0.80
insensitive        0.79
unnecessarily      0.78
prejud             0.77
biased             0.77
irresponsible      0.77
dishonest          0.77
Activations Density 0.515%
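
The page does not define the density figure. It is presumably the fraction of sampled token positions on which the feature fires. A minimal sketch under that assumption follows; the function name `activation_density`, the zero threshold, and the sample counts are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active; the source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 200,000 token positions, 1,030 of which fire.
sample = [1.0] * 1030 + [0.0] * (200_000 - 1030)
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is sparse and fires on roughly one token in two hundred, which fits a feature tied to a narrow theme such as accusations or blame.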