Explanations
adjectives to describe negative or controversial situations
negative descriptors and terms related to transparency and accountability
Negative Logits
Token    Logit
«        -0.67
.).      -0.67
�        -0.67
.):      -0.66
.),      -0.64
ãĢĮ      -0.63
arnaev   -0.62
.)       -0.60
essage   -0.59
Phys     -0.58
Positive Logits
Token    Logit
"        2.24
"?       1.92
",       1.86
"!       1.81
"-       1.79
"...     1.76
"—       1.75
":       1.68
".       1.67
"â̦      1.67
Activations Density: 0.622%