INDEX
Explanations
words related to negative outcomes or situations
Negative Logits
jednoc        -0.76
PhysRevD      -0.73
connexes      -0.72
rijke         -0.72
IUrlHelper    -0.70
GOTREF        -0.70
Laing         -0.68
Bourgoin      -0.68
ίκη           -0.66
cupertino     -0.66
Positive Logits
worse         1.29
Worse         1.22
Worse         1.17
worst         1.14
Worst         1.07
worse         1.07
bad           1.05
BAD           1.05
Bad           1.02
Worst         1.02
Activations Density 0.147%