INDEX
Explanations
negative attributes associated with failure and injustice
Negative Logits
Token        Logit
qui          -0.15
ált          -0.15
somebody     -0.14
(optional    -0.14
iola         -0.14
atcher       -0.14
oda          -0.14
linkplain    -0.14
antar        -0.14
ran          -0.13
Positive Logits
Token        Logit
regardless   0.67
Regardless   0.64
Regardless   0.61
whether      0.54
whether      0.46
ardless      0.44
Whether      0.44
Whether      0.43
WHETHER      0.38
whatever     0.38
Activations Density: 0.085%