Negative Logits
Token           Logit
hale            -0.08
standardized    -0.08
attributable    -0.07
structured      -0.07
beri            -0.07
(attrs          -0.07
accent          -0.07
Exclude         -0.07
.offset         -0.07
standards       -0.07
Positive Logits
Token           Logit
gekozen         0.13
(choice         0.12
escolhas        0.12
последствия     0.12
Outcomes        0.12
outcomes        0.12
Choices         0.12
_choice         0.12
conséquences    0.12
gewählt         0.11
Activations Density: 0.022%