Explanations
phrases related to authority and decision-making conflicts
Negative Logits
haraan    -0.64
icity     -0.53
simum     -0.51
[{        -0.51
INUM      -0.51
ωση       -0.51
άνι       -0.51
PORARY    -0.50
hnia      -0.50
fallu     -0.50
Positive Logits
him       2.07
he        1.67
his       1.49
彼は      1.36
hänen     1.30
ему       1.29
she       1.28
ihm       1.27
her       1.25
egli      1.23
Activations Density: 1.972%