INDEX
Explanations
elements related to decision-making, influence, and agency in various contexts
Negative Logits
Ƚ              -0.65
gynhyrchwyd    -0.62
PreferredItem  -0.61
✭✭             -0.61
препратки      -0.60
✭✭             -0.59
Referencies    -0.58
crdi           -0.55
лтамалар       -0.55
món            -0.55
Positive Logits
she      1.05
her      1.02
oneself  0.94
their    0.91
She      0.90
Their    0.90
Their    0.86
she      0.85
their    0.84
shes     0.83
Activations Density 0.541%