INDEX
Explanations
mentions of individuals' backgrounds and education histories
Negative Logits
Token       Logit
HER         -0.20
روه         -0.17
.mvc        -0.15
丈夫        -0.14
herb        -0.14
Herb        -0.14
esty        -0.14
izzato      -0.13
ongyang     -0.13
/|          -0.13
Positive Logits
Token       Logit
she         1.16
she         0.93
она         0.80
She         0.75
她          0.75
She         0.74
ella        0.67
вона        0.66
그녀는      0.59
SHE         0.58
Activations Density 1.119%