INDEX
Explanations
possessive pronouns for royalty
Negative Logits
Token        Logit
hang         0.60
ivory        0.57
rologist     0.56
loem         0.56
onions       0.56
indrical     0.55
dau          0.54
imbledon     0.53
фигу         0.53
ണ്ടു         0.53
Positive Logits
Token        Logit
His          3.59
His          3.35
Her          3.12
Her          2.92
Его          2.24
HIS          2.09
Zijn         2.06
HIS          2.00
Himself      1.94
HER          1.93
Activations Density 0.019%