INDEX
Explanations
references to familial relationships and domestic life
Negative Logits
Token     Logit
inki      -0.18
eos       -0.16
/epl      -0.15
amenti    -0.15
eci       -0.15
decom     -0.14
otti      -0.14
Kov       -0.14
ãģįãģª    -0.14
iesta     -0.14
Positive Logits
Token      Logit
Heath      0.29
Bran       0.25
Hind       0.23
Rochester  0.22
Yorkshire  0.22
Cathy      0.20
Lock       0.20
Charlotte  0.20
Emily      0.19
Jane       0.18
Activations Density: 0.006%