INDEX
Explanations
mentions of familial relationships, particularly siblings
Negative Logits
µ              -0.16
icari          -0.16
_BUF           -0.15
ovah           -0.14
itura          -0.14
granddaughter  -0.14
šk             -0.14
osu            -0.14
_Callback      -0.14
Bayer          -0.13
Positive Logits
brother        0.99
brothers       0.96
sibling        0.85
Brother        0.84
sister         0.84
sisters        0.84
siblings       0.82
Brothers       0.79
bro            0.71
Sister         0.69
Activations Density 0.365%