Explanations
phrases related to familial relationships and community connections
Negative Logits
oldt         -0.19
.Debugger    -0.17
uper         -0.17
ieur         -0.16
iores        -0.16
endi         -0.15
Slater       -0.15
ault         -0.15
aunt         -0.15
ution        -0.14
Positive Logits
son          0.55
sons         0.46
daughter     0.44
son          0.40
Son          0.39
grandson     0.38
daughters    0.38
Son          0.37
.son         0.35
Sons         0.33
Activations Density: 0.172%