INDEX
Explanations
references to family relationships and dynamics
Negative Logits
Token          Logit
icros          -0.15
jiang          -0.14
endi           -0.14
شور            -0.14
grandparents   -0.14
Parents        -0.14
ault           -0.14
uele           -0.14
oldt           -0.14
вед            -0.14
Positive Logits
Token          Logit
son            1.12
sons           0.97
daughter       0.87
son            0.85
Son            0.85
Son            0.81
SON            0.78
daughters      0.75
.son           0.74
Sons           0.73
Activations Density 0.606%
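
As a rough illustration of what a density figure like the one above usually means, here is a minimal sketch. It assumes density is the share of sampled tokens on which the feature's activation exceeds zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires; the dashboard does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 3 of 500 tokens.
sample = [0.0] * 497 + [1.3, 0.2, 4.1]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.606% would mean the feature is active on roughly 6 of every 1,000 sampled tokens.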