INDEX
Explanations
mentions of sons at different ages in various contexts
mentions of the word "son."
Negative Logits
Token       Logit
veyard      -0.85
iculty      -0.73
orney       -0.68
SOURCE      -0.65
irrel       -0.61
EFF         -0.61
ords        -0.59
Journal     -0.59
itures      -0.59
kefeller    -0.58
Positive Logits
Token        Logit
hesis        1.00
hood         0.98
son          0.90
Gohan        0.89
hetically    0.87
nets         0.84
ogram        0.84
pins         0.82
heses        0.81
mares        0.79
Activations Density: 0.017%