INDEX
Explanations
mentions of the word "father" in various contexts
references to the concept of fatherhood
Negative Logits
mble        -0.93
Flavoring   -0.79
ellen       -0.75
psey        -0.73
atility     -0.71
CONT        -0.70
clusions    -0.69
AW          -0.65
Ward        -0.64
odies       -0.63
Positive Logits
hood        1.16
hesis       1.05
hetical     1.00
father      0.99
patriarch   0.94
heses       0.93
parents     0.92
hetically   0.89
hes         0.89
son         0.78
Activations Density 0.035%
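
A density figure like the one above is usually read as the fraction of token positions on which the feature fires. The sketch below shows one way such a number could be computed. It is an assumption about the definition, not the dashboard's own code: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical, and the sample is constructed by hand so that it lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: a position counts as active when its value is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 7 active positions out of 20,000 tokens.
sample = [0.0] * 19_993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

At a density this low the feature is sparse: under this reading it is active on roughly 3 or 4 tokens in every 10,000.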