Explanations
references to family relationships, particularly sons
references to familial relationships, particularly the term "son."
Negative Logits
Token       Logit
uden        -0.75
jriwal      -0.66
urg         -0.65
deterior    -0.65
eers        -0.64
ommod       -0.63
FOR         -0.63
ontent      -0.63
asonable    -0.60
demon       -0.59
Positive Logits
Token       Logit
nets        0.90
heses       0.88
hesis       0.88
hetically   0.82
hood        0.69
wife        0.67
pins        0.66
eatures     0.65
hetical     0.65
Kath        0.62
Activations Density: 0.089%