INDEX
Explanations
words referring to family members or familial concepts, possibly across different languages
Negative Logits
CharField    -0.54
<eos>        -0.49
insider      -0.49
Goodwin      -0.48
NextPage     -0.47
IndexPath    -0.45
Goldstein    -0.44
Kruse        -0.42
gj           -0.42
moderna      -0.42
Positive Logits
fathers      1.70
father       1.70
dads         1.53
father       1.52
parents      1.51
Fathers      1.49
Parents      1.47
Father       1.47
Father       1.47
parental     1.45
Activations Density 0.546%