Explanations
instances of family relationships, specifically the word "cousin"
mentions of familial relationships, specifically focusing on cousins
Negative Logits
inth       -0.79
hner       -0.75
inem       -0.71
mberg      -0.69
gravity    -0.68
mble       -0.67
ERG        -0.66
pter       -0.65
manship    -0.64
overe      -0.64
Positive Logits
nephew     0.88
cousins    0.87
uncle      0.86
cousin     0.85
aunt       0.84
niece      0.82
dolls      0.74
hood       0.73
relatives  0.71
hesis      0.71
Activations Density: 0.021%