Explanations
terms related to family relationships, especially uncles and aunts
references to familial relationships, specifically uncles and aunts
Negative Logits
Token        Logit
mberg        -0.82
isers        -0.76
mith         -0.64
âĹ¼          -0.64
isation      -0.63
rd           -0.62
izers        -0.62
collapse     -0.61
istically    -0.61
Effect       -0.61
Positive Logits
Token        Logit
aned         0.93
Vernon       0.92
liest        0.92
uncle        0.89
Tup          0.89
heses        0.81
athy         0.80
nephew       0.78
lies         0.77
anny         0.77
Activations Density 0.040%