Explanations
mentions of family members
Negative Logits
Token       Logit
argon       -0.70
umbn        -0.70
Subtle      -0.70
*/(         -0.70
inen        -0.69
psey        -0.69
ane         -0.69
igo         -0.68
irection    -0.67
isexual     -0.66
Positive Logits
Token       Logit
patriarch   1.11
reunion     1.08
members     1.05
members     1.04
resemb      0.92
member      0.91
caregivers  0.91
member      0.90
reun        0.88
ilial       0.87
Activations Density: 0.396%