Explanations
phrases related to family members, especially parental figures
references to family relationships and roles
Negative Logits
ways       -0.80
ser        -0.76
COL        -0.74
Edited     -0.74
sed        -0.74
ivities    -0.73
sports     -0.71
••••       -0.70
susp       -0.68
binding    -0.68
Positive Logits
apa        1.13
Papa       1.12
Mama       1.04
ciating    0.84
uppet      0.83
omo        0.82
emonic     0.79
arella     0.77
Romeo      0.75
Panda      0.74
Activations Density 0.013%
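
To put the density figure in context, here is a minimal sketch of how such a number could be computed. It assumes that "activation density" means the fraction of token positions on which the feature's activation is above zero; the page does not state its definition, so that reading, the function name activation_density, the threshold parameter, and the synthetic data are all illustrative assumptions and not the dashboard's actual method.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example (not real data): 13 active positions out of 100,000 tokens.
acts = [0.0] * 100_000
for i in range(13):
    acts[i * 7000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.013% corresponds to the feature firing on roughly 13 of every 100,000 tokens.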