Explanations
references to family dynamics and relationships
Negative Logits
Token        Logit
Freund       -0.19
Friend       -0.19
buddy        -0.18
Friend       -0.17
Girlfriend   -0.16
friend       -0.16
_friend      -0.16
.friend      -0.15
buddies      -0.15
соот         -0.15
Positive Logits
Token        Logit
family       0.40
family       0.32
Family       0.31
-family      0.29
household    0.28
Family       0.27
FAMILY       0.27
家族         0.26
.family      0.25
_family      0.24
Activations Density 0.292%