Explanations
mentions of family members or familial relationships
mentions of "family."
Negative Logits
Token          Logit
generated      -0.75
Gamer          -0.68
Downloadha     -0.66
NetMessage     -0.66
hematically    -0.62
posted         -0.62
Result         -0.61
Kick           -0.61
wei            -0.61
WITHOUT        -0.61
Positive Logits
Token          Logit
family         3.50
family         2.78
Family         2.76
Family         2.56
families       2.41
relatives      2.03
familial       1.85
Families       1.79
household      1.77
siblings       1.72
Activations Density: 0.034%