INDEX
Explanations
references to family and personal background
Negative Logits
年轻人        -0.46
élèves        -0.44
Students      -0.43
leerlingen    -0.43
protégé       -0.42
弟子          -0.42
新人          -0.42
students      -0.40
jóvenes       -0.40
rookies       -0.39
Positive Logits
parents       3.04
parent        2.88
father        2.78
mother        2.71
dad           2.63
mom           2.59
Parents       2.51
fathers       2.49
parents       2.45
parent        2.44
Activations Density 0.645%