INDEX
Explanations
interactions and relationships among individuals
Negative Logits
Girlfriend      -0.15
комплек         -0.14
granddaughter   -0.14
lover           -0.14
aget            -0.14
Glover          -0.14
avec            -0.14
enco            -0.14
ewolf           -0.14
FirstChild      -0.14
Positive Logits
guys            0.56
boys            0.55
fell            0.55
gu              0.50
Guys            0.43
fellows         0.41
boys            0.41
l               0.41
Boys            0.41
Gu              0.39
Activations Density: 0.201%