INDEX
Explanations
mentions of family relationships and personal experiences
Negative Logits
themselves  -0.72
atum        -0.62
idates      -0.61
ãĤª         -0.56
yourselves  -0.55
pire        -0.52
Qatar       -0.52
ikhail      -0.51
himself     -0.51
Himself     -0.50
Positive Logits
myself      0.72
husband     0.68
colleague   0.63
niece       0.62
friends     0.61
ventures    0.61
friends     0.59
poke        0.59
colleagues  0.58
collection  0.57
Activations Density 15.644%