INDEX
Explanations
references to familial relationships and connections
Negative Logits
Token       Logit
veyard      -0.76
misunder    -0.74
psey        -0.69
phis        -0.66
Rated       -0.66
iculty      -0.65
vernment    -0.64
anke        -0.64
sighting    -0.63
uve         -0.60
Positive Logits
Token       Logit
Ivanka      0.78
nets        0.77
hood        0.76
eldest      0.76
quel        0.75
Rahul       0.73
Valerie     0.73
Mia         0.73
Imran       0.73
Salman      0.70
Activations Density: 0.045%