INDEX
Explanations
references to familial relationships, particularly those involving sisters
Negative Logits
veyard     -0.72
ustomed    -0.68
uden       -0.66
ustom      -0.63
ocobo      -0.62
raltar     -0.61
ered       -0.61
OVER       -0.60
ondon      -0.60
CAST       -0.60
Positive Logits
hood       1.22
sister     0.98
sisters    0.97
hips       0.93
folk       0.90
Isabel     0.82
Kate       0.82
Carol      0.81
daughters  0.81
Maria      0.81
Activations Density: 0.007%