Explanations
phrases related to individuals or specific people
references to significant individuals or public figures
Negative Logits
selves        -0.78
husbands      -0.77
alities       -0.69
ascript       -0.66
miscarriage   -0.66
acas          -0.64
lishes        -0.64
¶             -0.64
mothers       -0.60
moms          -0.59
Positive Logits
himself       0.96
personally    0.73
sov           0.72
Himself       0.71
Emin          0.70
ithe          0.69
Rai           0.66
redd          0.66
perse         0.66
chenko        0.65
Activations Density: 1.464%