Explanations
references to young males and females in various contexts
Negative Logits
Token       Logit
seamnă      -0.64
ÑO          -0.59
Eocene      -0.54
prochains   -0.54
tuturor     -0.54
Darío       -0.53
isier       -0.52
anet        -0.52
siger       -0.52
crites      -0.52
Positive Logits
Token           Logit
who             0.94
woman           0.92
gentleman       0.83
guy             0.82
person          0.80
man             0.78
bloke           0.74
TestBed         0.73
RenderAtEndOf   0.72
himself         0.67
Activations Density 0.281%