INDEX
Explanations
names of individuals
proper names, particularly of individuals and their associated roles
Negative Logits
PDATE          -0.70
translation    -0.69
GoldMagikarp   -0.68
hindsight      -0.67
incial         -0.67
ISION          -0.67
ãĤŃ            -0.65
perty          -0.64
ATIONS         -0.64
Excellent      -0.64
Positive Logits
herself        0.93
deen           0.80
bors           0.78
ahl            0.76
ova            0.75
otte           0.74
shaw           0.74
bikini         0.73
miscar         0.71
itte           0.71
Activations Density: 0.254%