INDEX
Explanations
words related to malformation or medical issues concerning development
Negative Logits
elle     -0.18
elt      -0.17
roll     -0.15
eler     -0.15
wald     -0.15
skin     -0.15
istol    -0.15
ppard    -0.14
eton     -0.14
jab      -0.14
Positive Logits
colm     0.19
ī        0.18
gré      0.18
andro    0.17
orie     0.17
igned    0.16
нг       0.16
adies    0.16
uzzi     0.15
inati    0.15
Activations Density 0.028%