INDEX
Explanations
words ending in the letters 'y' or 'uy'
Negative Logits
icher     -0.17
Ñĺ        -0.17
s         -0.16
gett      -0.16
icho      -0.16
iesz      -0.16
blow      -0.15
594       -0.15
sign      -0.15
adece     -0.15
Positive Logits
ama       0.28
eur       0.23
esterday  0.23
eu        0.23
outube    0.23
ellow     0.22
eah       0.22
ez        0.21
outh      0.21
ield      0.20
Activations Density 0.051%