Explanations
phrases indicating degrees or levels of something
NEGATIVE LOGITS
カー      -0.16
izar      -0.15
kara      -0.14
odem      -0.14
trad      -0.14
Pace      -0.14
.nlm      -0.14
sing      -0.13
ois       -0.13
undy      -0.13
POSITIVE LOGITS
ENS       0.16
anos      0.15
Ach       0.14
slightly  0.14
Ù쨧ÙĤ    0.14
ër        0.13
atti      0.13
фор       0.13
Morales   0.13
ey        0.13
Activations Density 0.033%