INDEX
Explanations
phrases indicating quantity or frequency
Negative Logits
ula       -0.17
arness    -0.16
ola       -0.16
Drv       -0.15
Kong      -0.15
ion       -0.14
Kir       -0.14
ภาค       -0.14
ung       -0.13
Mens      -0.13
Positive Logits
lant      0.18
oppable   0.16
ago       0.15
otch      0.15
CALE      0.15
nant      0.15
ácil      0.14
years     0.14
itoris    0.14
doors     0.14
Activations Density 0.094%