Explanations
references to specific numerical values or measurements
Negative Logits
monstruos     -0.32
langsung      -0.30
hiér          -0.30
vieja         -0.29
colheres      -0.29
Tinggi        -0.28
camiseta      -0.28
Polskiego     -0.28
defaultstate  -0.28
paire         -0.27
Positive Logits
OGND          0.75
᠁             0.68
AndroidJUnit  0.68
nonUne        0.66
oler          0.64
siti          0.63
Diweddarwch   0.62
(not shown)   0.62
Coder         0.62
{};           0.61
Activations Density: 0.002%