    Negative Logits
    "lewati"        -0.47
    "<?\n"          -0.47
    " Encuentra"    -0.46
    " Frederik"     -0.45
    " Atenas"       -0.45
    "ers"           -0.43
    " voici"        -0.43
    "wüns"          -0.42
    " nghĩ"         -0.41
    "regalo"        -0.41
    Positive Logits
    " automobile"     1.73
    " Automobile"     1.66
    "Automobile"      1.63
    "automobile"      1.45
    " automobiles"    1.41
    " Automobiles"    1.32
    " automóviles"    1.06
    " Automobil"      1.03
    " automóvil"      1.01
    " automobil"      0.97
    Activation Density: 0.004%

    No Known Activations