INDEX
    Explanations

    references to specific numerical values or measurements

    Negative Logits
     monstruos      -0.32
    langsung        -0.30
     hiér           -0.30
     vieja          -0.29
     colheres       -0.29
     Tinggi         -0.28
     camiseta       -0.28
     Polskiego      -0.28
     defaultstate   -0.28
     paire          -0.27
    Positive Logits
    OGND            0.75
                    0.68
    AndroidJUnit    0.68
     nonUne         0.66
    oler            0.64
    siti            0.63
    Diweddarwch     0.62
                    0.62
     Coder          0.62
     {};            0.61
    Activation Density: 0.002%

    No Known Activations