INDEX
    Explanations

    more, less, faster, fewer

    NEGATIVE LOGITS
    (token missing)  0.45
    CATEG            0.42
     статье          0.42
    uttosto          0.42
     HTTPException   0.42
     pravo           0.42
    hankelijk        0.41
    ினால்             0.41
    reet             0.40
    ెంబ              0.40
    POSITIVE LOGITS
     more      1.76
     fewer     1.66
     stronger  1.56
     более     1.52
     less      1.52
     יותר      1.51
     lebih     1.50
     daha      1.44
     უფრო      1.44
     harder    1.43
    Activation Density: 0.333%

    No Known Activations