    Explanations

    counting or listing words

    Negative Logits
     jarring          2.03
    ти                2.02
    те                1.89
     excellence       1.80
     mediocr          1.79
     vacancies        1.69
     звез             1.68
     radically        1.67
     deliberations    1.66
    которые           1.65

    Positive Logits
    g                 2.53
    gj                2.19
    ı                 2.17
    e                 2.11
    ıyla              2.03
    ıc                2.02
    gf                2.01
    gis               2.00
    t                 1.96
    uigen             1.89

    Activation Density: 0.036%

    No Known Activations