    Negative Logits
    " an"     1.66
    " a"      1.48
    " it"     1.38
    " are"    1.37
    " you"    1.28
    " be"     1.27
    " in"     1.24
    " on"     1.09
    " e"      1.07
    " \"      1.07
    Positive Logits
    "ET"          1.20
    (blank)       1.05
    "ő"           1.04
    "average"     1.02
    "aver"        1.01
    "averaged"    0.98
    "ра"          0.96
    " среднем"    0.96
    "<0x80>"      0.93
    "توان"        0.93
    Activation Density: 0.022%

    No Known Activations