INDEX
    Negative Logits
    Token      Value
    " is"      1.18
    "it"       1.11
    " in"      1.09
    " are"     1.09
    " "        1.08
    " L"       1.08
    " a"       1.03
    "an"       1.03
    "unic"     1.00
    " Un"      1.00
    Positive Logits
    Token             Value
    " сейчас"         2.26
    [missing]         2.16
    [missing]         2.09
    "<unused1902>"    2.08
    "𒂵"               2.08
    [missing]         2.06
    "<unused332>"     2.06
    "<unused1208>"    2.05
    "𒁹"               2.05
    [missing]         2.04
    Activation Density: 0.549%

    No Known Activations