INDEX
    NEGATIVE LOGITS
    Token         Logit
    qtt           -0.07
     истории      -0.07
    anye          -0.06
    retweeted     -0.06
    Statistic     -0.06
    PAIR          -0.06
     поба         -0.06
    iena          -0.06
    ,"\           -0.06
    βολ           -0.06
    POSITIVE LOGITS
    Token         Logit
     allow        0.07
    (red          0.07
     enable       0.06
                  0.06
     bring        0.06
     CREATE       0.06
     vanish       0.06
     elektron     0.06
                  0.06
     uveden       0.06
    Activation Density 0.015%

    No Known Activations