INDEX
    Negative Logits
    "ného"          -0.07
    "Increment"     -0.07
    "_head"         -0.07
    "Bern"          -0.07
    " года"         -0.07
    " zásob"        -0.06
    " zen"          -0.06
    "mw"            -0.06
    " undes"        -0.06
    " blueprint"    -0.06
    Positive Logits
    "ession"        0.06
    " tutor"        0.06
    " POINT"        0.06
    "ÖL"            0.06
    " hustle"       0.06
    " partner"      0.06
    " ресур"        0.06
    "-parameter"    0.06
    "_transition"   0.06
    " play"         0.06
    Activation Density: 0.063%

    No Known Activations