INDEX
    Explanations

    disabilities

    Negative Logits
    "_bp"           -0.07
    " Attention"    -0.06
    " нового"       -0.06
    "cam"           -0.06
    "(EXIT"         -0.06
    " prosperous"   -0.06
    " scooter"      -0.06
    " osp"          -0.06
    " listar"       -0.06
    "endir"         -0.06
    Positive Logits
    " SEX"          0.07
    "ilia"          0.06
    "define"        0.06
    (blank)         0.06
    " dozen"        0.06
    " sack"         0.06
    "145"           0.06
    " dread"        0.06
    "HZ"            0.06
    (blank)         0.06
    Activation Density 0.011%

    No Known Activations