    NEGATIVE LOGITS
    Self          -0.07
    .manager      -0.06
    -oriented     -0.06
    Controllers   -0.06
     Behaviour    -0.06
     oriented     -0.06
    .score        -0.06
    -beta         -0.06
     acad         -0.06
    =false        -0.06
    POSITIVE LOGITS
     Основ        0.07
     sitesi       0.07
    Fetching      0.07
                  0.07
     gelenek      0.07
    905           0.06
    emens         0.06
     PIXI         0.06
     joys         0.06
                  0.06
    Act Density 0.013%

    No Known Activations