INDEX
    Explanations
    Negative Logits
    " koşul"         -0.07
    " býval"         -0.07
    " Jug"           -0.06
    " aprove"        -0.06
    " ihtiy"         -0.06
    "fields"         -0.06
    "StringLength"   -0.06
    " bord"          -0.06
    " Armstrong"     -0.06
    " множе"         -0.06
    Positive Logits
    "(theta"         0.07
    "-blind"         0.07
    " ACCEPT"        0.06
    "tweet"          0.06
                     0.06
    ".AC"            0.06
    "vision"         0.06
    "PMENT"          0.06
    " Associate"     0.06
    ".CREATE"        0.06
    Activation Density: 0.011%

    No Known Activations