INDEX
    Negative Logits
     управление         -0.08
     birlik             -0.07
    هب                  -0.07
     wholly             -0.07
     outreach           -0.07
    _groups             -0.07
    (no token text)     -0.07
     نب                 -0.07
    (no token text)     -0.07
     landschap          -0.07
    Positive Logits
    side                0.09
     siebie             0.08
     docks              0.08
     dumpsters          0.08
     niego              0.08
     midnight           0.08
     forget             0.08
    ariki               0.07
     للس                0.07
    expiry              0.07
    Activation Density: 0.032%

    No Known Activations