    NEGATIVE LOGITS
    "_candidate"        -0.06
    " lda"              -0.06
    " مهندسی"           -0.06
    "_SF"               -0.06
    " predators"        -0.06
    " neu"              -0.06
    "options"           -0.06
    " hacen"            -0.06
    (token not shown)   -0.06
    " criticism"        -0.06
    POSITIVE LOGITS
    " hull"             0.14
    " Hull"             0.09
    (token not shown)   0.07
    "ourse"             0.07
    " remarkably"       0.07
    " Seriously"        0.07
    ".'/"               0.06
    " gymn"             0.06
    "яг"                0.06
    " kolej"            0.06
    Activation Density: 0.002%

    No Known Activations