INDEX
    Explanations
    NEGATIVE LOGITS
     divorced       -0.07
    '_              -0.06
     weap           -0.06
    ‚ط              -0.06
    lica            -0.06
    -shop           -0.06
     Mostly         -0.06
     Klo            -0.06
                    -0.06
     counsel        -0.06
    POSITIVE LOGITS
    بان             0.07
     Lag            0.07
     atmospheric    0.06
    ه               0.06
    _particles      0.06
    >\<^            0.06
     \<^            0.06
    =`              0.06
    (end            0.06
     imprint        0.06
    Act Density 0.013%

    No Known Activations