INDEX
    Explanations
    Negative Logits
     quart               -0.09
    (no visible token)   -0.08
    (no visible token)   -0.07
     Remark              -0.07
    (no visible token)   -0.07
    poster               -0.07
    itening              -0.07
    Poster               -0.07
     Thess               -0.07
     conjunction         -0.07
    Positive Logits
     Jah                 0.09
    /software            0.08
     Costa               0.08
    ments                0.08
    ك                    0.07
     Utt                 0.07
    ات                   0.07
     Sally               0.07
    (no visible token)   0.07
    اتها                 0.07
    Act Density 0.023%

    No Known Activations