INDEX
    Explanations
    Negative Logits
    Token            Logit
    " Think"         -0.06
    " Scope"         -0.06
    " rhet"          -0.06
    "(question"      -0.06
    " murdering"     -0.06
    " التي"          -0.06
    " guru"          -0.06
    "ّ"              -0.05
    "Dream"          -0.05
    "descriptor"     -0.05
    Positive Logits
    Token            Logit
    "させ"           0.07
    "OLUME"          0.07
                     0.07
    " Ober"          0.07
    "expanded"       0.07
    " schl"          0.07
    "generated"      0.07
    "وزيع"           0.07
    " massasje"      0.06
    "elocity"        0.06
    Activation Density: 0.059%

    No Known Activations