INDEX
    Explanations
    Negative Logits
    Token          Logit
    "رياض"         -0.07
    " leth"        -0.07
    " dosud"       -0.07
    " Cz"          -0.06
    "้ส"            -0.06
    " insurers"    -0.06
    " relig"       -0.06
    " الزر"        -0.06
    " EGL"         -0.06
    " çevre"       -0.06
    Positive Logits
    Token          Logit
    " wanting"     0.08
    "wanted"       0.08
    "UNT"          0.08
    "WS"           0.08
    " loves"       0.08
    "want"         0.07
    " wants"       0.07
    "****/↵"       0.07
    "VAL"          0.07
    "unt"          0.07
    Activation Density: 0.018%

    No Known Activations