INDEX
    Explanations
    NEGATIVE LOGITS
    Token                   Logit
     reven                  -0.07
    (token text missing)    -0.07
    AZY                     -0.07
     Uganda                 -0.07
     حساب                   -0.06
    علی                     -0.06
     Carmen                 -0.06
     rebels                 -0.06
     Geld                   -0.06
     ullam                  -0.06
    POSITIVE LOGITS
    Token                   Logit
     wave                   0.07
    vt                      0.07
    दर                      0.07
    (token text missing)    0.06
     guide                  0.06
     query                  0.06
    ofday                   0.06
     integrate              0.06
    (token text missing)    0.06
    SYM                     0.06
    Act Density 0.004%

    No Known Activations