INDEX
    Explanations
    Negative Logits
        Token                 Logit
        " الر"                -0.07
        " complex"            -0.07
        ".dd"                 -0.07
        " dengue"             -0.07
        " different"          -0.07
        " distinguishing"     -0.07
        ".D"                  -0.07
        " access"             -0.06
        " bridges"            -0.06
        "iós"                 -0.06
    Positive Logits
        Token                 Logit
        " 三"                 0.09
        " mongwe"             0.09
        (token not rendered)  0.08
        " Hoog"               0.08
        " urine"              0.08
        "יַ"                   0.08
        "yai"                 0.08
        " yone"               0.08
        " kya"                0.08
        " RCA"                0.08
    Activation Density: 0.024%

    No Known Activations