INDEX
    Explanations
    Negative Logits
        " muita"         -0.08
        " Patel"         -0.08
        " educativas"    -0.07
        (blank)          -0.07
        " مل"            -0.07
        " thoroughly"    -0.07
        " great"         -0.07
        " Bert"          -0.07
        " Patri"         -0.07
        "ಾಶ"             -0.07
    Positive Logits
        " లేక"           0.08
        "coord"          0.08
        " 아니"          0.08
        " foe"           0.08
        "/she"           0.08
        " hés"           0.08
        " 물론"          0.07
        " 아닌"          0.07
        " 아니라"        0.07
        "inand"          0.07
    Activation Density: 0.099%

    No Known Activations