INDEX
    Explanations
    NEGATIVE LOGITS
    seq
    -0.09
    isions
    -0.09
     اکثر
    -0.08
    isiones
    -0.08
    -0.08
     tillegg
    -0.08
     Gratis
    -0.08
    .alt
    -0.07
     ایک
    -0.07
    ાવો
    -0.07
    POSITIVE LOGITS
     união
    0.08
     informed
    0.07
    Person
    0.07
     determined
    0.07
     mitz
    0.07
     निर्धारित
    0.07
     aware
    0.07
     meals
    0.07
     यून
    0.07
     Personality
    0.07
    Act Density 0.020%

    No Known Activations