INDEX
    Explanations

    multiple languages

    NEGATIVE LOGITS
     غیر    -0.08
     jij    -0.08
     Finder    -0.08
     şah    -0.08
     Kaj    -0.08
     wij    -0.07
    301    -0.07
    Finder    -0.07
    itive    -0.07
     خل    -0.07
    POSITIVE LOGITS
     życie    0.08
     characterize    0.08
     characterized    0.08
     khiến    0.08
    อยู่    0.08
     auster    0.08
     decisões    0.08
    -enter    0.08
     прын    0.08
     triunfo    0.08
    Activation Density: 0.088%

    No Known Activations