    Explanations

    foreign languages

    Negative Logits
    Token            Logit
    " акции"         -0.08
    "factory"        -0.08
    "<["             -0.07
    "(\'"            -0.07
    (not shown)      -0.07
    "README"         -0.07
    "hera"           -0.07
    " miserable"     -0.07
    " uninter"       -0.07
    " hor"           -0.07
    Positive Logits
    Token            Logit
    " والاج"         0.11
    "ALLY"           0.08
    " terrestrial"   0.08
    (not shown)      0.08
    " femenino"      0.08
    " féminin"       0.07
    " Faux"          0.07
    " cél"           0.07
    " illusions"     0.07
    (not shown)      0.07
    Activation Density: 0.055%

    No Known Activations