    Explanations

    определяется (Russian: "is determined")

    Negative Logits
        " stir"        -0.08
        (blank)        -0.08
        " hech"        -0.07
        "ion"          -0.07
        " dahil"       -0.07
        " ionic"       -0.07
        "boys"         -0.07
        " staining"    -0.07
        " gay"         -0.07
        " Clouds"      -0.07
    Positive Logits
        " Schön"       0.08
        " lim"         0.08
        (blank)        0.07
        "atern"        0.07
        (blank)        0.07
        " Semin"       0.07
        "Nk"           0.07
        " Ker"         0.07
        " condu"       0.07
        "cow"          0.07
    Activation Density: 0.001%

    No Known Activations