    NEGATIVE LOGITS
    " лица"          -0.06
    "View"           -0.06
    "-my"            -0.06
    " cultivated"    -0.06
    " інших"         -0.06
    "atility"        -0.06
    " tội"           -0.06
    " rush"          -0.06
    "uario"          -0.06
    "many"           -0.06
    POSITIVE LOGITS
    " scholars"      0.07
    "*v"             0.07
    " СССР"          0.06
    " Schwe"         0.06
    " volunte"       0.06
    "تی"             0.06
    "....."          0.06
    "NSArray"        0.06
    "دي"             0.06
    " touched"       0.06
    Activation Density: 0.025%

    No Known Activations