INDEX
    Explanations
    Negative Logits
    בא  -0.08
     trimes  -0.08
    Tex  -0.07
    .med  -0.07
    _med  -0.07
     mesi  -0.07
     battlefield  -0.07
     volv  -0.07
     Schne  -0.07
     preciso  -0.07
    Positive Logits
     კლას  0.08
     უდ  0.08
    (no visible token)  0.08
     ტიპ  0.08
     iub  0.08
    (no visible token)  0.08
    ラック  0.08
     almeno  0.08
     ახალგაზრდა  0.08
     ქალი  0.08
    Activation Density: 0.006%

    No Known Activations