INDEX
    Explanations

    .stereotype

    Negative Logits
    axed          -0.06
    ABL           -0.06
    Σ             -0.06
     پژوهش        -0.06
     новый        -0.06
     lively       -0.06
     KS           -0.06
    positive      -0.06
                  -0.05
    λον           -0.05
    Positive Logits
    .stereotype   0.18
     Scho         0.07
    nergie        0.07
    "type         0.07
     其他         0.06
     О            0.06
    (delete       0.06
     seçenek      0.06
    려고          0.06
     Regardless   0.06
    Activation Density: 0.000%

    No Known Activations