    Negative Logits
    Token            Logit
    (not shown)      -0.98
    " допомогою"     -0.86
    "²."             -0.85
    " capable"       -0.81
    " estates"       -0.80
    (not shown)      -0.78
    "ilerini"        -0.77
    " not"           -0.77
    "ogenen"         -0.77
    "MSc"            -0.75
    Positive Logits
    Token            Logit
    " your"          1.30
    " you"           1.27
    "None"           1.11
    "传统"           1.03
    " none"          1.02
    " want"          1.00
    "want"           1.00
    "%)$"            0.99
    " None"          0.96
    (not shown)      0.95
    Activation Density: 0.042%

    No Known Activations