INDEX
    Explanations
    Negative Logits
    Token           Logit
    ".After"        -0.06
    " عزیز"         -0.06
    ".dense"        -0.06
    " آمد"          -0.06
    " форму"        -0.06
    "uploaded"      -0.06
    " позволяет"    -0.06
    "sol"           -0.06
    " شکن"          -0.06
    "аю"            -0.06
    Positive Logits
    Token           Logit
    " Ukra"         0.07
    "intro"         0.07
    ".channel"      0.06
    " Serial"       0.06
    " біл"          0.06
    "armac"         0.06
                    0.06
    " Jennifer"     0.06
    " Kv"           0.06
    " Kat"          0.06
    Act Density 0.004%

    No Known Activations