INDEX
    Explanations
    NEGATIVE LOGITS
     quedaba      -0.35
     crees        -0.33
     developed    -0.32
     choice       -0.32
     dapur        -0.31
     tested       -0.31
     minds        -0.31
     cocina       -0.29
     Aufla        -0.28
     achieved     -0.28
    POSITIVE LOGITS
    WriteBarrier  0.63
     تضيفلها  0.63
    httphttps     0.62
    Tembelea      0.61
     lenker       0.60
    󠁢  0.60
    :✨  0.58
    ьаж           0.57
    umani         0.57
     electrolux   0.55
    Activation Density: 0.135%

    No Known Activations