INDEX
    Explanations
    NEGATIVE LOGITS
     enzyme        -0.07
    ινων           -0.06
     retention     -0.06
    normally       -0.06
     ناح           -0.06
     year          -0.06
    ійно           -0.06
     prisons       -0.06
     Gale          -0.06
     Booking       -0.06
    POSITIVE LOGITS
    مر             0.07
     republican    0.07
    (USER          0.07
     musel         0.07
    _loader        0.06
    ued            0.06
     predicted     0.06
     Orr           0.06
     found         0.06
     experimented  0.06
    Activation Density: 0.058%

    No Known Activations