    Negative Logits
     decrease       -0.08
     increase       -0.07
     transitions    -0.07
     increases      -0.07
     increased      -0.07
     decreasing     -0.07
     decreased      -0.07
     Gaines         -0.07
     Jin            -0.07
    REW             -0.06
    Positive Logits
    ƒ                0.07
    (helper          0.07
     kapas           0.07
    equip            0.07
    _PED             0.07
     دفتر            0.07
    graphic          0.07
     вак             0.06
     العربي          0.06
     amb             0.06
    Activation Density: 0.008%

    No Known Activations