    Negative Logits
    Token              Logit
    (blank)            -0.08
    " rant"            -0.07
    (blank)            -0.07
    (blank)            -0.06
    "(Task"            -0.06
    " assertEquals"    -0.06
    "تين"              -0.06
    "ASSERT"           -0.06
    "fire"             -0.06
    "Sing"             -0.06
    Positive Logits
    Token              Logit
    (blank)            0.06
    " plus"            0.06
    " slows"           0.06
    "-dollar"          0.06
    " hoy"             0.05
    "selectors"        0.05
    "handled"          0.05
    "Uses"             0.05
    "�이"              0.05
    ".Is"              0.05
    Activation Density: 0.041%

    No Known Activations