INDEX
    Explanations
    NEGATIVE LOGITS
    " jurisdiction"    -0.06
    " terrace"         -0.06
    "Severity"         -0.06
    "agy"              -0.06
    " إليه"            -0.06
    " завдання"        -0.06
    "uania"            -0.06
    "security"         -0.06
    " लगत"             -0.06
    " chí"             -0.06
    POSITIVE LOGITS
    " (~"              0.06
    "(volume"          0.06
    " introducing"     0.06
    " ['#"             0.06
    " Cub"             0.06
    " WOM"             0.06
    " harass"          0.06
    "\tarray"          0.06
    "ックス"            0.06
    "<const"           0.06
    Activation Density: 0.007%

    No Known Activations