INDEX
    Explanations
    Negative Logits
    Token           Logit
    "lications"     -0.07
    " şik"          -0.07
    "ALTH"          -0.06
    " Breaking"     -0.06
    " Florida"      -0.06
    (not shown)     -0.06
    " concede"      -0.06
    " viewpoint"    -0.06
    ".cgColor"      -0.06
    " contracts"    -0.06
    Positive Logits
    Token           Logit
    "ervlet"        0.06
    ".sap"          0.06
    (not shown)     0.06
    ",port"         0.06
    "_sh"           0.06
    " Wr"           0.06
    " heter"        0.06
    "tol"           0.06
    "ther"          0.06
    ".props"        0.06
    Act Density 0.185%

    No Known Activations