    Negative Logits
    Token            Logit
    "/color"         -0.07
    " PAT"           -0.06
    " Kul"           -0.06
    "ogen"           -0.06
    (not shown)      -0.06
    (not shown)      -0.06
    "kick"           -0.06
    "igmat"          -0.06
    "-win"           -0.06
    "PAT"            -0.06
    Positive Logits
    Token            Logit
    "locking"        0.08
    " silly"         0.07
    "etsy"           0.07
    " corr"          0.07
    " overly"        0.07
    " ComboBox"      0.06
    "aniel"          0.06
    " MISSING"       0.06
    "loan"           0.06
    " stretches"     0.06
    Activation Density: 0.015%

    No Known Activations