INDEX
    Negative Logits
    Token               Logit
    " Assass"           -0.07
    "(item"             -0.06
    "/licenses"         -0.06
    "Filtered"          -0.06
    (blank in source)   -0.06
    "<Transform"        -0.06
    "Annotation"        -0.06
    "Categories"        -0.06
    " Wrestle"          -0.06
    " ------"           -0.06
    Positive Logits
    Token               Logit
    "imest"             0.07
    "-not"              0.07
    "ROT"               0.07
    "codegen"           0.07
    "-but"              0.06
    " shipped"          0.06
    "loth"              0.06
    "osen"              0.06
    " Millenn"          0.06
    "_admin"            0.06
    Act Density 0.034%

    No Known Activations