INDEX
    Explanations
    Negative Logits
    Token        Logit
    "ANEL"       -0.07
    "OLVE"       -0.07
    "cales"      -0.06
    "OUND"       -0.06
    "iping"      -0.06
    "riendly"    -0.06
    "هد"         -0.06
    "ología"     -0.06
    "colon"      -0.06
    "ooled"      -0.06
    Positive Logits
    Token             Logit
    " stim"           0.32
    " Stim"           0.15
    "stim"            0.13
    " Contributor"    0.07
    ".flip"           0.07
    " Req"            0.07
    "])),↵"           0.07
    " performer"      0.07
    "=DB"             0.07
    " steroid"        0.06
    Activation Density: 0.001%

    No Known Activations