    Negative Logits
    Token             Logit
    " prisons"        -0.07
    " pepper"         -0.06
    " Plugins"        -0.06
    " Southampton"    -0.06
    "ono"             -0.06
    " Victor"         -0.06
    "outfile"         -0.06
    " Moose"          -0.06
    " clay"           -0.06
    " golf"           -0.06
    Positive Logits
    Token             Logit
    " än"             0.07
    "compressed"      0.07
    " coisa"          0.06
    "equ"             0.06
    " مد"             0.06
    (not rendered)    0.06
    "—we"             0.06
    "/Branch"         0.06
    "QUE"             0.06
    "$q"              0.06
    Activation Density: 0.007%

    No Known Activations