INDEX
    Explanations
    Negative Logits
    "ren"       -0.74
    "ESE"       -0.70
    "sen"       -0.70
    "={"        -0.69
    "TAIN"      -0.67
    "heim"      -0.64
    "sters"     -0.63
    "ievers"    -0.63
    "DS"        -0.63
    "rence"     -0.63
    Positive Logits
    " gotta"       1.41
    " gonna"       1.37
    " been"        1.20
    " got"         1.16
    " gotten"      1.10
    "been"         1.00
    " Been"        0.93
    " going"       0.86
    " supposed"    0.85
    " gone"        0.84
    Act Density 0.230%

    No Known Activations