    Negative Logits
    " banging"    -0.08
    " prep"       -0.08
    " oint"       -0.07
    " coating"    -0.07
    " toku"       -0.07
    " dém"        -0.07
    " Prec"       -0.07
    " uitle"      -0.07
    "avali"       -0.07
    " totally"    -0.07
    Positive Logits
    "_placeholder"    0.10
    ".placeholder"    0.09
    "placeholder"     0.09
    "_goal"           0.08
    "-placeholder"    0.08
    "_error"          0.08
    "Placeholder"     0.08
    " infamous"       0.07
    "_failure"        0.07
    "goal"            0.07
    Activation Density: 0.007%

    No Known Activations