INDEX
    Explanations
    Negative Logits
    Token        Logit
    Forecast     -0.07
     erotic      -0.07
    lossen       -0.07
    .gen         -0.07
    _sub         -0.07
    >${          -0.07
    cene         -0.07
     menace      -0.06
                 -0.06
     underway    -0.06
    Positive Logits
    Token                  Logit
     amplified             0.12
     amplify               0.11
     ampl                  0.10
    Miller                 0.08
     Ampl                  0.08
    -url                   0.07
     dil                   0.06
     gracefully            0.06
     amplifier             0.06
     disproportionately    0.06
    Activation Density: 0.003%

    No Known Activations