INDEX
    Explanations

    intensifiers, particularly the word "very."

    Negative Logits
    ecut      -0.16
    ehr       -0.15
    oundary   -0.15
    iverse    -0.15
    inki      -0.14
    EMALE     -0.14
    esc       -0.14
     swe      -0.14
    ushman    -0.14
    ³         -0.13
    Positive Logits
    anni      0.14
     이어     0.13
    aylight   0.13
     nudity   0.13
    steen     0.13
    uti       0.13
    ijo       0.13
    .ham      0.13
    ken       0.13
     naked    0.13
    Act Density 0.044%

    No Known Activations