INDEX
    Explanations

    punctuation marks, specifically exclamation points and question marks

    Negative Logits
    "emade"         -0.16
    "opup"          -0.15
    "quential"      -0.15
    " Irr"          -0.14
    "emens"         -0.14
    "isd"           -0.14
    "Stamped"       -0.14
    "Criterion"     -0.13
    "ipple"         -0.13
    " Hardy"        -0.13
    Positive Logits
    " Josh"         0.15
    " Central"      0.14
    " Liber"        0.14
    " Vib"          0.14
    " Liberation"   0.14
    "tt"            0.14
    "tml"           0.14
    " Driver"       0.14
    " Era"          0.14
    "ificador"      0.13
    Act Density 0.128%

    No Known Activations