    Explanations

    occurrences of specific characters or punctuation marks, particularly apostrophes

    Negative Logits
    Token            Logit
    " wallets"       -0.68
    " Wash"          -0.68
    " peanuts"       -0.65
    "othy"           -0.60
    " Morse"         -0.60
    " stacks"        -0.59
    " Livingston"    -0.59
    " Lars"          -0.59
    " Avery"         -0.58
    " Crosby"        -0.58

    Positive Logits
    Token            Logit
    "oeuv"            1.02
    "ét"              0.97
    "eros"            0.89
    "euro"            0.88
    "avez"            0.88
    "ava"             0.79
    "ê"               0.79
    "hist"            0.78
    "esp"             0.78
    "hab"             0.77
    Act Density 0.006%

    No Known Activations