    Explanations

    question words like "What" or "Who"

    occurrences of the word "Wh"

    Negative Logits
    Token           Logit
    " Grande"       -0.67
    "WARE"          -0.67
    " Duo"          -0.65
    " Awakening"    -0.61
    "ULAR"          -0.61
    " Blazers"      -0.61
    " Strauss"      -0.61
    " Letter"       -0.59
    " Barton"       -0.59
    " sten"         -0.57
    Positive Logits
    Token           Logit
    "istle"         1.43
    "ilst"          1.35
    "irlwind"       1.26
    "olly"          1.21
    "ispers"        1.20
    "soever"        1.17
    "isky"          1.16
    "olen"          1.11
    "irling"        1.10
    "ichever"       1.08
    Activation Density: 0.023%

    No Known Activations