INDEX
    Explanations

    questions starting with 'what'

    Negative Logits
    Interstitial    -0.81
    DERR            -0.74
    mens            -0.73
    renheit         -0.70
    apsed           -0.70
    Said            -0.69
    agos            -0.66
    anus            -0.66
    uffer           -0.65
    ache            -0.64

    Positive Logits
    ?                0.95
    ?'"              0.90
    ?'               0.89
     exactly         0.87
    !?               0.85
    ?"               0.84
    ?".              0.84
    ?ãĢį             0.81
    ?",              0.81
     happens         0.79

    Act Density 0.050%

    No Known Activations