INDEX
    Explanations

    questioning expressions or prompts

    rhetorical questions and expressions of uncertainty

    Negative Logits
    shaw            -0.76
    icro            -0.72
    aper            -0.71
    ania            -0.69
    achy            -0.68
    undo            -0.68
    arers           -0.67
    arer            -0.67
    opsis           -0.65
    eper            -0.65
    Positive Logits
    .?              0.98
     ???            0.89
    ����            0.88
    ?,              0.86
     Huh            0.81
     Nope           0.77
    soever          0.77
     Interest       0.74
    ?:              0.73
     Nationwide     0.69
    Activation Density: 0.032%

    No Known Activations