INDEX
    Explanations

    phrases indicating desire or preference

    expressions questioning desires or preferences

    Negative Logits
    bis         -0.61
     Letter     -0.59
    notations   -0.59
     Oaks       -0.56
    pc          -0.56
    rams        -0.55
    Runner      -0.55
     Fargo      -0.55
     variants   -0.54
     requested  -0.54
    Positive Logits
    ?)          1.18
    ?),         1.16
    ?!          1.14
    ?).         1.10
    ?!"         1.09
    !?"         1.07
    ?"          1.01
    ?           1.00
    !?          0.99
    ?'"         0.92
    Act Density 0.103%

    No Known Activations