INDEX
    Explanations

    comparisons between different options or choices

    phrases indicating comparison or asking rhetorical questions about choices

    Negative Logits
    bis           -0.70
    KO            -0.66
    iHUD          -0.62
    Board         -0.61
     poppy        -0.58
     horizont     -0.57
    Pir           -0.57
    pi            -0.57
     carriers     -0.57
    gra           -0.57
    Positive Logits
    ?!            1.20
    ?]            1.12
    ?             1.09
    ?)            1.08
    ?),           1.08
    !?            1.07
    ?).           0.99
    ?!"           0.99
    ?"            0.98
    ?'            0.98
    Activation Density 0.159%

    No Known Activations