INDEX
    Explanations

    interrogative phrases, particularly questions

    Negative Logits
     long       -0.67
    </i>        -0.64
    aure        -0.63
    navbar      -0.62
    oire        -0.62
     Alu        -0.62
     AL         -0.61
     Stoner     -0.60
    aus         -0.60
    lug         -0.58
    Positive Logits
    %?          2.03
    ?!?         1.83
    ’?          1.73
    ?"          1.70
    ?}          1.70
    !?          1.68
    }?          1.67
    ?”          1.64
    $?          1.62
    ?           1.59
    Activation Density: 0.189%

    No Known Activations