INDEX
    Explanations

    interrogative sentences or questions

    Negative Logits
    aus         -0.71
    𝙫           -0.71
    böz         -0.70
    oire        -0.69
    𝙜           -0.69
    navbar      -0.68
     Bradley    -0.67
    ade         -0.65
    a           -0.65
     Alu        -0.65
    Positive Logits
    %?          1.88
    ?           1.71
    ؟           1.64
    ?!?         1.62
    ?}          1.55
    ’?          1.54
    $?          1.53
    ?"          1.49
    !?          1.48
                1.45
    Act Density 0.133%

    No Known Activations