    Explanations

    question marks and query indicators in the text

    Negative Logits
    [token missing]     -0.90
    .                   -0.76
     (                  -0.71
    ↵↵                  -0.66
     I                  -0.65
     In                 -0.61
    </i>                -0.59
     in                 -0.59
    ,                   -0.59
     a                  -0.58

    Positive Logits
    ="?                 1.39
     ?'                 1.34
     $?                 1.31
     ?...               1.30
     '?'                1.28
     !?                 1.27
    ?<                  1.22
     Majefty            1.22
     '?                 1.20
    ?!?                 1.20

    Activation Density: 0.143%

    No Known Activations