INDEX
    Explanations

    punctuation marks and formatting symbols

    Negative Logits
    Token         Logit
    " myſelf"     -1.71
    " itſelf"     -1.63
    " Efq"        -1.62
    " Jefus"      -1.61
    " ―――――"      -1.60
    " doubtnut"   -1.60
    "ſelves"      -1.59
    "ſelf"        -1.59
    " ་་"         -1.55
    " Anſ"        -1.53
    Positive Logits
    Token         Logit
    "."           1.32
    ","           1.12
    ";"           0.99
    (blank)       0.98
    "<eos>"       0.97
    " ("          0.94
    (whitespace)  0.92
    ")"           0.92
    "↵↵"          0.92
    (blank)       0.91
    Activation Density: 0.183%

    No Known Activations