INDEX
    Explanations

    punctuation marks, specifically periods and their usage in text

    Negative Logits
    <eos>
    -0.82
     I
    -0.76
      
    -0.67
     -
    -0.66
    </sup>
    -0.65
     R
    -0.64
    ↵↵
    -0.64
    ).
    -0.64
    ?
    -0.64
     He
    -0.64
    Positive Logits
     itſelf
    1.89
    ſelf
    1.84
    .")
    
    1.83
     Majefty
    1.79
     myſelf
    1.77
    ."));
    1.77
     Reſ
    1.75
    !")
    
    1.74
     houſe
    1.71
     Anſ
    1.71
    Activation Density: 0.149%

    No Known Activations