    Explanations

    the word "due", and also names

    Negative Logits
    (blank)          -1.33
    (whitespace)     -1.20
    (blank)          -1.02
    " ("             -0.96
    (whitespace)     -0.91
    "_"              -0.90
    " I"             -0.90
    "↵↵"             -0.88
    ":"              -0.85
    "<eos>"          -0.84
    Positive Logits
    " Majefty"       1.97
    " myſelf"        1.95
    " Efq"           1.82
    " purpoſe"       1.80
    " itſelf"        1.75
    " Jefus"         1.72
    " pleaſure"      1.70
    " himſelf"       1.66
    "ſelf"           1.63
    " themſelves"    1.62
    Activation Density: 0.880%

    No Known Activations