    Explanations

    words indicating relative measurement

    comparisons

    Negative Logits
    Token            Logit
    " Monfieur"      -1.88
    " myſelf"        -1.85
    " Efq"           -1.84
    " Majefty"       -1.66
    " ſeveral"       -1.63
    " raiſ"          -1.57
    " Diſ"           -1.56
    " Jefus"         -1.56
    " Reſ"           -1.55
    " themſelves"    -1.55
    Positive Logits
    Token            Logit
    (not shown)      0.92
    ","              0.91
    " ("             0.91
    "-"              0.84
    (whitespace)     0.77
    " ["             0.77
    " y"             0.74
    "f"              0.71
    "ing"            0.69
    "."              0.69
    Activation Density: 0.979%

    No Known Activations