    Negative Logits
     myſelf      -1.64
     itſelf      -1.62
     Efq         -1.54
     Monfieur    -1.51
     faſt        -1.49
     raiſ        -1.48
     auffi       -1.47
     pleaſure    -1.43
     ſeveral     -1.41
     Majefty     -1.41
    Positive Logits
    1.24
    ,
    1.09
     (
    0.98
    0.96
    .
    0.94
      
    0.93
    -
    0.88
    (
    0.84
     and
    0.84
     or
    0.84
    Activation Density: 0.857%

    No Known Activations