    Negative Logits
    Token         Logit
    " against"    -1.24
    "against"     -0.98
    "<eos>"       -0.94
    " Against"    -0.89
    "Against"     -0.89
    " contre"     -0.84
    " AGAINST"    -0.84
    "."           -0.74
    " versus"     -0.71
    "↵↵"          -0.69
    Positive Logits
    Token         Logit
    " myſelf"     1.27
    " Efq"        1.22
    " itſelf"     1.18
    " Jefus"      1.15
    " himſelf"    1.05
    " pleaſure"   1.05
    " ſtate"      1.02
    " uſed"       1.02
    "ſelf"        1.02
    " raiſ"       1.02
    Activation Density: 0.038%

    No Known Activations