    Negative Logits
    Token          Logit
    " violets"     -1.18
    "osen"         -1.17
    " myſelf"      -1.13
    " Monfieur"    -1.07
    " Efq"         -1.05
    " ſtate"       -1.03
    " himſelf"     -1.02
    " Majefty"     -1.02
    " itſelf"      -1.00
    " raiſ"        -1.00
    Positive Logits
    Token          Logit
    " von"         0.51
    "bart"         0.50
    " Von"         0.48
    " J"           0.48
                   0.47
    " pr"          0.45
    " si"          0.45
    "</strong>"    0.45
    " v"           0.45
    " Par"         0.44
    Activation Density: 0.060%

    No Known Activations