INDEX
    Explanations
    Negative Logits
    Token         Logit
    " itſelf"     -1.25
    " myſelf"     -1.25
    " Efq"        -1.11
    " Majefty"    -1.06
    " Jefus"      -1.04
    " Theſe"      -1.02
    " himſelf"    -1.01
    " doubtnut"   -0.98
    " Monfieur"   -0.98
    " raiſ"       -0.98
    Positive Logits
    Token         Logit
    "e"           0.73
    "i"           0.67
    "a"           0.57
    " /"          0.56
    " ("          0.54
    "/"           0.54
    "o"           0.53
    " to"         0.52
    " changed"    0.50
    " that"       0.49
    Activation Density: 1.537%

    No Known Activations