INDEX
    Explanations
    Negative Logits
     Monfieur     -1.13
     Cæsar        -1.05
     faſt         -1.05
     Reſ          -1.04
     Theſe        -1.02
     pleaſure     -1.01
     myſelf       -1.00
     greateſt     -1.00
    ſelf          -0.99
     Diſ          -0.99
    Positive Logits
    .             0.58
    vector        0.54
     Ar           0.52
                  0.51
     an           0.50
     and          0.50
     or           0.46
     I            0.46
    ,             0.45
     es           0.45
    Act Density 0.138%

    No Known Activations