    Negative Logits
    Token      Logit
    " Sa"      -0.93
    " Re"      -0.75
    " Ph"      -0.71
    "sa"       -0.70
    " Qu"      -0.65
    " Res"     -0.65
    " Es"      -0.63
    " Th"      -0.61
    " Pe"      -0.61
    " ph"      -0.61
    Positive Logits
    Token          Logit
    " myſelf"      1.23
    " raiſ"        1.12
    " Efq"         1.06
    " himſelf"     1.05
    " uſed"        1.02
    " itſelf"      1.00
    " auffi"       0.99
    " Jefus"       0.99
    "ſelf"         0.97
    " ſta"         0.94
    Activation Density: 0.131%

    No Known Activations