INDEX
    Explanations
    Negative Logits
    Token      Logit
    " def"     -0.71
    "EF"       -0.66
    " inf"     -0.63
    "EV"       -0.61
    " Ev"      -0.59
    " Def"     -0.56
    "Ev"       -0.55
    " EP"      -0.53
    "def"      -0.52
    " Leon"    -0.51
    Positive Logits
    Token            Logit
    " Monfieur"      1.05
    " againſt"       0.96
    " myſelf"        0.96
    " ſche"          0.94
    "s"              0.94
    " Efq"           0.92
    " Jefus"         0.91
    " himſelf"       0.90
    " ―――――"         0.89
    " themſelves"    0.89
    Activation Density: 0.109%

    No Known Activations