INDEX
    Explanations

    names or initials of individuals

    Negative Logits
     Eſ         -0.99
     Beſ        -0.94
     Anſ        -0.92
     Theſe      -0.91
     Conſ       -0.91
     againſt    -0.87
    #+#         -0.87
     Reſ        -0.87
     itſelf     -0.86
     Perſ       -0.85
    Positive Logits
     J          0.88
     C          0.82
     K          0.81
     W          0.81
     O          0.80
     D          0.78
     A          0.78
     M          0.77
     B          0.77
     G          0.76
    Act Density 0.152%

    No Known Activations