INDEX
    Explanations

    occurrences of specific capital letters, likely referring to acronyms or titles

    Negative Logits
    Token        Logit
     Anſ         -1.72
     Theſe       -1.66
     Diſ         -1.59
     faſt        -1.53
     Reſ         -1.52
     Houſe       -1.52
     Monfieur    -1.48
     Beſ         -1.46
     itſelf      -1.46
     ſeveral     -1.45
    Positive Logits
    Token        Logit
     G            2.11
     M            2.10
     D            2.09
     P            2.08
     S            2.07
     B            2.05
     R            2.04
     L            2.04
     C            2.03
     W            2.03
    Activation Density: 0.571%

    No Known Activations