INDEX
    Explanations

    names or designations of individuals or groups

    Negative Logits
     itſelf        -0.95
     faſt          -0.88
     myſelf        -0.88
     againſt       -0.81
     uſe           -0.77
     iſt           -0.74
     ―――――         -0.73
     ་་            -0.73
     Theſe         -0.72
     themſelves    -0.71
    Positive Logits
     G             1.12
     M             1.04
     P             1.03
     W             1.03
     K             1.01
     B             1.01
     Z             0.99
     F             0.99
     L             0.98
     S             0.98
    Activation Density: 0.945%

    No Known Activations