INDEX
    Explanations

    references to authors and their publications in scientific contexts

    Negative Logits
    Token            Logit
    " Diſ"           -0.99
    " ſeveral"       -0.98
    " Anſ"           -0.97
    " itſelf"        -0.95
    " Theſe"         -0.94
    " Reſ"           -0.94
    " ་་"            -0.93
    " Beſ"           -0.92
    " themſelves"    -0.90
    " myſelf"        -0.86
    Positive Logits
    Token            Logit
    " J"             1.86
    "J"              1.60
    " j"             1.47
    "j"              1.06
    " L"             1.04
    " K"             1.00
    " M"             0.98
    " l"             0.96
    " V"             0.95
    " W"             0.92
    Act Density 0.158%

    No Known Activations