    Negative Logits
    Token       Logit
    Ancient     -1.69
     Ancient    -1.66
    ancient     -1.61
     ancient    -1.58
     ANCI       -1.45
     Reſ        -1.30
     Efq        -1.28
     Eſ         -1.16
     Diſ        -1.15
     ―――――      -1.15
    Positive Logits
    Token       Logit
    (blank)     0.73
     R          0.64
     man        0.59
     "          0.58
     F          0.57
     World      0.57
     H          0.56
    ,           0.54
     he         0.54
     V          0.54
    Activation Density: 0.086%

    No Known Activations