INDEX
    Explanations
    Negative Logits
     myſelf        -1.30
     Monfieur      -1.20
     itſelf        -1.18
     Majefty       -1.11
     doubtnut      -1.10
     Efq           -1.07
     ainfi         -1.07
     Chriftian     -1.02
    ſelf           -1.01
    ſelves         -1.01
    Positive Logits
     "              0.76
     “              0.75
     ‘              0.64
     '              0.63
     est            0.63
    ↵↵              0.61
    /               0.60
    ,               0.58
    "               0.57
     the            0.56
    Activation Density: 0.027%

    No Known Activations