    Negative Logits
    Token          Logit
    " itſelf"      -1.06
    " Theſe"       -1.06
    " Monfieur"    -1.03
    " myſelf"      -1.03
    " Majefty"     -1.02
    " pleaſure"    -1.02
    " Efq"         -1.02
    " purpoſe"     -0.98
    " iſt"         -0.97
    " Chriftian"   -0.95

    Positive Logits
    Token          Logit
    "kespea"       0.71
    " Smiths"      0.68
    "fords"        0.65
    "DCs"          0.63
    " Davids"      0.60
    "CDs"          0.58
    "ptons"        0.58
    "SCs"          0.56
    "ansons"       0.55
    "ECs"          0.54

    Activation Density: 0.948%

    No Known Activations