INDEX
    Explanations
    Negative Logits
    " itſelf"      -0.82
    " Jefus"       -0.80
    " pleaſure"    -0.79
    " triom"       -0.75
    " ſtate"       -0.74
    " houſe"       -0.73
    " Chriftian"   -0.71
    " purpoſe"     -0.69
    " ſeveral"     -0.68
    " iſt"         -0.68
    Positive Logits
    " be"          1.35
    " Be"          1.21
    "Be"           1.14
    "ToBe"         1.02
    " shouldBe"    1.01
    " BE"          1.00
    " être"        1.00
    "be"           0.94
    " able"        0.91
    " бъдат"       0.90
    Act Density 0.217%

    No Known Activations