INDEX
    Explanations

    Twitter usernames

    occurrences of the end-of-text token

    NEGATIVE LOGITS
     shortage      -0.75
     fulfilling    -0.73
     bite          -0.72
     venom         -0.72
     tense         -0.72
     claws         -0.72
     loan          -0.72
     bites         -0.71
     captcha       -0.71
     pacing        -0.71
    POSITIVE LOGITS
    Writ        1.41
    _           1.36
    Official    1.29
    Stud        1.23
    Reports     1.23
    Ide         1.22
    News        1.19
    Jew         1.18
    WithNo      1.17
    Games       1.17
    Activation Density: 0.163%

    No Known Activations