    Explanations

    mentions of user-related identifiers, specifically the term "username"

    Negative Logits
    Token           Value
    "ValueStyle"    -1.03
    " Seeder"       -0.93
    "دانشنامهٔ"      -0.93
    "awaiter"       -0.83
    " Moq"          -0.82
    "setof"         -0.81
    "apollo"        -0.80
    " Lyra"         -0.79
    "*}$"           -0.79
    "Tikang"        -0.79
    Positive Logits
    Token           Value
    " username"     0.77
    " whoſe"        0.73
    " Wur"          0.73
    "username"      0.71
    "mistic"        0.71
    " Theſe"        0.71
    "Username"      0.68
    " Wür"          0.67
    " setUsername"  0.64
    " Username"     0.64
    Activation Density: 0.117%

    No Known Activations