    Explanations

    mentions of social media handles and other online identifiers

    Negative Logits
    Token               Logit
    " initial"          -0.65
    "UnsafeEnabled"     -0.60
    " («"               -0.57
    "UrlResolution"     -0.56
    "Diwedd"            -0.55
    " BrowserModule"    -0.54
    "примеча"           -0.53
    " «"                -0.50
    " CommonModule"     -0.50
    " للاسماء"          -0.50
    Positive Logits
    Token               Logit
    "Official"          0.79
    " Official"         0.76
    "official"          0.76
    " OFFICIAL"         0.69
    "#!/"               0.69
    "enumii"            0.66
    "thereal"           0.65
    " Oficial"          0.64
    "iam"               0.62
    "thisis"            0.61
    Activation Density: 0.148%

    No Known Activations