INDEX
    Explanations

    references to specific social media interactions and metrics

    Comes after "user" at the start of a turn

    graphical interface elements

    Negative Logits
    Token            Logit
    "Kariera"        -0.40
    "ecution"        -0.39
    "})`"            -0.38
    "NextPage"       -0.37
    " especiales"    -0.37
    "########."      -0.37
    "icated"         -0.34
    " sages"         -0.34
    " Pagina"        -0.34
    "mistry"         -0.33
    Positive Logits
    Token                Logit
    "twimg"              0.84
    "httphttps"          0.82
    " snippetHide"       0.65
    "UserScript"         0.59
    "setupUi"            0.53
    " NSCoder"           0.49
    "ModelSerializer"    0.47
    "providedIn"         0.46
    " OnInit"            0.46
    "ToFit"              0.45
    Activation Density: 0.098%

    No Known Activations