    Explanations

    references to social media interactions, specifically Twitter and Instagram handles

    Negative Logits
    bewerken
    -0.88
     ―――――
    -0.88
    abestanden
    -0.88
    amaño
    -0.82
     Houſe
    -0.81
    ^(@)
    -0.79
    انيف
    -0.79
     photolibrary
    -0.76
     Anſ
    -0.75
    ValueStyle
    -0.75
    Positive Logits
     @
    0.88
     (@
    0.63
    @
    0.58
    /@
    0.52
    "@
    0.52
    '@
    0.51
     '@
    0.49
     "@
    0.48
    ,@
    0.48
    ("@
    0.46
    Activation Density: 0.196%

    No Known Activations