    Explanations

    references to Facebook and its associated features or links

    Negative Logits
    Token             Logit
    "istic"           -0.16
    "ãĥ©ãĥĥãĤ¯"       -0.15
    "exact"           -0.15
    "Ñħи"             -0.15
    " Richardson"     -0.15
    "agli"            -0.14
    "vetica"          -0.14
    "'Brien"          -0.14
    "ê¹"              -0.14
    "omor"            -0.14
    Positive Logits
    Token             Logit
    " Messenger"      0.23
    "/twitter"        0.21
    "(fb"             0.20
    "s"               0.19
    "/T"              0.19
    " messenger"      0.18
    " fb"             0.18
    ".com"            0.17
    "istan"           0.17
    " FB"             0.17
    Activation Density: 0.011%

    No Known Activations