INDEX
    Explanations

    references to surveillance and privacy concerns

    Negative Logits
     oneself        -0.57
    ')['            -0.53
    yourself        -0.52
     yourself       -0.51
     myself         -0.50
    對方            -0.49
     ویکی‌پدیا       -0.49
    myself          -0.49
    เขา             -0.49
    ตัวเอง          -0.48
    Positive Logits
     me             1.03
     us             1.02
     you            0.89
     conmigo        0.84
    RegressionTest  0.79
     contigo        0.74
    CrossRef        0.72
     للمعارف        0.70
     comigo         0.69
     нас            0.65
    Activation Density: 0.393%

    No Known Activations