    Explanations

    expressions of anticipation and commitment to watching television shows

    Negative Logits
    Token          Logit
    "Äĥm"          -0.15
    " Verde"       -0.14
    "JI"           -0.14
    ".AF"          -0.14
    "ifax"         -0.14
    " Hari"        -0.14
    "aukee"        -0.13
    "еле"          -0.13
    "linger"       -0.13
    " Weinstein"   -0.13
    Positive Logits
    Token          Logit
    " watch"       0.88
    "watch"        0.79
    " Watch"       0.76
    " watching"    0.73
    "Watch"        0.72
    "-watch"       0.72
    " watches"     0.71
    " watched"     0.69
    " WATCH"       0.69
    ".watch"       0.68
    Activation Density: 0.283%

    No Known Activations