    Explanations

    instances of the word "watch" and its variations

    Negative Logits
    Token         Logit
    " Abp"        -0.83
    " للمعارف"    -0.81
    " Damian"     -0.73
    "])),"        -0.68
    " Damien"     -0.68
    " Clough"     -0.68
    "◆◆"          -0.67
    "trip"        -0.67
    "cycline"     -0.66
    "coln"        -0.66
    Positive Logits
    Token         Logit
    " watch"      1.75
    " WATCH"      1.68
    " Watch"      1.67
    " watches"    1.59
    " Watches"    1.55
    "watches"     1.52
    "watch"       1.51
    "Watch"       1.51
    " Watched"    1.50
    "WATCH"       1.48
    Activation Density: 0.051%

    No Known Activations