    Explanations

    actions related to monitoring or verifying something

    instances of the word "check" in various contexts

    Negative Logits
    Token (in single quotes)    Logit
    'ufact'                     -0.74
    'theless'                   -0.73
    'ãĤ´ãĥ³'                    -0.71
    'usable'                    -0.69
    'é¾įå'                      -0.68
    '\":'                       -0.68
    'ña'                        -0.68
    'ĸļ'                        -0.67
    'Sin'                       -0.66
    'ãĥĢ'                       -0.66
    Positive Logits
    Token (in single quotes)    Logit
    ' whether'                  1.04
    'lists'                     0.96
    'boxes'                     0.96
    'mate'                      0.96
    ' IDs'                      0.95
    ' out'                      0.93
    ' if'                       0.82
    ' boxes'                    0.77
    ' Yelp'                     0.77
    ' into'                     0.76
    Activation Density: 0.028%

    No Known Activations