INDEX
    Explanations

    phrases indicating the beginning or completion of a task or discussion

    phrases related to being out of the ordinary or unconventional situations

    Negative Logits
    utical   -0.69
    raud     -0.66
    cum      -0.65
    ãĥį      -0.65
    lege     -0.63
    ILA      -0.63
    ume      -0.62
    OK       -0.61
    oster    -0.61
    olo      -0.61
    Positive Logits
     equation   1.02
     gate       0.98
     frying     0.90
     closet     0.89
     loop       0.84
     box        0.84
     gates      0.83
     woods      0.81
     realm      0.80
     fray       0.77
    Act Density 0.063%

    No Known Activations