INDEX
    Explanations

    phrases related to instructions or commands

    occurrences of the word "dos" and variations related to dosage

    Negative Logits
    Token           Logit
    "ISM"           -0.96
    "Reviewer"      -0.80
    " Immunity"     -0.73
    "gypt"          -0.71
    "ICAN"          -0.70
    "ocene"         -0.70
    "INTON"         -0.69
    "raud"          -0.69
    "WB"            -0.67
    "istically"     -0.64
    Positive Logits
    Token           Logit
    "omething"      1.29
    " dos"          1.15
    " Dos"          1.03
    " Santos"       1.01
    "hiba"          0.98
    "wana"          0.81
    "dos"           0.80
    "pec"           0.80
    "ega"           0.79
    "ques"          0.79
    Act Density 0.005%
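
    As a rough illustration of what a density figure like this means, the sketch below assumes that activation density is the fraction of token positions on which the feature's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; the page itself does not say how the value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition, for illustration only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, of which 5 are active.
sample = [0.0] * 99_995 + [1.3, 0.7, 2.1, 0.4, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

    Under that assumption, a density of 0.005% corresponds to roughly 5 active positions per 100,000 tokens.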

    No Known Activations