INDEX
    Explanations

    mentions or variations of the word "dos" in different contexts

    references to dosages and the term "dos."

    Negative Logits
    Token        Logit
    "raud"       -0.78
    "ISM"        -0.78
    "Reviewer"   -0.72
    "RY"         -0.71
    "Loading"    -0.68
    "WB"         -0.66
    "pter"       -0.64
    " Demon"     -0.62
    "ORE"        -0.62
    "Robin"      -0.62

    Positive Logits
    Token        Logit
    "omething"    1.36
    " dos"        1.28
    " Dos"        1.11
    " Santos"     1.10
    "dos"         1.06
    "hiba"        0.96
    "wana"        0.88
    "ques"        0.82
    "ctr"         0.81
    " lapt"       0.80

    Act Density 0.005%

    No Known Activations