INDEX
    Explanations

    phrases related to commands or instructions given by individuals

    commands or phrases related to urgent situations and safety

    Negative Logits
    20439              -0.79
     quir              -0.78
    etheless           -0.71
     magnification     -0.70
    mittedly           -0.69
     anecd             -0.68
    imilar             -0.67
    everal             -0.66
     Flavoring         -0.66
    ynchron            -0.64
    Positive Logits
    !".      1.71
    !",      1.61
    !"       1.56
    !'"      1.46
    !!"      1.42
    !'       1.40
    !!       1.20
    !        1.18
    !!!!     1.18
    ?!"      1.16
    Act Density 0.501%

    No Known Activations