INDEX
    Explanations

    conditional phrases related to decision-making or precautions in various contexts

    to be safe, clear, or explicit

    Negative Logits
     itſelf   -0.75
     myſelf   -0.66
    OGND      -0.65
     Theſe    -0.64
     Eſ       -0.63
     Efq      -0.63
     perſon   -0.63
     Anſ      -0.62
     ſta      -0.61
    ſelf      -0.60
    Positive Logits
     precaution      1.32
     precautionary   1.16
     precau          1.02
     caution         1.00
     safety          0.94
     cautious        0.93
     safest          0.93
     safer           0.93
     precautions     0.87
    safety           0.86
    Act Density 0.178%

    No Known Activations