INDEX
    Explanations

    phrases mentioning the potential for negative consequences or harmful impacts

    occurrences of the word "cause" and its variations, indicating potential consequences or effects

    Negative Logits
    stra
    -0.76
    atu
    -0.75
    ian
    -0.72
    zai
    -0.69
    zb
    -0.68
     Rated
    -0.68
    iazep
    -0.67
    ta
    -0.66
    ature
    -0.66
    sonian
    -0.66
    Positive Logits
     havoc
    1.30
     headaches
    1.03
     mayhem
    0.97
     trouble
    0.97
     confusion
    0.96
     unnecessary
    0.93
     outbreaks
    0.90
     friction
    0.90
     panic
    0.87
    ティ
    0.87
    Act Density 0.052%

    No Known Activations