    Explanations

    phrases indicating blame or responsibility

    statements assigning blame or responsibility to specific entities or individuals

    Negative Logits
    arious      -0.83
    cit         -0.82
    obi         -0.77
    dayName     -0.76
    itsu        -0.75
    alli        -0.75
    yssey       -0.71
    Dialogue    -0.71
    Boo         -0.71
    atar        -0.70
    Positive Logits
     ruining     1.43
     causing     1.30
     creating    1.29
     ensuring    1.22
     inciting    1.20
     provoking   1.19
     spreading   1.19
     destroying  1.19
     initiating  1.19
     bringing    1.18
    Activation Density 0.133%

    No Known Activations