    Explanations

    phrases that indicate confrontation or opposition

    Negative Logits
    Token               Logit
    "TransparentColor"  -0.16
    ".sul"              -0.14
    "iders"             -0.14
    "ielding"           -0.14
    "ิà¸ķร"             -0.14
    "вад"               -0.13
    "villa"             -0.13
    "gon"               -0.13
    "ĵåIJį"              -0.13
    ".answers"          -0.13
    Positive Logits
    Token               Logit
    "uhan"              0.16
    "ourt"              0.15
    "pin"               0.15
    " DÄĽ"              0.14
    "sound"             0.14
    " cáo"              0.14
    " aux"              0.14
    "lord"              0.14
    "Modifiers"         0.14
                        0.14
    Activation Density: 0.030%

    No Known Activations