INDEX
    Explanations

    words related to authority, control, and power

    phrases related to conflict, danger, and political issues

    NEGATIVE LOGITS
    Token            Logit
    'ggles'          -0.66
    ' partName'      -0.60
    'ratulations'    -0.57
    'pires'          -0.57
    ')\'             -0.55
    ')!'             -0.54
    'guyen'          -0.54
    '>)'             -0.53
    ' yours'         -0.51
    'bernatorial'    -0.51
    POSITIVE LOGITS
    Token            Logit
    ' because'       1.11
    'because'        0.96
    ' "...'          0.88
    ' owing'         0.87
    ' ".'            0.86
    ' whereas'       0.86
    '.'              0.84
    ' despite'       0.83
    ' but'           0.83
    ' "…'            0.83
    Activation Density: 1.120%

    No Known Activations