INDEX
    Explanations

    instances of individuals, actions, and events related to controversial or newsworthy topics

    Negative Logits
    !.
    -0.61
    }.
    -0.58
    ();
    -0.55
    !,
    -0.54
    !).
    -0.51
     bask
    -0.51
    lance
    -0.51
    !'
    -0.51
    )!
    -0.50
    };
    -0.49
    Positive Logits
     inappropriately
    0.62
     unfairly
    0.60
     "'
    0.60
     "…
    0.57
     improperly
    0.57
     "
    0.57
     misunderstood
    0.53
     inappropriate
    0.53
     unlawfully
    0.53
     improper
    0.53
    Act Density 7.387%

    No Known Activations