INDEX
Explanations
controversial and polarizing issues or actions involving various groups or individuals
themes related to social issues and challenges
Negative Logits
Token           Logit
okin            -0.54
IRC             -0.52
GN              -0.51
unfocusedRange  -0.51
addon           -0.49
confir          -0.48
referen         -0.48
millenn         -0.48
Indust          -0.47
Hawks           -0.46
Positive Logits
Token           Logit
.''.            0.77
.;              0.73
!.              0.72
whilst          0.71
.</             0.71
.",             0.67
lest            0.66
etc             0.66
.–              0.66
.               0.65
Activations Density: 0.956%