INDEX
Explanations
references to political and social topics such as ISIS, West, religion, China, Russia, law enforcement, government, and other related terms
references to geopolitical events, entities, and concepts
Negative Logits
;;;;        -0.59
srf         -0.53
senal       -0.51
foundland   -0.50
jri         -0.49
ãĤ´ãĥ³      -0.49
ULL         -0.48
Leilan      -0.48
Azerb       -0.47
ilar        -0.47
Positive Logits
cannot      0.65
should      0.65
could       0.63
had         0.60
can         0.60
will        0.59
has         0.58
hadn        0.58
forgot      0.57
might       0.57
Activations Density 0.901%