INDEX
Explanations
references to the term "Trotsky" or variations thereof
references to military or law enforcement personnel and entities
Negative Logits
ually             -0.69
UAL               -0.69
Reloaded          -0.67
ãĤ¦ãĤ¹            -0.62
alam              -0.62
ividual           -0.61
++++++++++++++++  -0.60
arians            -0.58
arten             -0.58
ilated            -0.57
Positive Logits
dden              1.17
opers             1.15
phies             1.11
ppo               1.09
opa               0.98
pez               0.96
plets             0.93
phy               0.93
tted              0.92
jan               0.91
Activations Density 0.030%