Explanations
references to freedoms and rights, particularly related to religion and expression
Negative Logits
lasses      -0.16
trail       -0.16
ÄįÃŃ        -0.16
raith       -0.14
}elseif     -0.14
avia        -0.14
ustum       -0.14
stdcall     -0.14
ivia        -0.14
icrosoft    -0.14
Positive Logits
expression     0.32
speech         0.24
Expression     0.24
expression     0.24
religion       0.23
freely         0.22
peaceful       0.21
Expression     0.21
freedom        0.21
-expression    0.21
Activations Density: 0.031%