INDEX
Explanations
references to a specific organization related to freedom
references to "Freedom" in various contexts
Negative Logits
AMS       -0.71
aco       -0.69
ENTS      -0.69
nas       -0.68
liam      -0.67
Kard      -0.65
ept       -0.64
illac     -0.64
eor       -0.63
amac      -0.63
Positive Logits
Fighters   0.81
prisoner   0.77
roam       0.76
Freedom    0.76
freedom    0.75
Freedom    0.75
Rights     0.75
Liberties  0.74
freedom    0.73
bies       0.72
Activations Density: 0.013%