INDEX
Explanations
the word "liberty"
references to the concept of "liberty."
references to the concept of liberty
Negative Logits
HEAD        -0.75
gaard       -0.73
PAR         -0.71
Production  -0.67
PM          -0.66
ergy        -0.65
AMS         -0.65
Electric    -0.59
Bi          -0.59
outed       -0.59
Positive Logits
liberties   1.05
liberty     1.02
Liberties   0.97
fulness     0.92
tarian      0.86
freedom     0.85
tarians     0.84
freedoms    0.82
acies       0.79
ously       0.78
Activations Density: 0.010%