INDEX
Explanations
references to famous historical and philosophical figures
references to historical figures and concepts related to aristocracy
Negative Logits
Token        Logit
à©           -0.81
ned          -0.79
aneously     -0.78
LOD          -0.77
ning         -0.73
Fraser       -0.71
lights       -0.71
FE           -0.69
McKenzie     -0.68
LA           -0.68
Positive Logits
Token        Logit
ocrats       1.46
ocrat        1.35
ocratic      1.33
Arist        1.04
ocr          0.98
anqu         0.89
theoret      0.87
welf         0.87
iosyn        0.86
ocracy       0.84
Activations Density 0.018%