INDEX
Explanations
terms related to aristocracy
Negative Logits
ITT      -0.17
itten    -0.17
dim      -0.14
537      -0.14
avian    -0.13
guar     -0.13
ITION    -0.13
*time    -0.13
icopt    -0.13
heim     -0.13
Positive Logits
ocratic  0.33
otel     0.31
ocracy   0.30
otle     0.29
ocrat    0.29
ocrats   0.29
odem     0.23
OCR      0.21
olean    0.20
ipp      0.20
Activations Density 0.008%