INDEX
Explanations
specific nouns or terms related to authority and governance
Negative Logits
Token         Logit
Gap           -0.17
Gap           -0.16
afil          -0.15
nev           -0.15
gap           -0.15
resse         -0.15
nick          -0.15
aload         -0.14
.operations   -0.14
ickle         -0.14
Positive Logits
Token         Logit
hor           0.21
LETE          0.16
fault         0.16
749           0.15
acher         0.15
Brotherhood   0.15
Pra           0.14
959           0.14
orning        0.14
onnement      0.14
Activations Density: 0.098%