Explanations
terms related to authority and hierarchy in religious contexts
Negative Logits
pleaſure          -0.85
itſelf            -0.85
itarianism        -0.77
ſche              -0.76
ergies            -0.75
MigrationBuilder  -0.75
purpoſe           -0.75
OGND              -0.75
ſtate             -0.74
equator           -0.73
Positive Logits
their   0.59
,       0.47
:       0.46
—       0.45
leurs   0.45
are     0.43
over    0.43
--      0.42
<bos>   0.42
Their   0.42
Activations Density: 0.102%