Explanations
names or identifiers mentioned in text
instances of identification or naming of individuals
Negative Logits
Token               Logit
hibition            -0.73
guiActiveUnfocused  -0.70
ocratic             -0.67
course              -0.66
ocracy              -0.64
akings              -0.62
ocrats              -0.62
rouse               -0.61
brim                -0.59
idious              -0.58
Positive Logits
Token        Logit
pseudonym    0.88
as           0.85
surn         0.83
by           0.83
onyms        0.81
anonymously  0.80
locally      0.77
onym         0.77
only         0.75
unnamed      0.75
Activations Density: 0.080%