Explanations
names of authors or creators
parentheses in the text
Negative Logits
consist      -0.80
indul        -0.79
normalized   -0.78
everyday     -0.77
increment    -0.76
conformity   -0.76
entitle      -0.76
met          -0.76
treat        -0.75
equival      -0.75
Positive Logits
who          1.54
pictured     1.41
formerly     1.40
pron         1.36
whose        1.33
aka          1.31
Assistant    1.30
CEO          1.22
sic          1.22
University   1.21
Activations Density: 0.069%