Explanations
occurrences of the word "author" and its variants
Negative Logits
ίων       -0.17
-eyed     -0.15
ary       -0.15
eyes      -0.15
chant     -0.15
berra     -0.15
legates   -0.15
elyn      -0.15
756       -0.14
符        -0.14
Positive Logits
itative   0.27
itarian   0.23
ship      0.22
ised      0.18
entic     0.18
ing       0.17
итет      0.16
icity     0.16
itive     0.15
izes      0.15
Activations Density 0.031%