INDEX
Explanations
references to authors and their works, particularly those involved in book publishing
Negative Logits
Token        Logit
905          -0.15
etat         -0.14
actor        -0.14
ESP          -0.14
partment     -0.14
storefront   -0.14
aqu          -0.14
asher        -0.14
apt          -0.14
errat        -0.13
Positive Logits
Token        Logit
novel        0.21
author       0.17
onBind       0.16
autor        0.15
Author       0.15
dab          0.14
itudes       0.14
ycastle      0.14
Writes       0.14
ado          0.14
Activations Density: 0.208%