INDEX
Explanations
references to authors and their works
Negative Logits
osta    -0.71
bath    -0.69
-|      -0.68
vere    -0.66
dash    -0.64
ooth    -0.64
angs    -0.64
jobs    -0.63
eg      -0.61
ntil    -0.61
Positive Logits
author      3.82
authors     2.52
Author      2.38
author      2.24
AUTHOR      2.10
Author      2.01
Authors     1.98
writer      1.97
novelist    1.82
authors     1.69
Activations Density 0.020%