Explanations
references to writers in various contexts
references to writers
Negative Logits
ibaba     -0.81
rals      -0.74
undai     -0.74
eneg      -0.72
illon     -0.71
ypes      -0.70
umph      -0.68
inho      -0.67
ADRA      -0.67
asonic    -0.67
Positive Logits
writer    1.11
laureate  1.04
writer    0.98
writ      0.90
writers   0.87
Writer    0.87
writers   0.85
writing   0.82
fiction   0.82
uscript   0.81
Activations Density 0.018%
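
To give the density figure a more intuitive scale, the short sketch below converts it into "one activation per N tokens". It assumes that the activation density is the percentage of sampled tokens on which the feature has a nonzero activation; the page itself does not state this definition, and the function name tokens_per_activation is hypothetical, introduced only for illustration.

```python
def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens per activation.

    Assumption: density_percent is the percentage of sampled tokens
    on which the feature activates (not stated in the source page).
    """
    if density_percent <= 0:
        raise ValueError("density must be positive")
    return 100.0 / density_percent


# Density value reported above.
density_percent = 0.018
print(round(tokens_per_activation(density_percent)))
```

Under that assumption, a density of 0.018% corresponds to roughly one activation every 5,556 tokens, which would fit a sparse feature tied to a narrow topic such as writers.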