INDEX
Explanations
mentions of the word "novel"
Negative Logits
xon           -0.74
henko         -0.67
zona          -0.64
olid          -0.63
adow          -0.62
tics          -0.61
poke          -0.61
older         -0.61
Downloadha    -0.59
++++          -0.59
Positive Logits
ties          1.12
izations      1.05
isations      1.04
isation       0.90
manuscript    0.87
adaptation    0.81
ization       0.81
culosis       0.81
acters        0.80
uscript       0.78
Activations Density: 0.013%