Explanations
references to literature and notable literary works and figures
Negative Logits
<bos>      -0.55
ון         -0.46
祖         -0.42
?          -0.41
ówki       -0.40
穴         -0.40
/***/      -0.40
szak       -0.40
sikan      -0.39
           -0.39
Positive Logits
novels     1.26
fiction    1.11
novelist   1.08
literary   1.07
novelists  1.06
novel      1.02
novel      1.01
poet       1.00
poets      0.99
Novels     0.99
Activations Density: 0.318%