Explanations
references to books and their publication details
Negative Logits
modeling     -0.17
honors       -0.17
fiber        -0.17
Harbor       -0.17
favored      -0.16
fibers       -0.16
pto          -0.16
favors       -0.15
favorable    -0.15
artifact     -0.15
Positive Logits
Guardian     0.24
extract      0.24
BBC          0.23
BBC          0.22
Extract      0.22
extracts     0.21
Extract      0.21
bbc          0.21
Booker       0.21
extract      0.20
Activations Density 0.247%