INDEX
Explanations
links to additional content embedded within the text
prompts for further reading or continuing the content
Negative Logits
romy         -0.74
rag          -0.73
consolation  -0.72
folk         -0.70
clutch       -0.70
stewards     -0.69
condol       -0.67
picnic       -0.64
stood        -0.64
stray        -0.64
Positive Logits
Advertisement  1.06
Below          0.97
Expand         0.96
Thumbnails     0.95
Contents       0.87
isodes         0.82
Travels        0.81
Clause         0.80
Mode           0.80
Hide           0.77
Activations Density 0.030%