Explanations
phrases or words related to legal or formal citations
references to citations within the text
Negative Logits
Token        Logit
yss          -0.76
akia         -0.76
nut          -0.73
yles         -0.72
enthal       -0.71
aido         -0.71
ntil         -0.71
olen         -0.71
nu           -0.70
independ     -0.69
Positive Logits
Token            Logit
citation         1.75
citations        1.53
Citation         1.10
cite             0.92
cited            0.91
footnote         0.89
Clicker          0.88
ibli             0.80
attribution      0.77
DragonMagazine   0.75
Activations Density: 0.009%
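
As a rough illustration of what a density figure like this could mean, the sketch below assumes that activation density is the fraction of tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the toy data are assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature fires at all. The page itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic toy data (not taken from the page): the feature fires
# on 9 tokens out of 100,000.
toy = [0.0] * 100_000
for i in range(9):
    toy[i * 1000] = 1.5

density = activation_density(toy)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.009% corresponds to the feature firing on roughly 9 out of every 100,000 tokens, which fits a feature that responds narrowly to citation-related text.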