INDEX
Explanations
specific references or citations in a text
references to citations or legal references in the text
Negative Logits
alth     -0.75
Dim      -0.73
creen    -0.70
aky      -0.70
opian    -0.69
chats    -0.69
eks      -0.67
Kit      -0.67
fren     -0.65
aiden    -0.65
Positive Logits
citation       4.36
citations      3.27
Citation       3.03
cite           1.80
attribution    1.65
quotation      1.41
quotations     1.40
cited          1.29
Attribution    1.20
citing         1.17
Activations Density: 0.030%