INDEX
Explanations
references to different types of official documents or content citations
instances of content in brackets or brackets themselves, indicating cited content or references
Negative Logits
hog           -0.81
seams         -0.74
lemon         -0.72
consumption   -0.69
uncomp        -0.68
clash         -0.67
uncertain     -0.67
intensive     -0.66
emit          -0.66
imperson      -0.65
Positive Logits
Pg             1.35
…]             1.32
...]           1.27
sic            1.06
interstitial   1.05
src            1.04
paragraph      1.02
nb             1.00
Footnote       0.99
?]             0.93
Activations Density 0.017%