INDEX
Explanations
references to a specific source or citation in a text
references to sources of information or citations
Negative Logits
uckle           -0.93
satell          -0.92
andel           -0.91
interstitial    -0.86
aeper           -0.77
oğ              -0.77
teasp           -0.76
compr           -0.75
exting          -0.74
pload           -0.74
Positive Logits
Source          1.38
Sources         1.25
source          1.07
Sources         0.95
Source          0.95
ource           0.93
Fed             0.93
Forge           0.90
books           0.89
source          0.88
Activations Density 0.003%