INDEX
Explanations
dates and author names in a specific format
the presence of attributions or citations in the text
Negative Logits
Token    Logit
ĸļ       -0.90
rons     -0.85
ifts     -0.78
enance   -0.75
ippi     -0.74
raints   -0.73
ifting   -0.73
inkle    -0.72
Beir     -0.72
orical   -0.71
Positive Logits
Token                             Logit
cffff                             1.11
|--                               0.98
··                                0.83
+---                              0.72
|                                 0.72
âĢ¢âĢ¢                            0.71
////////////////////////////////  0.70
Mah                               0.70
EntityItem                        0.69
grep                              0.68
Activations Density 0.017%