INDEX
Explanations
references to sources or attributions within a text
instances of the phrase "according to" followed by a source or reference
Negative Logits
Token      Logit
VERT       -0.66
wid        -0.66
ield       -0.65
arse       -0.65
fax        -0.64
izont      -0.63
cffffcc    -0.62
rette      -0.62
rouse      -0.62
ertodd     -0.61
Positive Logits
Token      Logit
"[         1.21
"â̦        1.00
"'         0.99
however    0.92
"...       0.91
"(         0.87
there      0.86
'[         0.86
citing     0.82
which      0.82
Activations Density: 0.108%