INDEX
Explanations
sources or references in a text
phrases that indicate the presence of unspecified sources or references in statements
Negative Logits
ĸļ         -0.87
osate      -0.80
aeper      -0.71
eer        -0.67
arrow      -0.67
eers       -0.66
emale      -0.65
ecake      -0.64
pend       -0.64
avorite    -0.64
Positive Logits
sources       0.98
informants    0.86
Sources       0.80
source        0.78
source        0.75
consulted     0.73
briefed       0.73
informant     0.72
alyst         0.72
priv          0.71
Activations Density 0.017%
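
As a rough illustration of how a density figure like the one above can be read, the sketch below treats activation density as the fraction of tokens on which the feature's activation is above zero. This definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the dashboard itself does not state how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration only; the source page does not
    say how its density value is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 17 of 100,000 tokens.
sample = [0.0] * 99_983 + [1.5] * 17
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.017%
```

Under this reading, a density of 0.017% means the feature is active on roughly 17 of every 100,000 tokens, which fits a feature that responds to a narrow set of words such as "sources" and "informants".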