INDEX
Explanations
mentions or variations of the word "paper."
references to scholarly papers and articles
Negative Logits
dism         -0.66
disbelief    -0.64
Wow          -0.63
status       -0.63
condition    -0.61
latitude     -0.60
conditional  -0.60
exclus       -0.60
license      -0.60
confirmed    -0.58
Positive Logits
apers  4.55
aper   4.20
aping  1.78
aped   1.61
apes   1.51
ape    1.46
apest  1.34
acco   1.25
apor   1.24
aps    1.17
Activations Density 0.005%