INDEX
Explanations
warnings about graphic or explicit content along with mentions of language that may not be suitable for all audiences
warnings or notifications about graphic content
Negative Logits
anew                -0.83
sonian              -0.81
externalActionCode  -0.73
contempor           -0.71
Newsletter          -0.70
Reviewer            -0.69
united              -0.66
DragonMagazine      -0.66
Gutenberg           -0.66
reinvest            -0.65
Positive Logits
dangers             1.01
danger              0.99
beware              0.97
dangerous           0.93
danger              0.92
abuses              0.87
foul                0.87
scares              0.84
injure              0.84
unsafe              0.84
Activations Density: 0.635%