INDEX
Explanations
phrases related to online sharing or publication
phrases indicating inclusion or references to external sources
Negative Logits
tis          -0.90
ecast        -0.84
meric        -0.80
chool        -0.78
uel          -0.75
riger        -0.74
Flavoring    -0.70
ertain       -0.70
esters       -0.70
dom          -0.69
Positive Logits
references   1.24
quotes       1.22
phrases      1.17
descriptions 1.13
excerpts     1.13
slogans      1.10
statements   1.09
wording      1.08
disclaimer   1.07
disclaim     1.07
Activations Density: 0.375%