INDEX
Explanations
words indicating new or innovative concepts and discoveries
Negative Logits
Token    Logit
-        -0.93
y        -0.84
<b>      -0.79
c        -0.79
</b>     -0.76
?        -0.70
.        -0.69
"        -0.67
Today    -0.67
k        -0.67
Positive Logits
Token           Logit
NOVEL           1.12
Novel           1.11
Datuak          1.10
日閲覧          1.08
BibitemShut     1.07
^(@)            1.07
Cordialement    1.06
savevideo       1.04
novel           1.03
Cyfeiriadau     1.03
Activations Density: 0.166%