Explanations
phrases or words in quotes
instances of quoted phrases or dialogue
Negative Logits
rall          -0.77
Archdemon     -0.75
matter        -0.73
Tribune       -0.73
Nieto         -0.72
outfielder    -0.71
Catal         -0.71
affiliate     -0.70
editor        -0.69
Hoy           -0.69
Positive Logits
normal        1.36
official      1.29
pure          1.28
safe          1.26
cheat         1.26
classic       1.24
false         1.24
sufficient    1.24
too           1.23
traditional   1.23
Activations Density: 0.137%