Explanations
words related to publications or releases of books, movies, or series
the presence of end-of-text markers
Negative Logits
advant      -0.80
hap         -0.73
ealous      -0.71
ghazi       -0.70
chuk        -0.69
odan        -0.66
ague        -0.63
Administ    -0.63
ertodd      -0.62
vantage     -0.62
Positive Logits
anthology       1.01
premie          0.98
premiere        0.97
paperback       0.90
isodes          0.89
theaters        0.84
installment     0.84
debut           0.81
release         0.81
collaborations  0.81
Activations Density: 0.857%