Explanations
words related to preferences or things that are liked or favored
words that indicate popularity or preference
Negative Logits
ural              -0.85
inas              -0.78
ijk               -0.77
ufact             -0.76
TPPStreamerBot    -0.76
amping            -0.73
arty              -0.72
heed              -0.72
abad              -0.69
acial             -0.68
Positive Logits
favorites         0.94
favorite          0.91
Favorite          0.84
haunt             0.81
Favor             0.81
Favorite          0.79
é¾įå              0.77
favourites        0.77
é¾įå¥ij士          0.76
è¦                0.74
Activations Density 0.013%
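
The page does not define Activations Density. A minimal sketch, assuming it means the fraction of token positions on which the feature's activation is above zero; the function name, the threshold, and the sample data are hypothetical and chosen only to reproduce a value of the same size as the one reported:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 13 of 100,000 token positions.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.013%
```

Under this reading, a density of 0.013% means the feature is active on roughly 13 of every 100,000 tokens, which fits a narrow feature that responds mainly to "favorite"-type words.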