Explanations
references to things that are preferred or enjoyed by an individual
references to favorite or preferred items, activities, or experiences
Negative Logits
Token     Logit
aping     -0.92
aton      -0.87
ldon      -0.80
attle     -0.80
redits    -0.79
heed      -0.79
ural      -0.76
hare      -0.76
avis      -0.76
idem      -0.75
Positive Logits
Token     Logit
Favorite  0.91
pokemon   0.89
tricks    0.83
hobbies   0.82
hobby     0.80
unsolved  0.79
favorite  0.79
haunt     0.77
moments   0.77
brands    0.77
Activations Density: 0.037%