    Explanations

    references to things that are preferred or enjoyed by an individual

    references to favorite or preferred items, activities, or experiences

    Negative Logits
    aping   -0.92
    aton    -0.87
    ldon    -0.80
    attle   -0.80
    redits  -0.79
    heed    -0.79
    ural    -0.76
    hare    -0.76
    avis    -0.76
    idem    -0.75
    Positive Logits
    Favorite   0.91
     pokemon   0.89
     tricks    0.83
     hobbies   0.82
     hobby     0.80
     unsolved  0.79
     favorite  0.79
     haunt     0.77
     moments   0.77
     brands    0.77
    Act Density 0.037%

    No Known Activations