INDEX
    Explanations

    terms related to attraction or desirability

    references to the concept of "appeal."

    Negative Logits
    Token         Logit
    "ifa"         -0.77
    "Ñĥ"          -0.77
    " Rost"       -0.70
    " Colleges"   -0.69
    "fters"       -0.68
    " Coh"        -0.67
    " Berk"       -0.67
    "kson"        -0.65
    "apy"         -0.65
    "FT"          -0.64
    Positive Logits
    Token         Logit
    " Flavoring"   1.08
    "yrinth"       1.01
    "ingly"        0.89
    "ocene"        0.87
    " appeal"      0.79
    "minist"       0.79
    "ikawa"        0.78
    "ĸļ"           0.78
    "ously"        0.77
    "atism"        0.77
    Activation Density: 0.015%

    No Known Activations