    Explanations

    words related to popularity in various contexts

    Negative Logits
    Token        Logit
    "uran"       -0.74
    "erm"        -0.71
    " Dull"      -0.69
    " Shell"     -0.68
    "INAL"       -0.68
    "endez"      -0.67
    "thur"       -0.66
    "inis"       -0.64
    "ibur"       -0.64
    " intest"    -0.63
    Positive Logits
    Token        Logit
    "ability"    0.90
    "ately"      0.88
    "ously"      0.87
    "Reviewer"   0.83
    "rise"       0.78
    "itism"      0.76
    "acy"        0.75
    " ratings"   0.75
    "iqueness"   0.73
    "itious"     0.73
    Activation Density: 0.014%

    No Known Activations