INDEX
    Explanations

    mentions of things or actions being popular

    the concept of popularity in various contexts

    Negative Logits
    "thur"        -0.88
    " Aviv"       -0.74
    "heed"        -0.73
    "aca"         -0.70
    "ouls"        -0.69
    "ander"       -0.69
    " Kear"       -0.68
    "aul"         -0.65
    " Centauri"   -0.64
    "cule"        -0.64
    Positive Logits
    "popular"         1.20
    " popular"        1.11
    " Popular"        0.85
    "rities"          0.80
    "iatus"           0.73
    " favourite"      0.72
    "ity"             0.71
    " unpopular"      0.70
    "ised"            0.69
    " circulation"    0.68
    Activation Density: 0.016%

    No Known Activations