INDEX
    Explanations

    mentions of protest songs and their cultural significance

    Negative Logits
     Ars
    -0.17
     Chore
    -0.17
    .toolbox
    -0.16
    dance
    -0.15
     Robotics
    -0.15
     Ukraj
    -0.15
     رÙĤ
    -0.14
    çĵľ
    -0.14
     chore
    -0.14
    èĪŀ
    -0.14
    Positive Logits
     Dylan
    0.61
     Bob
    0.48
    Bob
    0.42
     bob
    0.38
    ylan
    0.37
    bob
    0.33
     DY
    0.32
    DY
    0.30
     Dy
    0.27
     Blonde
    0.26
    Act Density 0.014%

    No Known Activations