INDEX
    Explanations

    phrases related to influence or power

    mentions of influence across various contexts and entities

    Negative Logits
    Token       Logit
    "TAG"       -0.84
    "ITIES"     -0.76
    "ft"        -0.75
    "leigh"     -0.72
    "yll"       -0.70
    "Quotes"    -0.68
    "atri"      -0.68
    "Simple"    -0.68
    "TH"        -0.67
    " Dill"     -0.66
    Positive Logits
    Token             Logit
    " pedd"           1.16
    " influence"      0.99
    " cooker"         0.98
    " influencing"    0.96
    " sway"           0.92
    " exerted"        0.88
    " influences"     0.82
    " shaping"        0.82
    " influenced"     0.78
    " multiplier"     0.74
    Activation Density: 0.034%

    No Known Activations