INDEX
    Explanations

    mentions of political figures and positions

    the word "shadow" and its contextual uses

    Negative Logits
    Token          Logit
    "urses"        -0.91
    "awaru"        -0.77
    "renheit"      -0.76
    "anchester"    -0.75
    "apsed"        -0.74
    "unker"        -0.73
    "tics"         -0.73
    "keye"         -0.71
    "OPLE"         -0.71
    "Attempts"     -0.70
    Positive Logits
    Token          Logit
    "moon"         0.98
    " shadow"      0.94
    " Shadow"      0.93
    "shadow"       0.87
    "loo"          0.83
    "Shadow"       0.80
    " Shadows"     0.75
    "boxing"       0.75
    "dust"         0.75
    "busters"      0.75
    Activation Density: 0.015%

    No Known Activations