INDEX
    Explanations

    mentions of books, authors, and historical events or figures in the political context

    Negative Logits
    Token         Logit
    "avorite"     -0.81
    "sylv"        -0.72
    "medium"      -0.71
    "\-"          -0.71
    "depending"   -0.71
    "Deal"        -0.70
    "mitter"      -0.69
    "nant"        -0.69
    "oresc"       -0.68
    "umerous"     -0.68
    Positive Logits
    Token               Logit
    " Other"            0.98
    " Its"              0.90
    " Others"           0.89
    " Beyond"           0.88
    " Friends"          0.87
    " Politics"         0.87
    " Dying"            0.87
    " Mysterious"       0.84
    " Transformation"   0.83
    " Problem"          0.83
    Activation Density: 0.183%

    No Known Activations