INDEX
    Explanations

    references to authority figures in discussions about governance or policy

    Negative Logits
     corners    -0.15
    amburger    -0.15
    amat        -0.15
    olumn       -0.15
    ullen       -0.15
    SSF         -0.15
     Lev        -0.14
     Corner     -0.14
    ableObject  -0.14
    omin        -0.14
    Positive Logits
     regret     0.18
    illance     0.17
     further    0.16
    å¹¹         0.16
    èį          0.16
    imers       0.16
    \views      0.15
    ingers      0.15
     hoped      0.15
     fur        0.15
    Act Density 0.059%

    No Known Activations