INDEX
    Explanations

    references to social issues and political topics

    Negative Logits
     mathemat      -0.76
     sacrific      -0.72
     fortun        -0.68
     loopholes     -0.66
     scattering    -0.64
     elig          -0.64
     myster        -0.64
     elim          -0.62
     jog           -0.61
     SERV          -0.61
    Positive Logits
    ï¸ı       1.39
    ski       0.89
    mental    0.86
    tracks    0.86
    s         0.84
    sure      0.82
    ttle      0.81
    esc       0.81
    ship      0.80
    ï¸        0.80
    Activation Density 0.772%

    No Known Activations