    Explanations

    references to social class and political critique

    Negative Logits
    ".fx"           -0.16
    " Truy"         -0.14
    " cigaret"      -0.14
    " Crazy"        -0.14
    "つぶ"          -0.14
    "ichtig"        -0.14
    "meli"          -0.14
    "Ghost"         -0.13
    " Ghost"        -0.13
    "ppo"           -0.13

    Positive Logits
    "uten"          0.16
    " rotten"       0.15
    "bour"          0.15
    "ap"            0.14
    "arkin"         0.14
    " subjective"   0.13
    " bourgeois"    0.13
    "ё"             0.13
    "etc"           0.13
    "ancy"          0.13

    Act Density 0.011%

    No Known Activations