INDEX
    Explanations

    references to socio-economic class, particularly the middle class

    Negative Logits
    Token             Logit
    " Réponses"       -0.84
    " pitié"          -0.80
    "ècie"            -0.80
    ":✨"              -0.80
    " françaises"     -0.79
    "xFFFFFFFF"       -0.75
    " torchvision"    -0.75
    "NamedQueries"    -0.74
    " poussière"      -0.73
    "aksikan"         -0.72
    Positive Logits
    Token             Logit
    " Middle"         2.04
    " MIDDLE"         1.95
    "Middle"          1.95
    " middle"         1.92
    "middle"          1.83
    "MIDDLE"          1.78
    " Middel"         1.59
    " Middles"        1.44
    "middlewares"     1.17
    "Middleware"      1.16
    Activation Density: 0.046%

    No Known Activations