INDEX
    Explanations

    words related to identity and cultural background

    following a form of "to be"

    Negative Logits
    Token                Logit
    "beforeEach"         -0.64
    "BagLayout"          -0.64
    "kloped"             -0.63
    " którzy"            -0.62
    "WRENCE"             -0.61
    "argout"             -0.58
    "addContainerGap"    -0.57
    "standers"           -0.56
    " חיצוניים"          -0.56
    "TintMode"           -0.56
    Positive Logits
    Token                Logit
    " obsessed"          0.69
    " wearing"           0.68
    " married"           0.67
    " trying"            0.67
    " suing"             0.63
    " doing"             0.63
    " aware"             0.62
    " famous"            0.60
    " interested"        0.59
    " part"              0.58
    Activation Density: 0.474%

    No Known Activations