INDEX
    Explanations

    phrases related to inclusivity or the incorporation of various elements

    references to various categories or examples within a text

    Negative Logits
    "mosp"          -0.60
    "elling"        -0.59
    " Cummings"     -0.59
    " Lung"         -0.58
    " Oaks"         -0.56
    " Oakland"      -0.56
    "icultural"     -0.55
    "raq"           -0.55
    "Preview"       -0.54
    " Vaughan"      -0.54
    Positive Logits
    "itiz"          0.76
    " guiActiveUn"  0.74
    "iton"          0.71
    "hots"          0.70
    "available"     0.67
    "fman"          0.65
    "ser"           0.65
    " BF"           0.64
    "atta"          0.64
    "gradient"      0.63
    Activation Density: 0.150%

    No Known Activations