INDEX
    Explanations

    references to specific identities or categories, particularly those related to human characteristics and societal constructs

    Negative Logits
    Token            Logit
    " Sexo"          -0.17
    "ilon"           -0.16
    " scrut"         -0.14
    ".setParent"     -0.14
    (whitespace)     -0.14
    "kili"           -0.14
    " Parent"        -0.13
    "364"            -0.13
    "woord"          -0.13
    "NullOrEmpty"    -0.13

    Positive Logits
    Token            Logit
    "-like"           0.27
    "-esque"          0.23
    "-style"          0.23
    "-wide"           0.23
    "wide"            0.22
    "-era"            0.22
    "like"            0.22
    " sized"          0.21
    "-sized"          0.20
    "-type"           0.19
    Activation Density: 0.024%

    No Known Activations