INDEX
    Explanations

    words related to philosophical and ideological concepts

    Negative Logits
    ses       -0.19
    shan      -0.18
    inator    -0.18
    اء        -0.18
    s         -0.18
    sel       -0.17
    lett      -0.17
    flake     -0.17
    sed       -0.17
    iness     -0.17
    Positive Logits
    apolis     0.29
    ity        0.25
    ism        0.22
    ization    0.20
    ized       0.19
    alysis     0.19
    cy         0.19
    zelf       0.19
    ismus      0.19
    opsis      0.19
    Activation Density: 0.070%

    No Known Activations