INDEX
    Explanations

    references to identity and transformation concepts

    Negative Logits
    Token          Logit
    "bias"         -0.55
    "treat"        -0.53
    "Bias"         -0.51
    " quo"         -0.50
    " iprot"       -0.49
    "PLEMENT"      -0.49
    " fédéral"     -0.48
    "èvement"      -0.48
    "]})"          -0.48
    "zeciw"        -0.48
    Positive Logits
    Token               Logit
    " cloned"           0.67
    " identity"         0.66
    " imposter"         0.65
    "cloned"            0.64
    " Identität"        0.64
    " impostor"         0.62
    " Identity"         0.61
    " TextAppearance"   0.59
    " identidad"        0.58
    "Identity"          0.58
    Act Density: 0.436%

    No Known Activations