    Explanations

    terms related to power dynamics and social justice issues

    Negative Logits
     Atenas           -0.53
     osv              -0.49
     consultato       -0.49
    ſelf              -0.48
     Brahmin          -0.48
    」、「               -0.47
     etc              -0.46
     către            -0.45
     Schuyler         -0.45
     såsom            -0.45
    Positive Logits
     CreateTagHelper  0.94
    GEBURTSDATUM      0.81
    észetes           0.69
    ContentAsync      0.68
    يديا              0.66
    SBATCH            0.66
     дописавши        0.64
     виправивши       0.63
    ConstraintMaker   0.63
    Autoritní         0.63
    Activation Density: 0.239%

    No Known Activations