INDEX
    Explanations

    terms relating to measurement, evaluation, and comparison of quantities or conditions

    Negative Logits
    Token               Logit
    "AxisAlignment"     -0.77
    "Datuak"            -0.65
    " uren"             -0.64
    "/*---"             -0.63
    " erfolgte"         -0.59
    " Clan"             -0.59
    "substack"          -0.58
    " onResponse"       -0.58
    " hieronder"        -0.58
    " paramInt"         -0.57
    Positive Logits
    Token               Logit
    " stuff"            0.94
    " everybody"        0.91
    "Everybody"         0.89
    " Everybody"        0.85
    "Nobody"            0.80
    " somebody"         0.79
    "everybody"         0.78
    " jeito"            0.77
    " things"           0.75
    " thing"            0.74
    Activation Density: 1.811%

    No Known Activations