    Explanations

    terms related to critique or criticism

    Negative Logits
    ha        -0.19
    idity     -0.17
    ience     -0.15
    HA        -0.15
    erule     -0.15
    ths       -0.15
    iyah      -0.15
    lds       -0.14
    esco      -0.14
    hurst     -0.14

    Positive Logits
    icism     0.29
    ters      0.29
    ically    0.26
    ter       0.24
    éri       0.21
    ics       0.21
    iques     0.18
    icial     0.18
    izens     0.18
    elpers    0.18
    Activation Density: 0.007%

    No Known Activations