INDEX
    Explanations

    terms related to belief systems or ideologies, particularly those ending in 'ist'

    Negative Logits
    er          -0.27
    ity         -0.20
    ed          -0.20
    erli        -0.18
    ITY         -0.16
    anine       -0.16
    thesis      -0.15
    chw         -0.15
    uliar       -0.15
    jian        -0.15
    Positive Logits
    ically      0.26
    (ic         0.26
    ische       0.21
    ycz         0.21
    otle        0.20
    ical        0.18
    ICAL        0.18
    tir         0.18
    -leaning    0.18
    endencies   0.17
    Activation Density: 0.047%

    No Known Activations