INDEX
    Explanations

    terms related to incorrect beliefs or understanding

    terms related to misconceptions and misinformation

    Negative Logits
    igree     -0.74
    atom      -0.72
    negie     -0.71
    ramid     -0.69
    edom      -0.67
    imen      -0.66
    itar      -0.66
    incinn    -0.66
    ropri     -0.65
    gans      -0.65
    Positive Logits
     misunderstanding    1.01
     misconceptions      1.00
     misconception       0.94
     misunderstand       0.87
     inaccur             0.85
     misinformation      0.84
     misinterpret        0.84
     incorrectly         0.81
     misunderstood       0.77
     inaccurate          0.76
    Act Density 0.044%

    No Known Activations