INDEX
    Explanations

    words related to criticism and public opinion

    Negative Logits
    akin        -0.15
    .gov        -0.15
    anyahu      -0.14
    ervas       -0.14
    aris        -0.14
    ategy       -0.14
    ekim        -0.14
    ourcem      -0.13
    orem        -0.13
    issen       -0.13
    Positive Logits
     critics     0.28
     opponents   0.27
     detr        0.27
     oppon       0.26
     academics   0.25
     experts     0.25
     some        0.24
     prominent   0.24
     groups      0.24
     advocacy    0.24
    Activation Density 0.639%

    No Known Activations