INDEX
    Explanations

    words related to significant changes or shifts in thinking or perspective

    concepts related to shifts in paradigms or fundamental changes in systems

    Negative Logits
    itarian       -0.74
    é¾            -0.72
    llah          -0.72
    sports        -0.69
    Ñĭ            -0.67
    itals         -0.67
     DISTRICT     -0.67
    IENCE         -0.67
     Cosponsors   -0.66
     Murd         -0.65
    Positive Logits
    avior         0.88
    velop         0.81
    avi           0.80
    OPLE          0.74
    anger         0.71
    rums          0.68
    mann          0.68
    lers          0.68
    Í             0.67
    rop           0.67
    Activation Density: 0.115%

    No Known Activations