INDEX
    Explanations

    changing paradigms and perceptions

    Negative Logits
        Token            Value
        correctamente    0.46
        correctly        0.41
        优点             0.40
        properly         0.40
        правильно        0.39
        correctement     0.38
        высокой          0.37
        poseb            0.36
        buenas           0.36
        improvements     0.36
    Positive Logits
        Token            Value
        perceptions      0.86
        priorities       0.84
        paradigms        0.79
        perception       0.79
        attitudes        0.77
        behavior         0.73
        paradigm         0.71
        paradigma        0.71
        approaches       0.70
        восприя          0.70
    Activation Density: 0.065%

    No Known Activations