    Explanations

    promoting positive or harmful outcomes

    Negative Logits
    Token           Logit
     implications   0.51
     Implications   0.47
     intentar       0.45
    :"+             0.42
     Impact         0.41
     determinar     0.40
    Impact          0.39
    imposed         0.39
     meghatá        0.39
    インパクト      0.39
    Positive Logits
    Token           Logit
     growth         0.68
     awareness      0.63
     crescimento    0.58
     uptake         0.57
     healthy        0.57
     innovation     0.55
     creativity     0.55
     wzrost         0.55
    growth          0.54
     togetherness   0.54
    Activation Density: 0.020%

    No Known Activations