INDEX
    Negative Logits
    Token           Logit
    " Parm"         -0.06
    " cola"         -0.06
    "xml"           -0.06
    "ptune"         -0.06
    " plaza"        -0.06
    "'an"           -0.06
    " MG"           -0.06
                    -0.06
    " lx"           -0.06
    "fire"          -0.06

    Positive Logits
    Token           Logit
    " avoidance"    0.09
    "Dod"           0.08
    " avoided"      0.08
    " evade"        0.07
    "’av"           0.07
    " avoiding"     0.07
                    0.07
    " خود"          0.07
    " exercise"     0.06
    " seasonal"     0.06
    Act Density 0.005%

    No Known Activations