INDEX
    Explanations

    phrases indicating monetary contributions or expenses

    Negative Logits
    arget     -0.17
    asi       -0.17
    ayer      -0.17
    zens      -0.15
    nop       -0.15
    顿        -0.15
    agi       -0.14
    ermann    -0.14
    ере       -0.14
    orig      -0.14
    Positive Logits
    yscale       0.18
    ednou        0.16
    ypy          0.15
    LineStyle    0.15
    itates       0.15
    elu          0.14
    atedRoute    0.14
    otland       0.14
    emas         0.14
    uien         0.14
    Activation Density: 0.411%

    No Known Activations