    Explanations

    references to benefits or advantageous outcomes

    Negative Logits
    Token      Logit
    "adoo"     -0.17
    "adox"     -0.15
    "ivas"     -0.14
    "onga"     -0.14
    "oupper"   -0.14
    "minate"   -0.14
    "quito"    -0.14
    "rouch"    -0.14
    "quiv"     -0.14
    "имÑĥ"     -0.14
    Positive Logits
    Token       Logit
    "actors"    0.34
    "actor"     0.34
    "itted"     0.30
    "action"    0.23
    "actions"   0.23
    "icia"      0.23
    "acting"    0.22
    "eci"       0.20
    " actors"   0.20
    " Actors"   0.20
    Activation Density: 0.008%

    No Known Activations