INDEX
    Explanations

    auxiliary verbs and negations

    Negative Logits
     hooks      0.40
     Esses      0.37
     Hooks      0.35
     Applies    0.34
     pins       0.34
     Assists    0.34
     Assault    0.33
     S          0.33
     XIV        0.33
    为主        0.33
    Positive Logits
     hacerlo    0.50
     według     0.44
     ঝুঁক       0.42
    确实        0.41
                0.40
     indeed     0.39
    aurants     0.39
     tijdens    0.39
     pretože    0.38
     farlo      0.38
    Activation Density: 0.012%

    No Known Activations