INDEX
    Explanations

    the word "To" in various contexts and its frequency of use

    Negative Logits
    Token                 Logit
    " intrusive"          -0.64
    " dece"               -0.64
    " intentions"         -0.63
    " exceptions"         -0.63
    " proposals"          -0.61
    " regards"            -0.60
    " flagged"            -0.60
    " forthcoming"        -0.60
    " vulnerabilities"    -0.58
    " signatures"         -0.58
    Positive Logits
    Token                 Logit
    "ilet"                1.71
    "pping"               1.30
    "ilers"               1.21
    "pped"                1.15
    "asted"               1.14
    "ffee"                1.10
    "ppings"              1.08
    "pper"                1.05
    "ppers"               1.05
    "asts"                1.03
    Activation Density: 0.028%

    No Known Activations