    Explanations

    occurrences of the word "on"

    Negative Logits
    Token          Logit
    "enance"       -0.65
    "aneers"       -0.64
    "BILITIES"     -0.63
    " macros"      -0.62
    " amend"       -0.62
    " minors"      -0.61
    " diluted"     -0.60
    "oother"       -0.60
    "BIP"          -0.60
    "ometimes"     -0.59

    Positive Logits
    Token          Logit
    "nen"          1.34
    "etheless"     1.10
    "stru"         1.01
    "nect"         1.01
    "stant"        0.98
    "etic"         0.95
    "cé"           0.93
    "nie"          0.91
    "osuke"        0.88
    "ni"           0.87

    Activation Density: 0.022%

    No Known Activations