INDEX
    Explanations

    references to the number "two" or pairs of items

    instances of the word "two."

    Negative Logits
    "atown"        -0.84
    "renheit"      -0.79
    "Ô"            -0.77
    "ovi"          -0.75
    " Tradable"    -0.72
    "untu"         -0.72
    "fw"           -0.72
    "yz"           -0.71
    "ugu"          -0.69
    "ondon"        -0.68
    Positive Logits
    " halves"            1.18
    " sexes"             0.93
    "fold"               0.93
    " sides"             0.91
    " aforementioned"    0.86
    " Kore"              0.86
    " thirds"            0.82
    " main"              0.80
    " largest"           0.79
    " finalists"         0.79
    Act Density 0.063%

    No Known Activations