INDEX
    Explanations

    repeated patterns and variations of the word "for."

    Negative Logits
    nameof          -0.15
    urger           -0.15
    currentColor    -0.15
    cher            -0.15
    µľ              -0.14
    Ïĥια            -0.14
    iswa            -0.14
    allet           -0.14
     Tropical       -0.14
     Demir          -0.14

    Positive Logits
    gers            0.17
     Coleman        0.15
    geç             0.13
    jumbotron       0.13
     Dro            0.13
    è®              0.13
    venues          0.13
    anoia           0.13
    ίκ              0.13
    ÙħÙĪÙĦ          0.13
    Activation Density: 0.237%

    No Known Activations