INDEX
    Explanations

    occurrences of the word "the" and variations related to a specific pattern or structure

    Negative Logits
    reau
    -0.16
    annon
    -0.16
    herence
    -0.15
    ereo
    -0.14
    era
    -0.14
    allen
    -0.14
     Kurulu
    -0.14
    elm
    -0.14
    uer
    -0.14
     Pods
    -0.14
    Positive Logits
    pool
    0.22
    POOL
    0.21
    Pool
    0.18
     yat
    0.18
     cellFor
    0.16
    kish
    0.15
    ston
    0.15
    UG
    0.15
    ptr
    0.15
    íĴį
    0.15
    Activation Density: 0.008%

    No Known Activations