INDEX
    Explanations

    instances of the word "text"

    Negative Logits
    Token         Logit
    "CVE"         -0.77
    " CVE"        -0.72
    "rolet"       -0.69
    "vernment"    -0.67
    "MAL"         -0.66
    "ño"          -0.66
    "negie"       -0.66
    "^^^^"        -0.63
    " Kashmir"    -0.61
    " Ern"        -0.61
    Positive Logits
    Token         Logit
    "ured"        1.40
    "area"        1.14
    "uring"       1.11
    "uality"      1.09
    "ural"        1.08
    "ures"        1.07
    "urized"      1.06
    "ual"         1.00
    "book"        0.97
    "iles"        0.96
    Act Density 0.022%

    No Known Activations