    Explanations

    mentions of a wide range or variety of options, features, or categories

    instances of a specific token indicating the end of a text segment

    Negative Logits
    "illard"   -0.78
    "onis"     -0.73
    "ador"     -0.73
    "ilan"     -0.72
    "ón"       -0.69
    "ICAN"     -0.68
    "nces"     -0.67
    " Dul"     -0.63
    "icks"     -0.63
    "unia"     -0.63
    Positive Logits
    " swath"        1.22
    " range"        1.22
    " ranging"      1.15
    " variety"      1.13
    " array"        1.10
    " spectrum"     1.08
    "ranging"       1.05
    "spread"        1.04
    " assortment"   1.00
    " scope"        0.97
    Activation Density: 0.032%

    No Known Activations