    Explanations

    mentions of something being special in some context

    the concept of "special."

    Negative Logits
    Token        Logit
    " Twain"     -0.69
    " Ri"        -0.66
    "anon"       -0.65
    " Giul"      -0.65
    " Cah"       -0.63
    "Ķ"          -0.63
    " Cao"       -0.62
    "amus"       -0.61
    "Į"          -0.60
    "·"          -0.60
    Positive Logits
    Token           Logit
    "ised"          1.15
    "ties"          1.00
    "izations"      0.96
    "isations"      0.95
    "isable"        0.90
    "ized"          0.89
    "marine"        0.84
    "ities"         0.83
    "isal"          0.81
    "atural"        0.81
    Activation Density: 0.022%

    No Known Activations