INDEX
    Explanations

    phrases emphasizing exclusivity or singularity

    Negative Logits
    anner     -0.15
    agher     -0.14
    hn        -0.14
    HN        -0.14
    _USAGE    -0.14
    漂        -0.13
    fur       -0.13
    .KEY      -0.13
    tim       -0.13
    ael       -0.13
    Positive Logits
     only     0.17
     лишь     0.17
    alars     0.17
    /pi       0.15
    iero      0.15
    éru       0.14
    ToOne     0.14
    gett      0.14
    imus      0.14
    anko      0.14
    Act Density 0.097%

    No Known Activations