INDEX
    Explanations

    proper nouns and names of places or organizations

    Negative Logits
    "BAD"         -0.15
    "оÑĢо"        -0.15
    " tub"        -0.14
    "ä¼ı"         -0.14
    " BAD"        -0.14
    "_BACKEND"    -0.13
    "606"         -0.13
    "æ¡"          -0.13
    "岡"          -0.13
    "zo"          -0.13
    Positive Logits
    " Ab"         0.23
    "querque"     0.20
    "-ab"         0.18
    " ab"         0.18
    " аб"         0.17
    ".Ab"         0.16
    "enant"       0.16
    " AB"         0.16
    "olutely"     0.16
    "áb"          0.15
    Activation Density 0.034%

    No Known Activations