INDEX
    Explanations

    specific names, entities, or proper nouns related to individuals or organizations

    Negative Logits
    Token       Logit
    add         -0.20
    ate         -0.18
    ee          -0.18
    all         -0.18
    ase         -0.18
    ail         -0.18
    ant         -0.17
    ass         -0.17
    et          -0.17
    ado         -0.17
    Positive Logits
    Token       Logit
    zburg       0.25
    sburg       0.24
    burg        0.23
    lymp        0.22
    burgh       0.22
    ksen        0.21
    recht       0.21
    strup       0.21
    zheimer     0.21
    chwitz      0.21
    Activation Density: 0.032%

    No Known Activations