    Explanations

    references to diplomats or diplomatic titles

    Negative Logits
    Token           Logit
    "CLU"           -0.15
    "ìĬµ"           -0.15
    "ylvania"       -0.14
    "опол"          -0.14
    " ucwords"      -0.14
    "lore"          -0.14
    "erged"         -0.14
    "_MATH"         -0.14
    " Toll"         -0.13
    "ulla"          -0.13

    Positive Logits
    Token           Logit
    " embassy"      0.37
    " Embassy"      0.35
    " diplomatic"   0.30
    " ambassador"   0.29
    " diplomat"     0.29
    "Emb"           0.29
    " emb"          0.29
    " diplomats"    0.28
    " Ambassador"   0.27
    " Dipl"         0.26

    Activation Density: 0.208%

    No Known Activations