INDEX
    Explanations

    descriptions of influential historical figures and their characteristics

    Negative Logits
    stery
    -0.17
    atrice
    -0.16
    ạng
    -0.15
    esture
    -0.15
    スレ
    -0.15
     Blades
    -0.15
     Flour
    -0.14
    sert
    -0.14
    estone
    -0.14
    ');");↵
    -0.14
    Positive Logits
     figure
    0.28
     person
    0.27
    人物
    0.25
     figura
    0.23
     someone
    0.23
     guy
    0.22
     somebody
    0.22
     homme
    0.22
     man
    0.22
     uomo
    0.21
    Activation Density 0.271%

    No Known Activations