INDEX
    Explanations

    terms related to fundamental concepts or principles

    Negative Logits
    Token       Logit
    "bie"       -0.15
    " Morg"     -0.14
    "spiel"     -0.14
    "Ãły"       -0.14
    "bac"       -0.14
    "icans"     -0.14
    "ött"       -0.14
    "edom"      -0.14
    "ican"      -0.14
    "irm"       -0.14
    Positive Logits
    Token           Logit
    "mente"         0.20
    " flaw"         0.17
    " importance"   0.17
    " shift"        0.17
    "ist"           0.16
    " differences"  0.16
    "ists"          0.16
    " difference"   0.16
    "ism"           0.16
    "arily"         0.15
    Activation Density: 0.028%

    No Known Activations