INDEX
    Explanations

    terms related to architecture and architectural design

    Negative Logits
    oo        -0.19
    ager      -0.15
    airo      -0.14
    ara       -0.14
    oad       -0.14
    erator    -0.14
    114       -0.14
    621       -0.14
    hood      -0.14
    622       -0.14
    Positive Logits
    urally    0.29
    ural      0.23
    ivist     0.20
    atron     0.19
    essel     0.19
    /arch     0.19
    itect     0.18
     sư       0.17
    urve      0.17
    URAL      0.17
    Act Density 0.016%

    No Known Activations