INDEX
    Explanations

    phrases related to architectural features and design

    references to the concept of architecture

    Negative Logits
    Token               Logit
    "ml"                -0.72
    "primary"           -0.70
    "rd"                -0.69
    "early"             -0.67
    " Hubbard"          -0.67
    "ca"                -0.67
    "eworthy"           -0.66
    "henko"             -0.66
    "Pos"               -0.66
    " Ventura"          -0.66

    Positive Logits
    Token               Logit
    "itect"             1.32
    " architecture"     1.16
    "urally"            1.06
    " Architecture"     1.01
    " architectures"    0.95
    "ural"              0.87
    " architect"        0.86
    " architectural"    0.85
    " chops"            0.83
    " mismatch"         0.79

    Activation Density: 0.016%

    No Known Activations