INDEX
    Explanations

    text describing details and characteristics, potentially related to planning or visualization

    Negative Logits
     affor          -2.07
     encomp         -2.06
     reluct         -1.99
     philanth       -1.97
     accla          -1.96
     impra          -1.96
     increa         -1.95
     depic          -1.94
     embra          -1.94
     strick         -1.93
    Positive Logits
     etc            0.69
    GraphicsUnit    0.68
     different      0.68
     techniques     0.67
     patterns       0.67
     relationships  0.64
     sizes          0.64
                    0.63
     variables      0.63
     từng           0.63
    Act Density 0.338%

    No Known Activations