INDEX
    Explanations

    references to academic roles, research funding, and scholarships

    Negative Logits
    " fiber"         -0.18
    " labor"         -0.17
    " behavioral"    -0.16
    " honored"       -0.16
    " theater"       -0.16
    " analyzed"      -0.16
    " neighbors"     -0.15
    " honors"        -0.15
    " counseling"    -0.15
    " fibers"        -0.15

    Positive Logits
    " EPS"            0.27
    "EPS"             0.25
    " Outputs"        0.23
    " UK"             0.21
    "UK"              0.21
    " Norwich"        0.20
    "outputs"         0.20
    " outputs"        0.19
    " Imperial"       0.19
    "Outputs"         0.19

    Act Density 0.117%

    No Known Activations