INDEX
    Explanations

    references to people's backgrounds or experiences

    Negative Logits
    apa           -0.16
    ardi          -0.16
    oton          -0.15
    iro           -0.15
    ew            -0.15
    isk           -0.15
    ib            -0.14
    ää            -0.14
     Discovery    -0.14
    baz           -0.14
    Positive Logits
    /background   0.18
    bench         0.16
    reten         0.16
    educt         0.15
     backgrounds  0.15
    Enumerator    0.15
     background   0.15
    背景          0.14
     Background   0.14
     lad          0.14
    Act Density 0.009%

    No Known Activations