INDEX
    Explanations

    references to columns in data tables or structured data formats

    Negative Logits
    Token     Logit
    ycl       -0.16
    rim       -0.16
    kola      -0.15
    anvas     -0.15
    lemen     -0.15
    eldon     -0.15
    slash     -0.15
    sembl     -0.14
    emin      -0.14
    upt       -0.14
    Positive Logits
    Token     Logit
    ar        0.29
    arity     0.26
    ists      0.23
    aire      0.22
    aires     0.21
    wise      0.19
    ophon     0.19
    ISTS      0.18
    heads     0.17
    -wise     0.17
    Activation Density: 0.023%

    No Known Activations