INDEX
    Explanations

    discussions related to social and educational issues

    Negative Logits
    viso      -0.17
    ISCO      -0.16
    lashes    -0.15
     anders   -0.15
    uela      -0.15
    ERY       -0.15
    inyin     -0.14
    okol      -0.14
    BOOLE     -0.14
    nal       -0.14
    Positive Logits
    alone     0.15
    etik      0.15
    ubs       0.15
    kaar      0.14
    tu        0.14
    420       0.14
    rolls     0.14
    berger    0.14
    elo       0.13
     Lik      0.13
    Act Density 0.015%

    No Known Activations