INDEX
    Explanations

    references to console logging functions in code
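
    As a rough illustration of the kind of code this explanation describes, the sketch below calls the standard JavaScript console methods whose names match several of the top positive-logit tokens (.log, .dir, .warn, .table, .info, .group, .assert). It is an assumed example, not taken from the feature's activations; the messages and the `available` helper are hypothetical, and tokens such as _vlog and _log are left out because they are not console methods.

```typescript
// Illustrative only: standard console methods whose names match
// several of the top positive-logit tokens listed on this page.
const consoleMethods = ["log", "dir", "warn", "table", "info", "group", "assert"] as const;

// Hypothetical helper value: the subset of names that are callable on the global console.
const available: string[] = consoleMethods.filter(
  (name) => typeof console[name] === "function"
);

console.log("starting job");            // general-purpose message
console.info("config loaded");          // informational message
console.warn("cache miss");             // warning
console.dir({ retries: 3 });            // inspect an object's properties
console.table([{ id: 1, ok: true }]);   // tabular view of rows
console.assert(available.length > 0, "no console methods found"); // logs only when the condition is false
console.group("details");               // begin an indented group
console.log("nested line");
console.groupEnd();                     // end the group
```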

    Negative Logits
    Token     Logit
    "icom"    -0.16
    " ruk"    -0.16
    "hall"    -0.16
    "rc"      -0.15
    "et"      -0.15
    "otre"    -0.15
    "etak"    -0.15
    "etre"    -0.15
    "ady"     -0.14
    "etro"    -0.14
    Positive Logits
    Token      Logit
    ".log"     0.34
    ".dir"     0.23
    ".warn"    0.23
    ".table"   0.22
    ".info"    0.22
    ".group"   0.21
    "_vlog"    0.20
    "_log"     0.20
    ".assert"  0.19
    " log"     0.19
    Act Density 0.004%

    No Known Activations