INDEX
    Explanations
    No Explanations Found
    Negative Logits
    ĪĴ        -0.72
    metic     -0.69
    igmatic   -0.68
    task      -0.66
    holes     -0.63
    ticket    -0.62
    checking  -0.62
    infeld    -0.62
    emer      -0.61
    Murray    -0.61
    Positive Logits
    ariat      0.82
    kefeller   0.75
     [+        0.73
    osa        0.73
    ulz        0.73
    ohyd       0.73
     Cells     0.71
    oodle      0.70
    ardless    0.67
    roxy       0.65
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.