    Negative Logits
    Token              Logit
    " displayName"     -0.08
    " Flat"            -0.07
    "/original"        -0.06
    " Tig"             -0.06
    " Gil"             -0.06
    "(Dense"           -0.06
    "aston"            -0.06
    "_AG"              -0.06
    [token missing]    -0.06
    "_iters"           -0.06
    Positive Logits
    Token              Logit
    " stepped"         0.07
    " afar"            0.06
    ".Paths"           0.06
    " stepping"        0.06
    " Pradesh"         0.06
    "chr"              0.06
    "оры"              0.06
    "aysia"            0.06
    "taxonomy"         0.06
    "cow"              0.06
    Activation Density: 0.018%

    No Known Activations