INDEX
    NEGATIVE LOGITS
    "ainting"            -0.07
    "uil"                -0.06
    "INCT"               -0.06
    " depicts"           -0.06
    "_latency"           -0.06
    " tiener"            -0.06
    " cmp"               -0.06
    " rehabilitation"    -0.06
    " punch"             -0.06
    "(created"           -0.06
    POSITIVE LOGITS
    "borg"               0.07
    ".hostname"          0.07
    "dog"                0.06
    " gotten"            0.06
    "latent"             0.06
                         0.06
    "CTSTR"              0.06
    " degrees"           0.06
    "-high"              0.06
    " intermedi"         0.06
    Act Density 0.005%

    No Known Activations