INDEX
    Negative Logits
    Logit   Token
    -0.07   Entropy
    -0.06   opacity
    -0.06   fabs
    -0.06   -shared
    -0.06   Marcus
    -0.06   Card
    -0.06   Jane
    -0.05
    -0.05   /stream
    -0.05    Curso
    Positive Logits
    Logit   Token
    0.07     οικο
    0.07     national
    0.07    жно
    0.06    */↵
    0.06    생활
    0.06    object
    0.06     {{↵
    0.06    ?↵
    0.06     who
    0.06    {:
    Activation Density: 0.005%

    No Known Activations