    Negative Logits
    Token           Logit
    "ibre"          -0.07
    ".DATA"         -0.07
    "afd"           -0.06
    " eclipse"      -0.06
    " dominance"    -0.06
    " iktidar"      -0.06
    "Collapse"      -0.06
    "ive"           -0.06
    "Settings"      -0.06
    " Used"         -0.06
    Positive Logits
    Token           Logit
    ".cols"         0.06
    " giành"        0.06
    ".metrics"      0.06
    ".describe"     0.06
    "ancies"        0.06
    "flake"         0.06
    " Rutgers"      0.06
    " brun"         0.06
                    0.05
    ".Yellow"       0.05
    Act Density 0.003%

    No Known Activations