INDEX
    Negative Logits
    Token           Logit
    " Seen"         -0.07
    " Knee"         -0.07
    " cubes"        -0.07
    " knee"         -0.07
    " collision"    -0.06
    " Eyes"         -0.06
    " seen"         -0.06
    " eye"          -0.06
    "_survey"       -0.06
    " oxy"          -0.06

    Positive Logits
    Token           Logit
    " format"        0.17
    " Format"        0.15
    "Format"         0.15
    "format"         0.14
    " formats"       0.13
    "FORMAT"         0.12
    "_format"        0.11
    " FORMAT"        0.11
    "Formats"        0.11
    " Formats"       0.10

    Act Density 0.029%

    No Known Activations