INDEX
    Explanations
    Negative Logits
    Token             Logit
    " cubes"          -0.14
    "Cube"            -0.12
    " rectangles"     -0.12
    " Cub"            -0.12
    " triangular"     -0.11
    " Pyramid"        -0.11
    " Cube"           -0.11
    " rectangular"    -0.11
    " pyramid"        -0.11
    "berger"          -0.11
    Positive Logits
    Token             Logit
    " circle"         0.51
    " Circle"         0.43
    "circle"          0.39
    " circles"        0.38
    " circular"       0.38
    "Circle"          0.38
    "-circle"         0.34
    ".circle"         0.34
    "圆"              0.33
    " concent"        0.31
    Activation Density: 0.112%

    No Known Activations