INDEX
    Explanations
    Negative Logits
     Plane         -0.75
     Plate         -0.72
     Law           -0.71
     Line          -0.71
     Leader        -0.69
     Generator     -0.69
     Frame         -0.68
     House         -0.68
     Generation    -0.68
     Stock         -0.68
    Positive Logits
     engel         0.25
     arab          0.24
     dak           0.24
     russell       0.24
    aimana         0.24
     blanc         0.23
     jude          0.23
     lav           0.23
     mexicana      0.23
     romano        0.23
    Act Density 0.002%

    No Known Activations