    Negative Logits
    Token          Logit
    "It"           -0.08
    "it"           -0.07
    " It"          -0.07
    "Permissions"  -0.07
    "Hum"          -0.07
    "belum"        -0.07
    " Enemies"     -0.07
    "_HELP"        -0.07
    "IT"           -0.07
    "imulation"    -0.07
    Positive Logits
    Token        Logit
    " square"    0.14
    " Square"    0.14
    " squared"   0.13
    "square"     0.12
    " squares"   0.12
    "Square"     0.11
    "-square"    0.11
    "sq"         0.09
    "(square"    0.09
    " SQUARE"    0.09
    Activation Density: 0.011%

    No Known Activations