INDEX
    Explanations
    NEGATIVE LOGITS
    Token         Logit
    "Pad"         -0.08
    " abc"        -0.07
    "Campo"       -0.07
    " mars"       -0.06
    " oo"         -0.06
    " Mercy"      -0.06
    " armour"     -0.06
    " Temple"     -0.06
    "uma"         -0.06
    " cush"       -0.06
    POSITIVE LOGITS
    Token         Logit
    " write"      0.17
    " writing"    0.15
    " written"    0.13
    " Write"      0.13
    " writer"     0.12
    " Writing"    0.12
    "write"       0.12
    " writers"    0.12
    "Writer"      0.12
    "Write"       0.11
    Activation Density: 0.099%

    No Known Activations