    Negative Logits
     Honolulu   -0.09
     zut        -0.08
                -0.08
                -0.07
     fint       -0.07
     iki        -0.07
     desirable  -0.07
     tubular    -0.07
     Vine       -0.07
    strlen      -0.07
    Positive Logits
                0.08
    aphne       0.08
     sd         0.08
     Richard    0.08
     rib        0.08
     rodents    0.08
     Slim       0.08
     уб         0.07
    (mouse      0.07
    mise        0.07
    Activation Density 0.004%

    No Known Activations