    Explanations

    instances of descriptive phrases and their contexts

    Negative Logits
    " Pixels"     -0.17
    "lač"         -0.14
    "وات"         -0.14
    "izr"         -0.14
    "bay"         -0.13
    " Muss"       -0.13
    "탈"          -0.13
    "formance"    -0.13
    "つ"          -0.13
    "tright"      -0.13
    Positive Logits
    "orz"         0.17
    "rios"        0.17
    "oir"         0.15
    "ekim"        0.14
    "ância"       0.14
    "icontrol"    0.14
    "rones"       0.13
    " rosa"       0.13
    "erved"       0.13
    "uele"        0.13
    Activation Density 0.219%

    No Known Activations