INDEX
    Explanations

    references to significant cultural or artistic events

    Negative Logits
    imd       -0.17
    ätz       -0.15
    ippet     -0.15
    nip       -0.15
    326       -0.15
    wig       -0.15
    HING      -0.14
    553       -0.14
    heimer    -0.14
    _dot      -0.14
    Positive Logits
    abela      0.16
    ppo        0.16
     Trot      0.14
    eda        0.14
    ÏĦÏī       0.14
    uler       0.14
    $body      0.14
    uzzi       0.13
    ÑĩаÑģ      0.13
    313        0.13
    Activation Density: 0.030%

    No Known Activations