INDEX
    Explanations

    phrases and sentences that convey contrasts or nuanced statements

    Negative Logits
    "odus"        -0.15
    " nothing"    -0.15
    " hlad"       -0.14
    "leh"         -0.14
    " Nothing"    -0.14
    "emd"         -0.14
    " få"         -0.14
    "ãĥĸãĥª"      -0.13
    "é«"          -0.13
    "Nothing"     -0.13
    Positive Logits
    " nor"        0.50
    " Nor"        0.42
    "nor"         0.41
    "Nor"         0.36
    " NOR"        0.30
    " anymore"    0.22
    " sondern"    0.22
    "epad"        0.18
    " ноÑĢ"       0.18
    " Norwegian"  0.17
    Act Density 0.142%

    No Known Activations