    Explanations

    actions and requests related to teaching and sharing experiences

    Negative Logits
    ologne      -0.13
    osemite     -0.13
    -many       -0.13
    luet        -0.13
    uars        -0.12
    yleft       -0.12
    ucker       -0.12
    okud        -0.12
    -Am         -0.12
    .Dev        -0.12
    Positive Logits
     some       0.84
    some        0.69
     Some       0.64
    Some        0.59
    一些        0.58
    .some       0.56
     SOME       0.54
    _some       0.53
     qualche    0.48
     einige     0.47
    Activation Density: 0.551%

    No Known Activations