INDEX
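    NEGATIVE LOGITS
    (token not rendered)    -1.26
    " what"                 -1.05
    (token not rendered)    -1.04
    (token not rendered)    -1.04
    (token not rendered)    -0.95
    (token not rendered)    -0.94
    "~~"                    -0.93
    "中国"                  -0.89
    " linkovi"              -0.88
    "まだ"                  -0.87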
    POSITIVE LOGITS
    " my"          1.08
    " here"        0.96
    "case"         0.96
    "ときは"       0.93
    "iedział"      0.92
    " fact"        0.91
    "interests"    0.90
    "zepte"        0.90
    "yszcz"        0.90
    "為に"         0.89
    Activation Density: 0.076%

    No Known Activations