INDEX
    Negative Logits
    Token          Logit
    "主任"         -0.08
    " subpoena"    -0.08
    " patrol"      -0.08
    " teha"        -0.08
    " befest"      -0.08
    " chased"      -0.08
    "ping"         -0.07
    " Sheriff"     -0.07
    " uza"         -0.07
    (not shown)    -0.07
    Positive Logits
    Token          Logit
    " cocoa"       0.13
    " шокол"       0.12
    " chocolate"   0.10
    " chocol"      0.10
    " coca"        0.10
    " chocolade"   0.10
    " chocolat"    0.10
    " caffe"       0.09
    "latent"       0.09
    " NVIDIA"      0.09
    Activation Density: 0.029%

    No Known Activations