    Explanations

    is followed by descriptors

    Negative Logits
    " These"   0.71
    "These"    0.71
    "these"    0.71
    " этих"    0.67
    " these"   0.66
    " Dieser"  0.65
    " эти"     0.64
    " aceste"  0.63
    "این"      0.63
    "Эти"      0.63
    Positive Logits
    " excerpt"     0.46
    " dedicate"    0.44
    " потому"      0.43
    " dedicated"   0.43
    " contient"    0.43
    " yüzden"      0.41
    "မျိုး"          0.41
    " iyong"       0.41
    "र्निंग"         0.39
    " summarizes"  0.39
    Activation Density: 0.071%

    No Known Activations