INDEX
    Explanations

    nouns that indicate subjects or entities within various contexts

    Negative Logits
    "another"          -0.43
    "something"        -0.43
    " something"       -0.43
    "omyces"           -0.42
    " ujednoznacz"     -0.40
    "imwrite"          -0.39
    " précis"          -0.38
    " few"             -0.37
    " another"         -0.37
    "Sometimes"        -0.37
    Positive Logits
    " Semua"           0.77
    " 모든"             0.71
    "WriteBarrier"     0.68
    " semua"           0.68
    " सभी"             0.66
    " wszystkie"       0.66
    "すべての"          0.65
    " toate"           0.65
    " تمامی"           0.64
    "Semua"            0.63
    Act Density 0.093%

    No Known Activations