INDEX
    Explanations

    nouns followed by descriptions

    Negative Logits
    ן  1.05
    ει  1.01
    ார்க  0.94
    (token not rendered)  0.88
     aree  0.86
     znacznie  0.84
     ακόμα  0.84
     musica  0.81
    오는  0.79
    (token not rendered)  0.79
    Positive Logits
    it  1.15
     evidences  0.91
    beans  0.90
    nama  0.88
     commences  0.87
    (token not rendered)  0.85
    ja  0.85
     диагности  0.85
    creates  0.84
    platz  0.83
    Activation Density 0.209%

    No Known Activations