    Explanations

    proper nouns, especially names and titles

    Negative Logits
    "SharedDtor"          -0.36
    " naselje"            -0.36
    "ponses"              -0.34
    " mémor"              -0.34
    " SIMBAD"             -0.33
    "ntö"                 -0.33
    " vœ"                 -0.32
    " succès"             -0.32
    ".$,"                 -0.32
    ":✨"                  -0.32

    Positive Logits
    "oa̍t"                 0.63
    "saraba"               0.62
    "tagHelperRunner"      0.61
    "出版年"               0.57
    "Gön"                  0.52
    "KURZBESCHREIBUNG"     0.51
    "ddots"                0.51
    " wireType"            0.50
    " AssemblyTitle"       0.49
    " 屋根"                0.48
    Activation Density: 0.081%

    No Known Activations