INDEX
    Explanations

    phrases related to the number and variety of studies and issues discussed

    quantifiers followed by nouns

    Negative Logits
    featureID        -0.68
     незавершена     -0.64
    Autoritní        -0.59
    Jeografia        -0.57
     &___            -0.57
    :✨               -0.55
     ligiloj         -0.53
     împre           -0.52
     }{@             -0.52
    #+#              -0.51
    Positive Logits
     uni                 0.40
    tiefel               0.40
     few                 0.37
     AssemblyCompany     0.36
     manche              0.35
     experiences         0.35
     many                0.34
    最近                 0.34
     job                 0.33
     prev                0.33
    Activation Density: 0.158%

    No Known Activations