INDEX
    Explanations

    phrases related to community engagement and emotional reactions

    "All" followed by a determiner

    all the following words

    Negative Logits
    Token                Logit
    "principalColumn"    -0.71
    "sizeCache"          -0.70
    "owohl"              -0.69
    " Италијани"         -0.68
    "SharedDtor"         -0.63
    "Personensuche"      -0.61
    " дописавши"         -0.61
    " iNdEx"             -0.61
    " Ambos"             -0.61
    ":✨"                 -0.60

    Positive Logits
    Token                Logit
    " all"               1.84
    "all"                1.17
    " allemaal"          1.14
    " allt"              1.00
    " semua"             0.99
    " все"               0.98
    " всех"              0.88
    " tantos"            0.87
    " All"               0.87
    " všetky"            0.87
    Activation Density: 0.276%

    No Known Activations