Explanations
references to people's perceptions and social interactions
pronouns for people and groups
Negative Logits
ferien                -0.44
gher                  -0.43
besluit               -0.42
:✨                    -0.41
Portion               -0.41
[token not rendered]  -0.41
asti                  -0.41
nahilalakip           -0.40
ataan                 -0.40
arote                 -0.40
Positive Logits
audiences             0.48
people                0.45
strangers             0.45
ProtoMessage          0.44
audience              0.43
PhysRevLett           0.41
contentLoaded         0.41
witnesses             0.40
fjspx                 0.40
others                0.39
Activations Density: 0.054%