INDEX
Explanations
instances of interactive content related to social issues or commentary
Negative Logits
.''         -0.65
ÂŃ          -0.63
)."         -0.61
—"          -0.61
SPONSORED   -0.60
.</         -0.60
."          -0.60
.''.        -0.58
]."         -0.58
.ãĢį        -0.57
Positive Logits
':          0.74
!:          0.72
Profile     0.71
Variant     0.68
Emails      0.64
%:          0.63
SHARES      0.63
odore       0.61
reenshots   0.61
Favorite    0.61
Activations Density: 0.673%