Explanations
mentions of the word "Facebook" in various contexts
references to Facebook and social media interactions
Negative Logits
iaz          -0.72
olar         -0.69
Virtue       -0.67
zin          -0.67
dinand       -0.65
liga         -0.64
RAG          -0.64
Downloadha   -0.64
ktop         -0.64
OPA          -0.63
Positive Logits
username     0.96
account      0.91
postings     0.89
username     0.83
avatar       0.82
commenter    0.81
outage       0.81
inbox        0.80
hashtag      0.79
emot         0.78
Activations Density 0.065%