INDEX
Explanations
references to Facebook and its associated features or links
Negative Logits
istic        -0.16
ãĥ©ãĥĥãĤ¯    -0.15
exact        -0.15
Ñħи          -0.15
Richardson   -0.15
agli         -0.14
vetica       -0.14
'Brien       -0.14
ê¹           -0.14
omor         -0.14
Positive Logits
Messenger    0.23
             0.21
(fb          0.20
s            0.19
/T           0.19
messenger    0.18
fb           0.18
.com         0.17
istan        0.17
FB           0.17
Activations Density 0.011%