INDEX
Explanations
references to written content or posts
Negative Logits
Token           Logit
متعلقه          -0.60
AssemblyTitle   -0.52
numerusform     -0.51
olin            -0.48
cade            -0.47
שוליים          -0.47
uders           -0.47
AttributeSet    -0.46
bree            -0.46
report          -0.46
Positive Logits
Token           Logit
emails          1.41
posts           1.23
Emails          1.22
emails          1.16
Posts           1.16
messages        1.09
newsletters     1.05
tweets          1.01
letters         0.99
articles        0.98
Activations Density: 0.190%