Last week, my team and I noticed that our crawls were capturing large amounts of data from the host external.fsjc1-3.fna.fbcdn.net (the URLs were mostly broken links or removed Facebook pages). We've tried blocking "external.fsjc1-3.fna.fbcdn.net" from the fbcdn.net and www.facebook.com hosts, and a couple of teammates blocked this host altogether. Both approaches have worked, and Facebook content is capturing fine.
I was wondering whether anyone has tried different scoping for this new host, and what your experiences have been.
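
In case it helps the discussion, here is a rough Python sketch of what we mean by blocking the host altogether. It only illustrates exact-host matching and is not how the crawler itself applies scope rules; the `is_blocked` helper and the sample URLs are made up for this example.

```python
from urllib.parse import urlsplit

# The host from this post; everything else below is hypothetical.
BLOCKED_HOST = "external.fsjc1-3.fna.fbcdn.net"


def is_blocked(url: str, blocked_host: str = BLOCKED_HOST) -> bool:
    """Return True only when the URL's host is exactly the blocked host."""
    host = (urlsplit(url).hostname or "").lower()
    return host == blocked_host


if __name__ == "__main__":
    # Made-up sample URLs, for illustration only.
    samples = [
        "https://external.fsjc1-3.fna.fbcdn.net/safe_image.php?d=abc",
        "https://scontent.fsjc1-3.fna.fbcdn.net/photo.jpg",
        "https://www.facebook.com/somepage",
    ]
    for url in samples:
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Because the match is on the full host name, other fbcdn.net subdomains and www.facebook.com are left alone, which matches what we saw: Facebook content kept capturing after the block.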