How we blocked TikTok's Bytespider bot and cut our bandwidth by 60%

May 22, 2024 | about 3 minutes

Note: This is a pretty technical blog post, so feel free to skip it if you're only here to buy original comic art on Nerd Crawler. If you want to learn about some code that is making Nerd Crawler better, read on!

 

What is Nerd Crawler?

 

Nerd Crawler is the easiest marketplace for comic artists to sell original comic art. Our bidding technology extends the auction time whenever someone bids, so true fans win and artists earn as much as 2x more. Put simply, we're building a better eBay for comic artists like Frank Cho, Sabine Rich, Khoi Pham, and many more!

 

What’s the problem?

Nerd Crawler has to serve a lot of images. My app is built with Ruby on Rails, and I use Cloudinary to host my images. Depending on the day, I was getting ~300K image requests and serving almost 20GB of images. That made image hosting one of my largest expenses, and the cost kept growing. I needed to figure out a way to bring it down.

 

What did I try?

Cloudinary offers a neat way to serve images at a lower resolution: you add transformation parameters to the URL path, like so (w_600 sets the width to 600 pixels and q_100 sets the quality to 100):

Original image URL:
https://res.cloudinary.com/ctung/image/upload/v1694543154/nerdcrawler/chris/pieces/bbusdaqgo3ge7grm23ju.jpg

Lower res image URL:
https://res.cloudinary.com/ctung/image/upload/w_600,q_100/v1694543154/nerdcrawler/chris/pieces/bbusdaqgo3ge7grm23ju.jpg

This was a decent fix, but things still weren't looking good: over the past few days, requests and bandwidth had continued to climb.
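For reference, the URL rewrite above can be sketched in a few lines of plain Ruby. This is only an illustration: `with_transformation` is a hypothetical helper name, and it assumes the delivery URL contains a single `/upload/` segment, as the two example URLs do.

```ruby
# Hypothetical helper: inserts a Cloudinary transformation segment
# (for example "w_600,q_100") right after "/upload/" in a delivery URL.
def with_transformation(url, transformation)
  # Only the first "/upload/" is rewritten; a URL without that segment
  # is returned unchanged.
  url.sub("/upload/", "/upload/#{transformation}/")
end

original = "https://res.cloudinary.com/ctung/image/upload/v1694543154/nerdcrawler/chris/pieces/bbusdaqgo3ge7grm23ju.jpg"
puts with_transformation(original, "w_600,q_100")
```

Applied to the original image URL, this produces the lower res URL shown above. A lower `q_` value would shrink the files further, at the cost of image fidelity.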

 

 

What was the REAL solution? 

I decided to look at the data in the last 7 days and found these two alarming tables in Cloudinary’s reporting dashboard:

Though I have some users in Singapore, it made no sense that they were taking up 67% of my bandwidth. And when I scrolled down, I saw something even weirder.

The #1 ranking browser was Bytespider, which isn’t even a real browser!

 

I dug into this more and discovered Bytespider is a new web crawler from TikTok (it's operated by ByteDance, TikTok's parent company). It looks like they’re scraping the internet to build a search engine, and they’ve been hitting my site a TON!

 

So, the true fix was to stop Bytespider from ever hitting my site. With Rails, you can use a gem called Rack-Attack to solve this by returning a 403. Here’s the sample code:
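A minimal sketch of such a rule, assuming a Rails initializer at `config/initializers/rack_attack.rb`. The file path, the `BotFilter` module, and the pattern are illustrative names, not necessarily what runs on Nerd Crawler; `Rack::Attack.blocklist` and `req.user_agent` are the gem's and Rack's own APIs.

```ruby
# config/initializers/rack_attack.rb (assumed location)

module BotFilter
  # Case-insensitive match on the crawler name anywhere in the User-Agent.
  BLOCKED_UA = /bytespider/i

  # True when the User-Agent header names a blocked crawler.
  # A missing header (nil) is treated as not blocked.
  def self.blocked?(user_agent)
    user_agent.to_s.match?(BLOCKED_UA)
  end
end

# The guard keeps this file loadable where the rack-attack gem is absent
# (for example when exercising the matcher on its own).
if defined?(Rack::Attack)
  # Requests for which the block returns true are rejected; Rack::Attack's
  # default response for a blocklisted request is 403 Forbidden.
  Rack::Attack.blocklist("block Bytespider") do |req|
    BotFilter.blocked?(req.user_agent)
  end
end
```

Recent versions of the gem should add the middleware to a Rails app automatically; on older versions you may need `config.middleware.use Rack::Attack` in your application config.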


With this bit of code, my app will return a 403 error whenever the request's user agent identifies itself as “Bytespider”, effectively blocking TikTok from hitting my site, which means I won’t serve them gigs upon gigs of images per day!

 

Here’s what the bandwidth graph looks like now after implementing the Bytespider blocker:

There aren’t too many startup graphs where you want to see the lines drop off a cliff, but in this case, I love seeing that steep drop!

Key Takeaways:

1. If you’re building a web app and noticing bandwidth costs rising, take a look and see if Bytespider is hitting your site. If it is, it may be worth blocking its traffic.

 

2. When you’re moving fast, it’s easy to deprioritize fixes to deliver better features and services to customers, but if something looks really bad, it might be worth taking a few hours to dig into the problem and see if you can fix it. You might find it’s an easy fix that can save you some time and money.
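If you don't have a dashboard like Cloudinary's to check the first takeaway, your own access logs can answer the same question. The sketch below assumes logs in the common "combined" format, where each line ends with the status, response size, referer, and user agent; `LOG_LINE` and `bytes_by_user_agent` are hypothetical names, and the regex may need adjusting for your server's format.

```ruby
# Matches the tail of a combined-format log line:
#   "REQUEST" STATUS BYTES "REFERER" "USER-AGENT"
LOG_LINE = /"[^"]*" \d{3} (?<bytes>\d+|-) "[^"]*" "(?<agent>[^"]*)"\s*\z/

# Sums response bytes per user agent and returns [agent, bytes] pairs,
# largest consumer first. Lines that do not match are skipped.
def bytes_by_user_agent(lines)
  totals = Hash.new(0)
  lines.each do |line|
    match = LOG_LINE.match(line)
    next unless match
    # A size of "-" (no body sent) converts to 0.
    totals[match[:agent]] += match[:bytes].to_i
  end
  totals.sort_by { |_agent, bytes| -bytes }
end

# Usage (the log path is an assumption):
#   bytes_by_user_agent(File.foreach("log/access.log")).first(10)
```

If a crawler sits at the top of that list, it's a good candidate for a block rule.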
