Our Web Site Hell Story - and Why Service Matters
Or
How a Texas Cattle Rancher Named Larry Saved Our Web Butt
One inconspicuous March Sunday morning our bookkeeper sought to catch up on some work, only to find the www.chickensys.com website performing very poorly. It would come up, but then, on grabbing another page, it would time out. Glad she could put off her work, she reported it to us and went about her Sunday.
Not wanting our website to be down on a weekend (customers often do their work on weekends, and orders often come in on weekends), we gave it some time, and when it hadn't cleared up by afternoon, we called our hosting company's technical support number. They weren't much help, choosing instead to instantly assume it was on our end. We hadn't changed anything, and all the other web sites we accessed worked fine - just not ours.
We had no idea that this was the beginning of a 16-day web site outage for us. Each day we assumed there would be a solution, but it just dragged out to 16 days without a web presence for our company. If we had only known...
By the end of the week, and after about 12 combined hours of talking and waiting on the phone, and writing out trouble tickets to our hoster, we had made absolutely no progress. We pleaded and pleaded with our hoster that the time-out issue was on their end, but to no avail. By the end of the week about 100 customers had emailed complaints in to us - it was beyond a doubt that the problem was worldwide and our site had a severe problem. But for some reason our hoster couldn't replicate it and just sat on their hands despite our desperate pleadings. None of our tickets got a response (at least that week).
An explanation of web site hosting: Almost every individual and small company uses "shared hosting", where a company offers you space to host your web site at very cheap prices - around $5 a month. You are on a single physical computer/server with perhaps hundreds or thousands of others. Ordinarily it works quite well. (The alternative is to get a dedicated server, where you are the only entity on that machine, but that costs around $100 a month.)
This is important to note, because hosting companies don't have much vested interest in assisting you - you are only worth $5 a month, hardly enough to warrant the service person's paycheck. However, they have to brag about their service because they cannot get new customers unless they promote it strongly. Their industry is intensely competitive, but that doesn't mean their service is GOOD. It's not. It's friendly, timely, and accessible, but they are very unlikely to help you in any way if you have a mysterious issue or something that is not common.
One really poor aspect of a shared-hosting company's service is that when you call in, you're always talking to a Level 1 rep, an entry-level person. Generally this person is trained to an acceptable level, but they can't do anything. The only people that can do anything are Level 2 or Level 3 (the highest level) technicians, and they will not ever speak to you over the phone. Phone communication is important because that's the only way you can exchange information in a quick give-and-take way. When things go to Level 2/3, it's all email, and usually this communication is delayed, or you get one shot a day at exchanging information, or you get massively misunderstood. Or all of these things. It's awful.
When we called in over and over and over again, we would get a new rep and we'd have to explain things over, and over, and over again. This wasted immense amounts of time.
Anyway... by Friday we had to form a contingency plan where we would move our web site(s) to another hoster if our current one, which we had been with for many years, kept failing at helping us. Again, the problem was that our site could be accessed upon first grab, but if you went to a couple different pages, it would start timing out and not give you anything more.
On Saturday, no progress was made, so we got another hoster and started the process of moving our stuff over. This was also loaded with problems, since our sites are fairly complex, and the new hoster wound up failing on their promise to migrate our site over (incompatibility issues, as it turned out).
But more importantly we, by accident, started finding out perhaps why our site was failing. We noted that our site was being accessed about 30-40 times a second by random IP addresses. This wouldn't explain the timing-out issues completely (to make a long story shorter), but it did give us some clue.
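For the curious, here is roughly how that kind of number can be pulled out of an access log. This is only a sketch, not what we actually ran: it assumes the log is in the common Apache-style format (client IP first, timestamp in square brackets), and the function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Start of an assumed Apache-style log line: client IP, two fields, [timestamp]
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def summarize(lines):
    """Return (peak requests seen in any one second, number of distinct client IPs)."""
    per_second = Counter()
    ips = set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        ip, stamp = m.groups()
        # The timestamp has one-second resolution, so the raw string works as a key
        per_second[stamp] += 1
        ips.add(ip)
    peak = max(per_second.values(), default=0)
    return peak, len(ips)

# Made-up sample lines, not taken from our real logs
sample = [
    '203.0.113.5 - - [10/Mar/2013:09:15:02 -0600] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Mar/2013:09:15:02 -0600] "GET / HTTP/1.1" 400 0',
    '192.0.2.44 - - [10/Mar/2013:09:15:03 -0600] "GET / HTTP/1.1" 400 0',
    'not a log line',
]

peak, distinct = summarize(sample)
print(f"peak per second: {peak}, distinct IPs: {distinct}")
```

A high peak together with a large number of distinct IPs is the pattern we were seeing: lots of hits, none of them from the same place.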
By Monday (Day 8), we hadn't gotten www.chickensys.com migrated, plus the new hoster didn't give us a proper account and we had even more problems on our hands. To make matters even more difficult, our old hoster shut down our account because our site was causing performance problems on the shared server (which inconveniences all the other sites on that machine). INCREDIBLE: after a week of pleading that something was wrong, the hoster ends up penalizing us.
Now, usually a hoster gives you a warning and reinstates you after you've assured them you've taken care of the issue. Our hoster gave us that warning, and by then we had learned how to shut off traffic into the site from our nameserver. We shut off the traffic, so from the outside people got a different warning - Site Does Not Exist. So bad goes to worse.
Later on Monday, in order to migrate our site safely, we had to turn on traffic again so the new hoster could get the material. Unfortunately, it looks like our old hoster was keeping an eye on us, and on Tuesday morning they shut us off again, but now it was for good. They refused to host ANY of our sites anymore. We have about 20 different domains. Upon calling them up and doing a bit of yelling, we got the rest of our sites reinstated but not chickensys.com, which concerned us less because it was being moved. At this point we were so disgusted by our old hoster that we wanted to move anyway. The old hoster would have been more forgiving, but they had a 3 Strikes rule, and apparently they count from the ENTIRE TIME you are hosted by them. We had an infraction about 5 years ago and they counted that! Incredible.
Old hoster throws chickensys.com out. We have it hosted on a new hoster and are ready to flip on the switch. We haven't coded a single line of code for our programs or addressed a Bug Report because we've been 24/7 on the web site issue. But, perhaps worst, we know that once we flip the switch for our new hoster, that traffic of 30-40 hits a second will start coming in. We could only hope that our new hoster either filters this stuff out or is robust enough to handle the "noise". We intentionally went with a smaller company to get out of the red tape we previously were in.
Onward... things don't pan out. We flip the switch and start examining our access logs, and sure enough, the noisy traffic starts piling in. It didn't take more than 2-3 hours for the new hoster to shut us down. We called in and declared ourselves guilty, and asked them if they could be of any help, adding that we weren't intending to cause them problems but just wanted to solve the issue responsibly. They were nice enough to accommodate us, but in the end they weren't helpful at all. So on Day 9 we had to shut off traffic again and go back to the drawing board to find out what all this noisy traffic was about and why it was happening.
The best way to explain what came next is to start at the eventual solution, which occurred on Day 15, about 6 days later. On Day 10 or 11, we found out what the problem was. It took several days after that to figure out a solution, and it wasn't until Day 15 that we were actually convinced our problem could even be solved. In fact, we were a minute away from just getting a dedicated server (big money) to put all the problems behind us, when we fell upon - accidentally - the solution.
Here's why all that noisy traffic was coming into chickensys.com: I'm sure you hear a lot about computer viruses. It's easy to think that they just affect the computer they reside on, but many are much more vile than that. Many viruses use the host computer to send out Net traffic that you are completely unaware of. That's what was happening to us - someone put chickensys.com on the "list" of a virus called Pushdo, and then millions of infected computers around the world started attacking our domain with bogus requests. The Pushdo virus makes infected computers fetch a list of domain names - not IP addresses - from some centralized servers around the world, and then their job is to take those domains and attack them.
These bogus requests are difficult to filter because you can't block any one IP address; they come from all over. They have to be filtered at the network level, in front of the web site. Since we were committed to cheap shared hosting, we were at the mercy of the hosters to filter this sort of thing. The problem is, they don't care and it's just a hassle for them. At $5 a month, you aren't worth it.
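To see why blocking individual IP addresses gets you nowhere, here is a small sketch. The numbers and the function name are invented for illustration; the point is only that the same total load looks very different depending on whether it comes from one machine or from many.

```python
from collections import Counter

def ips_over_limit(request_ips, limit):
    """Return the set of IPs that sent more than `limit` requests."""
    counts = Counter(request_ips)
    return {ip for ip, n in counts.items() if n > limit}

# Hypothetical case 1: one machine sends 2,000 requests
single = ["192.0.2.1"] * 2000

# Hypothetical case 2: 1,000 different machines send 2 requests each
distributed = [f"10.0.{i // 250}.{i % 250}" for i in range(1000) for _ in range(2)]

print(len(single), len(distributed))    # same total load in both cases
print(ips_over_limit(single, 10))       # the one noisy IP is easy to catch
print(ips_over_limit(distributed, 10))  # nothing crosses the limit
```

In the second case no single address ever looks guilty, which is why a per-IP block list is no use against this kind of traffic.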
Now we come to a Texas cattle rancher named Larry. (I know you've been waiting for that.) The only reason we found out about the Pushdo virus as it relates to bogus network traffic is because the ONLY information on this - obtained through Google - is from a web site - a blog - at www.htcomp.net. "htcomp" stands for Hometown Computing and it is run by a (very smart) cattle rancher and/or IT network guy named Larry. (I recommend reading his bio - this guy lives as diverse a life as you can imagine. Don't let the name Larry make you think he's some kind of auto mechanic or something.)
Now, there is tons of information on Pushdo, but only as it relates to the infected computers. In fact, Pushdo has largely been eliminated from new infections, but that doesn't cure the tens of millions of computers that slog on being infected without the user knowing. Only htcomp.net had information - and this was the personal experience of this Texas cattle rancher - on what to do about dealing with Pushdo's net attacks. (The blog entry can be found here: www.htcomp.net/?pageid=85&blogid=2, read it, it is quite interesting).
Larry saved our butt because his blog post was so well written even we could understand it. And his problem was EXACTLY what we were going through.
But there was still a significant problem: Larry's solution relied on the fact that he ran his own servers. He was able to figure out a solution because he could filter traffic on his network level, but us with shared hosting - we couldn't do that. But the good part is that finally we knew how to solve our problem, we just had to find someone that would cooperate.
Problem was - no one would cooperate. We contacted about 5-10 hosting companies, some very private concerns, some recommended by friends. All of them declined to filter the traffic, even though we knew and had convincing proof that it would not affect their loads even one iota. They just didn't care. For $5 a month, they don't want any hint of any problem. So it looked like chickensys.com was tainted for life, but chickensys.com is famous around the world for our products. It was not acceptable to change our domain. What do we do????
We figured that we had to resign ourselves to "wasting" $100 a month on a dedicated server, and filter out things ourselves. Remember, we aren't experts at this so even figuring out how to do this was a challenge. On Day 15, we called our new hoster and made a last ditch plea to please consider filtering this out on their end. They refused, so I started setting up a sale of a dedicated server on their end over the phone.
At this point the representative suggested something we had tried before, a service the hoster sponsored called CloudFlare. CloudFlare is a service that supposedly speeds up your web site access by caching your web site all over the world on certain servers. They are called a CDN - Content Distribution Network. One side-effect of this service is that they serve as a buffer between the computer accessing your domain and your site. CloudFlare has a feature - on their free tier - called Page Rules, which seemed handy for filtering out the bogus requests. We had tried using CloudFlare earlier for this, but it just didn't work. They have no phone support, so we put in a ticket, but like everything else, it was just ignored, or the responses we got were from techs who hardly bothered to read our ticket in full and just gave us trite solutions that didn't work or had their own issues.
HOWEVER... this time the new hoster's representative suggested approaching it a little differently: not signing up for CloudFlare through their hosting sponsorship, but doing it directly through CloudFlare. That also required us to change our nameservers from GoDaddy (we liked their service so we always had them there; their Domain Manager is excellent) to CloudFlare. We didn't like doing that, plus we wanted no disruptions in email access - which fortunately survived the entire 16-day outage; it was just the web site domain that had the issues. But we decided to go through with it anyway.
That was the solution! All of a sudden CloudFlare started filtering out the bogus requests properly, and on the new hoster's end we saw the bogus traffic simply disappear!
So on Day 16, our web site was up permanently again, and CloudFlare filters all the bad traffic on their end. We can even see this happening on their reports pages. CloudFlare causes a little hassle when we work on our web site because of their caching mechanism, but we are forced to use them because there is no other cheap way to filter out the bogus requests from Pushdo-infected computers.
So we give a big "Whew!" and get back to work programming Translator and Constructor and all our cool music things, well away from the geekdom that floods Netland. We thank Larry and his cows. We write our monthly check for $5 out to our new hoster, and work to migrate the rest of our web sites to them before May 1 2013, when our contract with the old hoster expires.
The bogus requests will only diminish as the infected computers around the world die off or get repaired. That will take perhaps 4 or 5 years.
We are a bit chagrined that it took 16 days to find what turned out to be a very simple solution. We understand that maybe our problem was a rarity, but still... We had this problem in a very minor way one year ago at our old hoster, and one of the things the old hoster suggested was to sign up for CloudFlare from their control panel. We did, but that seemed to cause more problems. (I'll skip THAT major episode, but suffice it to say that eventually we canceled CloudFlare and somehow the bogus traffic problem went away - we suspect the old hoster elected to filter the bogus requests at the network level but just kept it secret. In fact, that may be the cause of this whole issue - on March 4th of 2013 the old hoster moved the physical server we were on. We suspect that moved us away from the network filtering, and then the bogus requests started coming in again.)
The truth was that enabling CloudFlare remotely from the hoster's control panel was faulty, and the hosting representatives should have known better than to keep blowing us off; instead, they should have taken a specific interest in the customer's problem and seen it through. We want to take all bad things that happen and turn them into positives, so we can say that is one thing we learned about OUR service.
It is important that we don't just leave it to the customer to sort out his own problems. Although we understand that we can't do anything about his own computer, or his network, or his ISP, or whatever, we CAN take a more personal interest in a customer's issue and see it through in the areas where we CAN help, even if it doesn't involve our products.
Customers should understand that a company's resources are limited, and understandably many customer service requests can turn out to be rabbit trail chases. But that's what GOOD SERVICE is about: the representative should hold the line early based on his refined instincts and the information he starts getting. He shouldn't be pressured, but he should know where to go to figure out the answers.
That would help customers and companies equally.
I hope you've enjoyed the story. Thank God it's all taken care of. We are a better company for it. Have a good day!