CPU and I/O throttling, better than VPS for Shared Hosting

Solokron

Well-Known Member
Aug 8, 2003
852
2
168
Seattle
cPanel Access Level
DataCenter Provider
Look at this little snippet I ran into today

"

Hosting is an interesting business. It has come a long way from the free website days of Geocities and the like. Speed, reliability, and features have grown far beyond what they were only a few short years ago. What you can now buy for under $10 a month is 10x as powerful as what you could buy 5 years ago for the same money. That is a good thing.

Hosting also has some shortcomings that still haven’t been addressed and that hamper many individuals and companies on lower priced shared hosting. I would like to discuss some of these shortcomings, how I believe they can be fixed, and how we think we have finally been able to achieve shared hosting nirvana :)

Listed below are the well-known and almost impossible to solve problems with shared hosting -

1) Resource Allocation - Because many users are on a single system it is hard to allocate resources properly. You may have a very underutilized server (Resources going to waste) or an overcrowded server that can become bogged down with overuse. It is impossible to determine the usage of a particular user in advance. This causes unreliability on a server because a server can spike out of control at any time. Common problems are too much disk i/o consumed in bursts by particular users, extreme short or long term memory usage by a particular user, or spiking or prolonged cpu usage by a small group of users on a particular server.

** What we can do about it -

To us it all comes down to instantaneous real time tracking (About every 15-30 milliseconds in our case). Advances in general linux kernel tracking and our own proprietary tracking have finally allowed us to know which users need the most resources. We have then tried to automate as much of the process of allocating free resources in real time to these users as possible. We are constantly updating these automated tools to make the process as seamless as possible.

2) Tracking User Consumed Resources - You can’t “blame” users or even monitor many user activities (CPU use or disk I/O usage) on shared servers because tracking user resources means you have to know what user ran what process. The problem comes into play when you have major applications that don’t run as a particular user and instead are run as their own process with no “real” user tracking. Let me give you an example of the two biggest problem applications in this area - MySQL and Apache. MySQL consumes an enormous amount of CPU and disk I/O resources, but it normally runs as the “mysql” user. This makes it very difficult to track resources used by a particular user. Built-in tools such as slow query logs, etc. are extremely inaccurate in measuring disk I/O and CPU usage by user. The other major culprit is Apache. Apache runs as a separate user as well (At least in our case, and most web hosts have a similar setup). Apache-spawned script processes such as PHP, Perl, etc. can easily be made to run as a specific user for tracking purposes, but the Apache processes themselves and the corresponding CPU and I/O overhead are never attributed to a single user. There are many applications that fall into this category, and all of them make tracking inaccurate and problematic for hosting companies.
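
To make the attribution problem concrete, here is a minimal Python sketch of the usual per-process accounting on Linux: read utime and stime from each process's /proc/<pid>/stat line and sum them by the account that owns the process. The sample lines, PIDs, and tick counts are made up for illustration, and the function names are hypothetical. The point is only that work done on behalf of "matt" inside mysqld or httpd is charged to "mysql" or "nobody", not to "matt".

```python
from collections import defaultdict


def cpu_ticks_from_stat(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # ')' characters, so split on the LAST ')' rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(after_comm[11]), int(after_comm[12])
    return utime + stime


def ticks_by_owner(processes):
    """Sum CPU ticks per owning account from (owner, stat_line) pairs."""
    totals = defaultdict(int)
    for owner, stat_line in processes:
        totals[owner] += cpu_ticks_from_stat(stat_line)
    return dict(totals)


# Made-up sample: one PHP process running as the customer, plus the shared
# mysqld and httpd processes that did most of the work for that customer.
SAMPLE = [
    ("matt", "4101 (php-cgi) S 1 4101 4101 0 -1 4194560 900 0 0 0 120 30 0 0 20 0 1 0"),
    ("mysql", "2210 (mysqld) S 1 2210 2210 0 -1 4194560 5000 0 12 0 9000 2500 0 0 20 0 30 0"),
    ("nobody", "2300 (httpd) S 1 2300 2300 0 -1 4194560 700 0 0 0 400 100 0 0 20 0 1 0"),
]

usage = ticks_by_owner(SAMPLE)
for owner, ticks in sorted(usage.items()):
    print(f"{owner}: {ticks} ticks")
```

In this made-up sample almost all of the CPU lands on the "mysql" account, which is exactly the blind spot described above.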

** What we can do about it -

I have spent thousands of hours over the last 3 years working personally on this problem and I am VERY happy to report that we have nearly solved this problem (MANY thanks to kernel developers all over the world that helped out as well as the talented developers in house!!). We have spent considerable time and money modifying the linux kernel and userspace applications (MySQL, Apache, etc) to report exactly which user is responsible for CPU and I/O usage in real time. Let's give an example - Let's say we have user “matt” that does a MySQL query that takes 2 minutes to complete (clock time) and 90 seconds of real CPU time to complete (the actual number of CPU seconds required to complete the query). When MySQL passes the query to a specific thread to be serviced we start tracking for that particular user the EXACT CPU time that was used, and the exact number of system reads and writes as well as device specific reads and writes. We can use this to track and slow down the extremely heavy users in real time so that the server is calm and available for everyone to use. We are in live testing right now on several boxes and hope to have the CPU portion of this rolled out everywhere in the next 3-4 weeks. The disk I/O portion of our code is already live on 90% of our systems and will be live on 100% of our servers in the next week. The importance of this can’t be overstated. This is what shared hosting has needed forever, and what will allow it to compete with and in many cases top VPS in performance while maintaining stability in the system.
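
The difference between clock time and CPU time, and the idea of charging a servicing thread's CPU time to the user whose query it ran, can be sketched in a few lines of Python. This is only a user-space illustration under stated assumptions, not the kernel and MySQL modifications described above: `charge_query`, `fake_query`, and the `cpu_seconds_by_user` ledger are hypothetical names, and `time.thread_time()` needs Python 3.7 or later on a platform that supports per-thread CPU clocks.

```python
import time
from collections import defaultdict

# Hypothetical ledger: CPU seconds charged to each hosting account.
cpu_seconds_by_user = defaultdict(float)


def charge_query(user, query_fn):
    """Run query_fn in this thread and charge its CPU time to `user`.

    Returns (clock_seconds, cpu_seconds) for the call.
    """
    wall_start = time.perf_counter()
    cpu_start = time.thread_time()  # CPU time consumed by this thread only
    query_fn()
    cpu_used = time.thread_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    cpu_seconds_by_user[user] += cpu_used
    return wall_used, cpu_used


def fake_query():
    # Stand-in for a real query: some computation, then a wait (think disk
    # or lock wait) that takes clock time but burns no CPU.
    sum(i * i for i in range(200_000))
    time.sleep(0.05)


wall, cpu = charge_query("matt", fake_query)
print(f"matt: {wall:.3f}s clock time, {cpu:.3f}s CPU time")
```

The clock time is always at least as large as the CPU time here, for the same reason a 2 minute query can cost only 90 CPU seconds.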

3) Immediate action on policy enforcement - This is a BIG deal with shared hosting! When a specific user “violates” a policy like excessive CPU usage or disproportionate disk I/O or memory usage 99.9% of all shared hosting companies will try and alleviate the problem by killing processes or banning a user AFTER the damage has already been done. It does no good to ban someone after they have consumed so much CPU that the server becomes sluggish. The sluggishness or downtime has already happened at that point. Most hosting companies have a very difficult time ever determining where these CPU and I/O abuses are coming from let alone mitigating the problem before it happens. Virtually NO SHARED HOSTING COMPANIES have good options to actively slow down CPU usage - They usually just stop the offending processes (Not a good option), or kill processes consuming too much disk I/O - not a good option either.

** What we can do about it -

As mentioned above in item #2 we can now track and monitor CPU and disk I/O usage in real time for all our users. Based on this information we can now do what no other shared hosting company has ever been able to do effectively. We can limit disk I/O activity in real time and limit CPU activity in real time for all our users. This allows us to mitigate the effects of sudden spikes in usage that would normally affect all other users. Here is a good example to illustrate this point - Let's say we have 100 users on a server and that 99% of the time everything runs smoothly but one day one of the users makes it to the front page of digg.com. One of two things is going to happen. Either the user that is causing the excessive load is going to be shut down or the server is going to be sluggish or possibly down until traffic subsides to that site. We have been in this position many times in the past and it's never good for anyone. The disk I/O portion of that problem is now solved for us, and the CPU issue should be ready in the next few weeks. Instead of shutting the user down we can now “contain” them as if they were on a VPS. We can isolate them from other users so they don’t cause problems but still allow them to use any extra CPU cycles or I/O operations that are available.
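
One common way to express "contain the heavy user but let them use whatever is spare" is max-min fair sharing: everyone is entitled to an equal slice, and anything a light user leaves unused is handed to whoever still wants more. The Python sketch below illustrates that policy only; it is an assumption about how such a rule could look, not the mechanism actually described above, and `share_capacity` and the user names are hypothetical.

```python
def share_capacity(capacity, demands):
    """Split `capacity` among users by max-min fairness.

    `demands` maps user -> amount wanted (same unit as `capacity`, for
    example percent of one CPU). Nobody receives more than they asked for.
    """
    allocation = {user: 0.0 for user in demands}
    unsatisfied = {user for user, want in demands.items() if want > 0}
    remaining = float(capacity)
    while unsatisfied and remaining > 1e-9:
        fair_share = remaining / len(unsatisfied)
        for user in sorted(unsatisfied):
            grant = min(fair_share, demands[user] - allocation[user])
            allocation[user] += grant
            remaining -= grant
        # Users whose demand is now met drop out; their unused share is
        # redistributed to the rest on the next pass.
        unsatisfied = {
            user for user in unsatisfied
            if demands[user] - allocation[user] > 1e-9
        }
    return allocation


# Two quiet sites and one that just hit the front page of digg.com.
busy_day = share_capacity(100, {"alice": 10, "bob": 10, "digg_site": 500})
print(busy_day)
```

The quiet sites get everything they ask for, and the busy site is held to what is left over instead of being shut down or dragging the server with it.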

**What does this mean?

What does this really mean? It means we will very soon be able to offer the VPS experience for less money and less hassle than every other VPS product out there, and that our shared hosting product is about to become a LOT more stable than everybody else's out there. My opinion is that most VPS users really don’t need or want root access that requires their own time for security updates, cPanel updates, etc. They simply want a contained environment and guaranteed resource allocation. We will be able to offer guaranteed resources just like a VPS or dedicated server solution without requiring any changes for our users. I am very excited to see all this materialize as this has been my pet project for several years. If you have made it this far in the blog entry I congratulate you for your tenaciousness in plowing through my technical ramblings!

Thanks,
Matt Heaton / President Bluehost.com - Hostmonster.com"

They run a cPanel environment and have successfully done this, which is what cPanel clients have been asking cPanel to do, request after request, for years now. Great news for Bluehost/Hostmonster, terrible for everyone else. When will we get a real user-based processing and I/O management system?!

Is there anyone else upset?
 

electric

Well-Known Member
Nov 5, 2001
790
11
318

Bailey

Well-Known Member
Aug 12, 2001
120
1
318
Wisconsin
When reading things on the internet, it is always wise to "consider the source."

Strutting and posturing do not equal solving anything. :cool:

:D Bailey
 

eth00

Well-Known Member
PartnerNOC
Mar 30, 2003
721
1
168
NC
cPanel Access Level
Root Administrator
Eh, interesting, but I want to actually see it released or at least working before any big deal is made of it. Actions speak louder than words alone :)
 

Cristi4n

Well-Known Member
PartnerNOC
Jul 2, 2006
73
0
156
Limit CPU and IO

I was wondering when people would start to notice what bluehost does (regarding that post).
I had a short chat with Matt on this. I have guessed pretty much everything he is using to be able to actually limit a process on I/O and CPU and also limit based on uid. Some things remained unexplained, but those things are minor and can be ignored for now; he probably did kernel work.
I will take a few guesses about what your problems will be in general:

1. This is not a simple thing to do, so do not expect a new icon in cPanel soon (not in the next 2 years) for that. Bluehost has one for CPU throttle logs for your apps (based on uid, whatever) but it was made in-house.

2. The kernel used in rhel is pretty old (some people will argue with me on this). I hate that anyway. Although some things are backported, 95% of what you will need is not in that stupid stable kernel. Come on, 2.6.18?

3. Either you have money to spend on kernel developers (like bluehost) or you have the knowledge and time to do it yourself. It still needs kernel and other things modified to be fully functional even with the latest kernel. And you don't even have the latest kernel, remember you're on rhel.

4. If you have 1 to 10 shared hosting servers it isn't worth it; you will still be able to manage that number of servers by eye. If not, then stop dreaming about the things bluehost has.

5. Forget about mod_php. That is a thing of the past. You are still using mod_php?

6. If you do not intend to do kernel work then you will end up waiting for a new rhel kernel for the next 12 to 24 months, maybe longer. By that time everyone will probably have a shiny new button in cPanel.

7. If you have no idea about programming, then forget about everything, unless you pay someone to do all your work.

There are some more things to discuss regarding this but the main problem right now is the default kernel used in CentOS.
 

InterServed

Well-Known Member
Jul 10, 2007
275
18
68
cPanel Access Level
DataCenter Provider

WireNine

Well-Known Member
Aug 14, 2006
207
4
168
cPanel Access Level
Root Administrator
A lot of web hosting companies are picking up on this. Hostgator, for one, has something similar which throttles CPU, memory, and disk I/O usage.
 

VeZoZ

Well-Known Member
Dec 14, 2002
245
0
166
cPanel Access Level
DataCenter Provider
I exchanged a few emails with Matt about this but after a few stopped getting any sort of response. So I guess he had second thoughts about making this available to the public for a price. Or he just doesn't want our companies having it, one or the other.
 

WireNine

Well-Known Member
Aug 14, 2006
207
4
168
cPanel Access Level
Root Administrator
so, the question is, how can we do this on our own servers?
You can either hire a developer to do this and pay a lot of money as I believe this requires kernel level modification or request it from cPanel to make it available for everyone.
 

Funkadelic

Well-Known Member
Feb 10, 2006
73
0
156
Hosting is leaning towards unlimited everything as time goes on. How are the small guys supposed to keep up with the big guys if they can't offer a service that is up to par? We don't all have money available for the developers, and as the hosting market will inevitably go "unlimited everything" very soon, I see this as a necessary feature so everybody can continue to maintain their shared hosting services.
 

SoftDux

Well-Known Member
May 27, 2006
1,023
5
168
Johannesburg, South Africa
cPanel Access Level
Root Administrator
Hosting is leaning towards unlimited everything as time goes on. How are the small guys supposed to keep up with the big guys if they can't offer a service that is up to par? We don't all have money available for the developers, and as the hosting market will inevitably go "unlimited everything" very soon, I see this as a necessary feature so everybody can continue to maintain their shared hosting services.
Funkadelic, you'll soon realize that this isn't feasible and leads to many problems - both for the host, and the client. It's physically impossible to give a client more than 2TB space - and at $2 per month it's not financially viable either. The guys who offer these services have a LOT of rules in place to protect themselves and to avoid overuse. But, it's a smart selling point if you feel like it :)

You can either hire a developer to do this and pay a lot of money as I believe this requires kernel level modification or request it from cPanel to make it available for everyone.
Ali, I don't know if I want to go this route, especially since we run everything on Xen - and custom Xen kernels tend to cause more problems than good if they're not maintained properly. If cPanel had to implement this feature (which I suggest they do, since others have it already), then I doubt they would go the kernel route.

Surely, there's a different way of doing it as well? I suppose the kernel-route will give better performance, but by how much? What about programs like PRM - couldn't something like that be used? Has anyone tried?
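
For what it's worth, the non-kernel route usually boils down to a polling loop: sample each process every few seconds and act once it has stayed over a limit for several samples in a row. Here is a rough Python sketch of that idea. It is not PRM's actual code; the function names, the 50% limit, and the three-strikes rule are all made up for illustration. It renices instead of killing, using the standard `os.setpriority` call (Unix only).

```python
import os


def pick_action(cpu_percent_samples, limit=50.0, strikes=3):
    """Return 'renice' if the last `strikes` samples all exceed `limit`."""
    recent = cpu_percent_samples[-strikes:]
    if len(recent) == strikes and all(sample > limit for sample in recent):
        return "renice"
    return "ok"


def deprioritize(pid, niceness=19):
    """Lower a process's scheduling priority instead of killing it."""
    # Raising the nice value of your own processes needs no special rights;
    # changing another account's processes requires root.
    os.setpriority(os.PRIO_PROCESS, pid, niceness)


# A short spike is ignored; a sustained hog is flagged.
print(pick_action([90, 10, 95, 96]))  # ok
print(pick_action([10, 80, 90, 95]))  # renice
```

The catch is that a poller only reacts after the fact, and a nice value is a scheduling hint rather than a hard cap, so this is much weaker than the kernel-level throttling described in the blog post.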
 

cPanelDavidG

Technical Product Specialist
Nov 29, 2006
11,212
13
313
Houston, TX
cPanel Access Level
Root Administrator
Bugzilla is no longer monitored. I recommend you use our Feature Request forum instead. Here are a few relevant threads:

http://forums.cpanel.net/f145/auto-account-suspend-137537.html

http://forums.cpanel.net/f145/cpu-usage-reports-via-cpanel-137713.html

http://forums.cpanel.net/f145/limit-cpu-usage-ram-cpanel-134133.html

Feel welcome to comment and provide your input on how such functionality should be implemented.