Scalability

Scalability is NOT the ability to throw more machines into the mix. Not only is adding machines not guaranteed to solve the problem, it doesn't even identify the problem you are trying to solve.

The scale of something is how big "it" is, or how much of "it" you need to deal with, and it can refer to a number of factors.

Data Volume

My article on Big Data concludes that Big Data might be an issue for databases, but it isn't going to be an issue for your applications unless you get to the stratospheric levels of Twitter and Facebook. Australia isn't big enough to push that volume of data through an application. Because of this, the databases that are accessed by the application won't suffer a Big Data problem either.

The sort of volumes that can be handled by any modern database are far beyond what a user can put through an application.

User Population

The total number of users isn't actually relevant except in two respects: authentication (who is who) and authorization (what are they allowed to do). Neither of these is data intensive, so large user populations do not raise any database scalability issues. Incidentally, authorization is likely to be pushed further towards the data as concerns about privacy and accountability/audit increase.

The user population is irrelevant because the far more important measure is how much activity those users perform. If they log on once a year, it is insignificant. If they log on at the start of the day, log off at the end, and do nothing in between, then they are *almost* insignificant. That last case is only an issue if you allocate a chunk of memory dedicated to each user when they log on and keep it until they log off. But if that is your situation, some form of connection pool is simple enough to apply.
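As a rough illustration of the connection pool idea, here is a minimal sketch in Python. It uses the standard library's sqlite3 as a stand-in for a real database, and the ConnectionPool class, its method names and the pool size of two are my own assumptions for the example, not any particular product's API. The point is only that a connection is held for the duration of a request, not for the duration of a logon.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool: a connection is borrowed per request."""

    def __init__(self, database, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False allows a connection to be used by
            # whichever thread borrows it next.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # waits if all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # handed back as soon as the request ends

    def idle_count(self):
        return self._idle.qsize()


pool = ConnectionPool(":memory:", size=2)

# A thousand "logged on" users each make one brief request in turn.
# They are all served by the same two connections.
results = []
for user_id in range(1000):
    with pool.connection() as conn:
        results.append(conn.execute("SELECT ?", (user_id,)).fetchone()[0])
```

Real pools add health checks, timeouts and session state clean-up, but the memory held on the database side is tied to the pool size rather than to the number of users who happen to be logged on.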

Transactions

In the Early Internet Age, these were 'hits' or 'page views'. Now that everything is going AJAX, we need to look at 'requests', which have the potential to be a more precise measure. The trick is that user requests (from the user to the application layer) are different from database requests (from the application layer to the database). Application developers need to start being aware of the latter, as they have a much greater effect on database scalability than the former.
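To make the difference between user requests and database requests concrete, here is a small sketch, again using Python's sqlite3 as a stand-in. The table names, the CountingConnection wrapper and both functions are invented for the illustration. Each function represents a single user request for the same screen of data; what differs is how many database requests sit behind it.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts statements sent to the database."""

    def __init__(self, conn):
        self._conn = conn
        self.db_requests = 0

    def execute(self, sql, params=()):
        self.db_requests += 1
        return self._conn.execute(sql, params)


raw = sqlite3.connect(":memory:")
raw.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
raw.execute("CREATE TABLE order_lines (order_id INTEGER, item TEXT)")
raw.executemany("INSERT INTO orders VALUES (?, ?)",
                [(i, "acme") for i in range(1, 11)])
raw.executemany("INSERT INTO order_lines VALUES (?, ?)",
                [(i, "widget") for i in range(1, 11)])


def show_orders_chatty(db):
    """One query for the orders, then one more per order for its lines."""
    orders = db.execute(
        "SELECT id FROM orders WHERE customer = ?", ("acme",)).fetchall()
    return [
        (order_id,
         db.execute("SELECT item FROM order_lines WHERE order_id = ?",
                    (order_id,)).fetchall())
        for (order_id,) in orders
    ]


def show_orders_joined(db):
    """The same data fetched with a single join."""
    return db.execute(
        "SELECT o.id, l.item FROM orders o "
        "JOIN order_lines l ON l.order_id = o.id "
        "WHERE o.customer = ?", ("acme",)).fetchall()


chatty = CountingConnection(raw)
show_orders_chatty(chatty)

joined = CountingConnection(raw)
show_orders_joined(joined)

print(chatty.db_requests, joined.db_requests)  # 11 and 1 for the ten orders
```

From the user's side both versions are one request. From the database's side the first is eleven, and that number grows with the data.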

Scalability problems arising from requests can be further broken down into several potential bottlenecks:

Disk
 
Disk volume isn't the problem. Random access speed is the issue, because the platters can only spin so fast. When all the data can't fit into memory, you incur latency in retrieving it from disk. The good news is that flash will help here. Flash drives (which still go through a 'drive' abstraction layer) and cards/memory merge the speed of RAM with the durability of disk. Flash is great for random reads. Unfortunately it isn't as good for writes (especially updates), but that all gets rationalized through the DBWR process anyway. I'd be very surprised if the Flash Cache technology in 11gR2 doesn't get extended in Oracle 12# and beyond to make it a compelling offering.
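The "only spin so fast" limit is easy to put numbers on. The back-of-envelope sketch below assumes the usual simplification that, on average, the platter has to turn half a revolution before the wanted data passes under the head. The function names and the seek time passed to the second function are assumptions for illustration, not measurements of any particular drive.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for half a revolution, in milliseconds."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


def max_random_reads_per_second(rpm, avg_seek_ms):
    """Rough ceiling when every read pays one seek plus the rotation wait.

    avg_seek_ms is an assumed figure supplied by the caller.
    """
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


for rpm in (7200, 15000):
    # About 4.17 ms at 7200 rpm and 2.0 ms at 15000 rpm.
    print(rpm, round(avg_rotational_latency_ms(rpm), 2))
```

With an assumed 3 ms average seek, even a 15000 rpm disk tops out at around 200 random reads a second, which is why data that doesn't fit in memory hurts so much.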

Network
 
In theory, network scalability is about the size of the pipe. The more concurrent transactions, the thicker the pipe needs to be. The best resolution to this is to ensure that data packets are as few in number and as small in size as is practical. However, the mess of abstraction layers between the application and database makes this approach harder to actually put into effect. Solutions such as Oracle's Apex and Mike Stonebraker's VoltDB push more work to the database layer to make the most efficient use of the network.
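A crude model shows why "few in number" matters as much as "small in size". The sketch below treats network time as one latency charge per round trip plus the time to push the bytes down the pipe. The function name and every figure in it (0.5 ms latency, a 1 Gbit/s link, 100 rows of 200 bytes) are assumptions picked purely for illustration.

```python
def network_time_ms(round_trips, payload_bytes, latency_ms,
                    bandwidth_bytes_per_ms):
    """Latency paid once per round trip, plus transfer time for the bytes."""
    return round_trips * latency_ms + payload_bytes / bandwidth_bytes_per_ms


LATENCY_MS = 0.5                   # assumed round-trip latency
BANDWIDTH_BYTES_PER_MS = 125_000   # assumed 1 Gbit/s link
ROWS = 100
ROW_BYTES = 200

# The same 100 rows, sent one row per round trip or in a single batch.
row_at_a_time = network_time_ms(ROWS, ROWS * ROW_BYTES,
                                LATENCY_MS, BANDWIDTH_BYTES_PER_MS)
batched = network_time_ms(1, ROWS * ROW_BYTES,
                          LATENCY_MS, BANDWIDTH_BYTES_PER_MS)

print(row_at_a_time, batched)
```

Under these assumptions the bytes themselves cost well under a millisecond either way. Nearly all of the difference comes from the number of round trips, which is exactly what pushing work to the database layer reduces.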
 
CPU
 
CPU is the expensive bottleneck. Adding more CPU cores to an Oracle database is where the licensing costs come into play. It will therefore be the place where scrimping and saving is most likely, especially as organizations look to consolidation and virtualization. The associated cost makes it attractive to shift CPU work to a separate application layer. However, increasing CPU load on the application layer does not necessarily decrease it on the database layer, as you can miss out on the efficiencies of processing data in a layer designed for just that purpose.
 
Memory
 
The good news on Memory is that standard memory levels are getting higher and, except for Big Data situations, you'll be able to get more of your databases into Memory. The bad news is that the bean counters will cram more database servers onto the same physical hardware, and the memory available to any particular database instance will be pretty static.

This article was written by Gary Myers, and is filed under "Oracle Database Development - A View from Sydney". It covers database, scalability, disk, network, CPU, memory, transactions and user volumes.