MBAs kill IT with efficiency pr0n

If you feel like a good read pick up a copy of Antifragile by Nassim Nicholas Taleb. Specifically read Chapter 3 temptingly titled “The Cat and The Washing Machine”.

A summary of the book in one line is: Antifragile things get better under stress. This is distinct from fragile things that break under stress, and robust things that persist.

Some antifragile examples used are:

  • Human bodies, which improve under certain levels of stress (eg. strength training); and
  • The airline system, which improves with every single accident (MH370 may prove to be an exception).

Chapter 3 argues that antifragile things tend to be organisms like “The Cat” whereas fragile things tend to be like “The Washing Machine”.

What does this have to do with IT Infrastructure? Historically we’ve aimed to build robust infrastructure. We’ve built environments with dual power supplies, backup data centres and RAID disk arrays. We build environments to be robust under stress. They don’t improve.

When IT hardware was expensive and scarce, buying back-up hardware may not have been justifiable. You may have bought a single mainframe for one site, leaving your business exposed to a potential extended outage. That is, fragile.

Now in the days of IT abundance, IT infrastructure is vast and geographically dispersed. If one server dies in the cloud does anyone notice? Shadow IT, iPads, BYOD and opportunistic vendors make IT growth much more organic than it once was.

It isn’t urgent if a power supply or HDD dies anymore. What is tricky is understanding the complexity of key business functions and how they map to IT. Also, software is eating infrastructure. Physical assets can be robust, but never antifragile. Software is much more malleable and can improve.

As an aside I chatted to a lady at a cafe the other day. She was a nurse about to study LEAN. I was horrified. LEAN is all about minimising waste and maximising value, or in my mind, managerial efficiency pr0n. I imagine the LEAN consultants received less-than-lean money from the public health system. LEAN works well for manufacturing, but applying a model born out of Toyota to hospitals is MBA madness.

I have nothing against efficiency. I just don’t think LEAN works outside of manufacturing. In manufacturing you have slow-changing product life cycles, mass production and well understood customer needs.

Applying LEAN thinking to IT assumes IT is like “The Machine” when in fact it is becoming more like “The Cat”. In IT we don’t fully understand customer needs, so we have tight iterations and close proximity to the end-user (ie. Agile). Product life cycles can be incredibly short – look at mobile apps. Digital software products are not mass-produced: a single version is produced and then copied/distributed one-by-one as needed. A large portion of IT work is not documented. It is done by skilled artisans who don’t see the point of writing stuff down for efficiency experts.

In the naughties IT was consolidated to achieve scale. It was then optimised using methodologies like LEAN and outsourcing to manage cost, but IT never simplified. Complexity, it seems, is always at least preserved.

LEAN, consolidation, outsourcing. We ended up with many gutted-out, off-shore script shops – a disaster waiting to happen when something undocumented or unexpected occurs. Previously salaried employees returned as contractors. IT in some ways became more fragile.

In fact, Taleb argues – and I agree – that excessive size, efficiency/optimisation and complexity generally make things more fragile. Efficiency leads you to put all your eggs in one basket, to lean on single vendors (pun intended) and to carry big hidden risks. Times have changed. Efficiency is becoming dangerous.

That’s not to say we should gold-plate IT. I like Agile. Even though it is not always implemented well, Agile recognises IT as the “Cat” it is. It works with small teams, and it learns and improves (fails fast) quickly.

I suspect a good number of IT professionals will continue to balance efficiency/fragility with robustness when in fact we should start looking at antifragility. We should look for more ways for IT to improve itself under stress.

We want failure to make IT better and stronger. Isn’t this the big takeaway from ITIL problem management? Isn’t this what Chaos Monkey does for Netflix? A non-functional process destroying things at random to improve the Netflix ecosystem.
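The mechanism is almost trivially simple, which is part of its genius. A toy sketch of the idea (instance names invented; Netflix’s real Chaos Monkey talks to the AWS API, of course):

```python
import random

def chaos_monkey(instances, kill_fraction=0.1, seed=None):
    """Randomly pick running instances to terminate, Chaos Monkey style.

    Returns the instances chosen for termination. A real tool would then
    call the cloud provider's terminate API; here we only select victims.
    """
    rng = random.Random(seed)
    n_kills = max(1, int(len(instances) * kill_fraction))
    return rng.sample(instances, n_kills)

# A hypothetical fleet of twenty web servers.
fleet = [f"web-{i:02d}" for i in range(20)]
victims = chaos_monkey(fleet, kill_fraction=0.1, seed=42)
print(victims)  # two randomly chosen instances to terminate
```

If your systems survive this every day, a real failure is just another Tuesday.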

What other examples can you think of where IT improves itself under stress? And do MBAs with the latest efficiency fad represent a risk to IT systems?

Is the cloud just… web services?

My work colleague turned around to me and said, “Web Services aren’t APIs”. That didn’t sound right so I started investigating. What is the difference and why is it important?

Applications have been built from re-usable and shared components since the bloody epoch. It’s why software is eating everything. Once a standard interface – and usually a corresponding library – is agreed upon there’s little reason to go back to first principles again. It’s division of labour, driven by pure need, evolving organically.

Then along came ubiquitous, always-available connectivity and expanding bandwidth. Why bother compiling and distributing an API/Library when the same API call could be made to a shared network service? A web service.

So a web service is a type of API. Not all APIs are web services though. An API is the public interface for a piece of code – it is what you can call from within your own code. It may be a web service, but it could also be a Java library or similar.

Wikipedia defines a web service as “… a software function provided at a network address over the web”. That’s a pretty good definition. It must also be loosely coupled, which means that the service caller and owner agree on the interface but know little about each other. The back-end implementation can be substituted as needed.
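The distinction is easy to show in code. In this toy sketch (all names are mine, not from any real service), the caller programs against an agreed interface and neither knows nor cares whether the implementation behind it is a local library or a remote web service:

```python
class LocalGreeter:
    """Implementation linked into your process, like a Java library."""
    def greet(self, name):
        return f"Hello, {name}"

class RemoteGreeter:
    """Same agreed interface, but each call would become an HTTP request
    to a web service (sketched only, not actually performed here)."""
    def __init__(self, base_url):
        self.base_url = base_url
    def greet(self, name):
        # A real client would do something like:
        #   urllib.request.urlopen(f"{self.base_url}/greet?name={name}")
        raise NotImplementedError("network call sketched only")

def welcome(greeter, name):
    # Loose coupling: this caller only knows the agreed interface.
    return greeter.greet(name) + "!"

print(welcome(LocalGreeter(), "world"))  # Hello, world!
```

Swap `LocalGreeter` for `RemoteGreeter` and the calling code doesn’t change – that substitutability is the loose coupling the definition demands.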

Companies are creating their own web service APIs. Here’s an API for business listings at Australia’s Yellow Pages and here’s one I’m looking at now to do some useful work on my expanding Evernote database. Both of these are web services with some extra API management capabilities. Grab yourself a key and start developing!

Amazon was miles ahead of the curve on this. This famous post by Steve Yegge describes the Jeff Bezos mandate (in 2002!) that all teams at Amazon must:

  1. Expose their data and functionality through service interfaces,
  2. Communicate with each other only through these interfaces (no direct database links etc.), and
  3. Design all interfaces so they can be accessed externally

and a few other things. Talk about ahead of the curve.

It’s the future mesh of interconnectivity. One big “programmatical” mash-up.

There are the web services you own and the ones you don’t. For the ones you do own, you’ll have to think about how you maintain control of the integration. Two possible models are shown below: an integration bus/relay/proxy in each cloud, to minimise network traversals, or a few centralised integration points, to make control and standards (and non-functional requirements like logging, access control and templates) easier.

Web service cloud options

For the web services you don’t own but use: how much do you rely on single vendors, how do you manage keys, and how do you manage different standards and skill sets?

I’ve always thought of “The Cloud” as a technology theme behind many new implementations, but I’m starting to think “the Cloud” is just a biblical plague of web services. It’s even hidden in the term “as-a-service”! All SaaS, PaaS and IaaS providers have public web service APIs built in from day one.

IaaS could mean twice the work for IT

In my last blog I wrote about Cloud Management platforms, and how they enable integration of multiple clouds (public and private).

One purpose of this is to drive standardisation of infrastructure. This is the usual drive for standards, strategies, life cycling and consolidation that has been with us for years.

Tech-heads get excited about new stuff like node.js, ruby on rails, Ubuntu, skinless servers etc., but in isolation these technologies provide absolutely no benefit to a company. They cost money to buy, build and support. When these component technologies are combined with application logic and data though, they can add immense value. This value must exceed – by a decent margin – the sunk cost of deployment and support.

IT vendors move between mass standardisation/commoditisation and differentiation – sometimes doing both things at the same time. AWS, GCE, and Azure strive to provide a base server at the cheapest cost – ie. commoditisation – but at the same time offer differentiated, non-standard services like Azure Service Bus and Redshift – to get the customers in (and keep them).

Also, over time enterprises accumulate legacy bits and pieces that are too old, too important or too expensive to replace. There they (dis)gracefully age until something serious happens.

All these drivers work against simplification and standardisation. A good friend I used to work with was asked by the IT Group Manager what he would do if, in his role as IT Infrastructure Manager, he had a blank cheque and infinite time. He said something like trash the whole lot and start again. From scratch. Clean out the “detritus”.

Head In A Cloud (jb912 via Flickr)

If you’re an established enterprise you’ve probably spent the last couple of years trying to understand The Cloud and what it means. The years before that you probably virtualised the hell out of everything on some hypervisor, and before that you tried to get everything on commodity servers. Along the way you migrated many workloads to the latest and greatest, but not everything. You probably still have non-cloud, non-virtualised, non-commodity gear around. Do you really believe all workloads will end up on IaaS?

(If you’re a start-up, it probably makes sense to start in one cloud. As you grow though you may end up getting stuck using their proprietary technology. That may be ok, or not. Ask yourself: could Netflix migrate off AWS, and at what cost?)

You have standards for deploying servers in-house (something like Linux or Windows on a hypervisor), you have standard network switch configurations and whatever passes for standards in the storage space.

You don’t want to manage a second (or third or fourth) set of standards for your IaaS provider(s).

Comparing some pretty standard IT infrastructure objects against some AWS objects:

In-house technology → AWS technology:

  • VM template → AMI
  • Load balancer → ELB
  • Firewall policy → Security Groups
At best they are only packaged differently (VMs vs AMIs) and their guts are pretty similar. At worst, they have different capabilities, configurations, and therefore standards and expertise (load balancers vs ELB).

If you buy the hybrid cloud direction we’re heading in – according to Gartner and your own observations – then…

It’s two – or more – standards trying to be one, and that’s possibly double the work for IT.

Another argument for Cloud Management Platforms such as RightScale, Scalr and Kumolus? Thoughts?

What is a Cloud Management platform?

The Cloud keeps diversifying. New offerings come on the market each month. IaaS vendors innovate and create new services to help their customers (and if it also locks them in that’s okay too).

That said there is a core set of capabilities everyone looks for in an IaaS cloud. You may want to build an Ubuntu image with a particular number of vCPUs and memory. You don’t really want to concern yourself with the cloud server types that a vendor like Rackspace or AWS provides.

You don’t want to specialise in understanding the API of every cloud vendor when configuring your continuous deployment tools either.

You just want a server, operating system and some software that you can move between clouds. You’d love it if cloud providers agreed on some standards. If only.

Cloud management platforms – like RightScale (Ed: corrected) or Scalr – manage this disconnect.

Most cloud management solutions are aligning their security models, virtual networking and API strategy with AWS. They aim to provide their customers with the ability to migrate workloads between owned and non-owned clouds. Both open-source and proprietary options exist.

Cloud management platforms represent both an immediate necessity for managing complexity and an opportunity to start building sophisticated IT operations management platforms that allow better planning for the IT function – an ERP for IT, if you like.

The following diagram shows where Cloud Management platforms reside in the Cloud stack (as distinct from Cloud platforms).

Cloud Computing ecosystem
It’s important to understand the difference between a Cloud Management platform (managing the lack of standards between clouds) and a Cloud platform (turning your hypervisor and other shared resources into a private cloud).

For your cloud management platform to manage your private cloud, it needs to integrate with your internal cloud platform. Most organisations have created their private cloud by taking their existing investment in virtualisation and adding a cloud platform tier.

The expected features of a Cloud Management platform are:

  • Public/private cloud integration
  • Consistent management of compute, storage and networking
  • Metering – the ability to track usage and apportion costs
  • Templates – to support quick provisioning
  • APIs – to enable integration with other business systems such as billing and CRM, and with configuration management platforms such as Chef and Puppet to describe infrastructure in code
  • Role-based access – d’uh.

There is a lot of confusion about terminology. This is partly because of the rapidly changing market and also because vendors are developing solutions that overlap different areas. Maybe they are trying to confuse us on purpose! Terms such as Cloud Brokers and Cloud Infrastructure Management are bandied about.

What is your Cloud Management strategy? Do you need one? Or are you picking a single vendor to keep it all simple?

2 more differences between in-house and AWS

In my previous post I talked about three differences between in-house and AWS deployments and specifically how it affects your architectural choices. Today I’ll talk about two more.

Typical AWS Stack

Caching

This whitepaper by AWS illustrates their design philosophy and is well worth a read – especially from page 11. The two concepts of loose coupling and massive parallelisation strike at a core difference between traditional and cloud computing architectures.

If you’ve worked with large traditional applications you’ll be aware that it is common for the database and storage tiers to be a performance bottleneck. Databases like Oracle are monolithic beasts that are hard to scale without spending a lot of coin and difficult to replace once they’re in place.

Whenever you see an AWS architecture there’s usually a database-boosting caching tier. Cloud workloads are typically more elastic and read-intensive (ie. web apps), so cloud architectures must handle bursty, read-hungry workloads.

The AWS caching product is ElastiCache, which comes in different instance sizes just like EC2. ElastiCache is based on either the open-source object cache memcached or the NoSQL database Redis (your choice).

The first time a database object is accessed it is read from the database and put in the cache. Every subsequent read is then served directly from the cache until the time-to-live (TTL) on the object is reached or the cache is restarted. Typically you set up multiple instances in different availability zones for high availability. You must make sure that the TTL is set on database objects so that they get cycled regularly in the cache.
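The read-through, TTL-driven behaviour described above fits in a few lines. A toy stand-in for memcached/ElastiCache (the real thing is a network daemon, but the caching logic is the same):

```python
import time

class TTLCache:
    """Toy read-through cache: a stand-in for memcached/ElastiCache."""
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expiry_time)

    def get(self, key, load_from_db):
        entry = self._store.get(key)
        now = self.clock()
        if entry and now < entry[1]:
            return entry[0]                       # cache hit
        value = load_from_db(key)                 # miss or expired: hit the DB
        self._store[key] = (value, now + self.ttl)
        return value

db_reads = []
def slow_db_lookup(key):
    db_reads.append(key)      # record every trip to the "database"
    return key.upper()

cache = TTLCache(ttl_seconds=60)
cache.get("user:1", slow_db_lookup)  # first read goes to the DB
cache.get("user:1", slow_db_lookup)  # second read is served from cache
print(db_reads)  # ['user:1'] -- only one DB read
```

Once the TTL expires, the next read falls through to the database again and refreshes the cached copy – which is exactly why the TTL must be set, or stale objects linger.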

In the diagram above there is a separate caching tier. Another option is to install memcached on the application instances. In this case each cache instance is stand-alone and dedicated to its own application server.

Another increasingly popular alternative to caching is to use exclusively NoSQL databases. NoSQL databases provide eventual consistency but can’t do complex database operations very well. They are easy to develop with.

Security Groups

In an in-house architecture the network is divided into a number of security zones, each with a trust rating: low trust, high trust, untrusted etc. Zone instances (subnets, VLANs, or whatever) are then separated by firewalls (see below).

AWS replaces firewalling with the concept of security groups. I haven’t been able to find any information about how AWS actually implement security groups under the hood and that’s one problem with them. You have to blindly trust their implementation. Assuming AWS security groups will have vulnerabilities from time to time, you need extra protection. There’s also little in the way of auditing or logging and a lot of rules and constraints about security group usage too.

For business critical applications, where data and service protection are significant issues, extra security technologies to consider are: host-based firewalls (eg. iptables) and intrusion detection, web application firewall SaaS (eg. Imperva), data encryption technologies, using a Virtual Private Cloud, two factor authentication and vulnerability scanning amongst other things.

To get around this problem one pattern that emerges is that of keeping critical data and functions in a secure location, and sharing a common key with the application running in the cloud. For example, to be PCI-DSS compliant, many organisations hand off their credit card processing to a payment gateway provider so they never have to handle credit card data. The gateway passes back a token and transaction details to the application. The application never touches the sensitive data.
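A sketch of the token hand-off (the gateway here is a fake stand-in for an external, PCI-compliant provider; in reality the customer’s browser posts card details to the gateway directly):

```python
import uuid

class FakePaymentGateway:
    """Stand-in for an external, PCI-compliant payment provider.
    Card data lives only inside the gateway's vault."""
    def __init__(self):
        self._vault = {}

    def submit_card(self, card_number, amount):
        # The customer's browser posts card details here directly,
        # bypassing the merchant's application entirely.
        token = str(uuid.uuid4())
        self._vault[token] = (card_number, amount)
        return token

class MerchantApp:
    def __init__(self):
        self.orders = []

    def complete_order(self, token, amount):
        # Only the opaque token and transaction details reach us.
        self.orders.append({"token": token, "amount": amount})

gateway = FakePaymentGateway()
app = MerchantApp()

token = gateway.submit_card("4111111111111111", 99.95)  # browser -> gateway
app.complete_order(token, 99.95)                        # gateway -> app

# The card number never appears anywhere in the application's records.
print("4111111111111111" in str(app.orders))
```

The sensitive secret stays in one well-defended place, and everything running in the cloud handles only tokens.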

Security groups simplify your set-up though and are great for prototyping. You don’t need a network or security expert to get started. One of the reasons you went to the cloud was probably because you didn’t want to touch the network after all!


The differences I’ve chosen to focus on in this two-part blog are: load-balancing, storage, databases, caching and security groups/firewalls. The reasons I’ve chosen specifically these is because the implementation of, and philosophies behind, each drives a different overall design approach. To build your own hybrid clouds these differences will have to be reconciled.

3 differences between in-house and AWS

To support Gartner’s latest hybrid cloud predictions and become a broker of cloud services, IT will need to minimise the architectural differences between on-premise and cloud infrastructure and use standard cloud building blocks.

On-premises IT and AWS are different though. Traditional enterprise applications have evolved on open, interoperable, heterogeneous platforms, while Amazon evolved out of the requirements of an on-line retailer who needed serious scale.

Here is a typical cloud-based application pattern in AWS. This is a single-site deployment with a data-only DR site. Compare it to the typical enterprise stack. There are some differences.

Load balancing

Traditional load balancers are customisable, tuneable, and built on high performance hardware. Load balancers by the likes of Citrix, F5 and Radware can perform a wide range of functions (eg. load balancing, fire-walling, vulnerability protection, address rewriting, VPN, VDI gateway).

Load balancing in AWS is a bit different to what you’re used to. Amazon’s native offering, the Elastic Load Balancer (ELB) is presented as a service to customers and the hardware is completely abstracted. AWS ELB can load balance across Availability Zones. This is geo-load balancing, a feature you have to pay extra for with on-premise load balancers. It also natively supports auto-scaling of EC2 instances, of course.

AWS ELB has some shortcomings though. You cannot create a static IP for load-balanced loads, log HTTP(S) traffic, drain hosts or configure different load balancing algorithms, all of which are standard functionality in an on-premise load balancer.

An alternative load balancing option in AWS is open source software like HAProxy, or spinning up the equivalent of F5 etc. inside an AMI. The benefit of these approaches is that they more closely represent your internal set-up, making hybrid load configuration easier. The downside is that they can be more expensive and more effort to set up. These alternatives are shown as greyed out above.


Storage

AWS storage has “unlimited” scale so businesses don’t need to worry about outlaying for big storage arrays every year.

For application and database storage in AWS, EBS Storage is really your only choice. AWS EBS IO performance can be poor and unpredictable compared to on-premises storage options, though. For better performance you need to pay extra for provisioned IOPS. You can also create your own RAID volumes from EBS block devices but this can break EBS snapshots and lead to higher costs.

The most requested EBS feature is the ability to mount an EBS disk to more than one EC2 instance. On-premises infrastructure has always had the option of using shared storage as a way of creating clusters, transferring data and sharing configuration. This EBS constraint seems like a deliberate choice to force you to create EC2 instances that stand alone with no inter-dependencies, as a way of mandating/supporting the auto-scaling philosophy.


Databases

A highly available database hosted in AWS looks quite different to its in-house equivalent. One reason is that workloads in AWS are often web applications with a high read requirement. Databases in AWS are also built on EBS storage, with its constraints on shared storage, and AWS has been strongly influenced by MySQL and its replication methodology. Here is a basic architecture for a highly available on-premise and AWS database:

AWS v Onprem DB

An AWS database can have multiple slave databases, which can also be used for read operations to improve performance. Replication and the management of failover between nodes are scripted or manual. The replication occurs within the database tier, so it may be slower. Transaction logs maintain database consistency.

In a typical in-house database there is expensive and complicated clustering software that provides well integrated availability. The data replication occurs within the storage tier, which should be faster and leverage existing storage assets. There is a single floating DNS/IP Address for the entire DB tier, which simplifies application set-up. There is no opportunity to get extra read performance from failover/standby servers though.
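The read-scaling half of the AWS pattern can be sketched as a simple query router: writes go to the master, reads round-robin across the slaves (toy code with string stand-ins for real database connections):

```python
import itertools

class ReadWriteRouter:
    """Toy router for a master/slave replication setup: writes hit the
    master, reads are spread round-robin over the read replicas."""
    def __init__(self, master, slaves):
        self.master = master
        self._read_pool = itertools.cycle(slaves or [master])

    def route(self, sql):
        # Crude write detection by leading SQL keyword.
        verb = sql.lstrip().split(None, 1)[0].upper()
        is_write = verb in {"INSERT", "UPDATE", "DELETE",
                            "CREATE", "DROP", "ALTER"}
        return self.master if is_write else next(self._read_pool)

router = ReadWriteRouter("master", ["replica-1", "replica-2"])
targets = [router.route(q) for q in
           ["SELECT * FROM t", "INSERT INTO t VALUES (1)", "SELECT 1"]]
print(targets)  # ['replica-1', 'master', 'replica-2']
```

The in-house cluster hides all of this behind one floating DNS name; in the AWS pattern the application (or a routing layer like this) has to know about the replicas.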


There are other differences between AWS and in-house that I’ll cover in a follow-up blog. I’d be interested to know what you like or dislike about the different approaches to infrastructure and how it has affected your planning.

Busting them freakin’ IT silos

This is a rough approximation of most IT Infrastructure organisations I’ve worked in – minus management of course!

Cloud Mgt Org - before - small-1

As you can see it’s very silo-ed. The only reason it functions at all is because IT people are highly developed socially, able to communicate without using jargon, and can comprehend the bigger business issues without being too precious about technology. Pfffft… Bollocks! The only reason it functions at all is because of large cross-silo teams of project managers, infrastructure architects and service managers that run around like Mel Gibson in Gallipoli trying to get the message around about what needs to be done.

A typical workflow for an IT project might be something like this (The cells with green indicate cat-herding, high-cost activities):
Project Engagement

If you’re lucky, some activities may have been automated and orchestrated (shown above as green/orange), which expedites and standardises projects.

This process is implemented in a pretty predictable and fixed way. And for good reason. “Management by Projects” has been a pretty consistent way of running IT even if weekly time-sheets are a menial fiction we all loathe. It’s only the mega-budget, long-term projects that have a special way of going awry.

The organisation in the first diagram won’t scale in the era of cloud computing though. It’s too cumbersome. Everyone loves cloud computing because of the low transaction costs (initial build cost and time-to-market). Instant gratification wins most of the time even if it costs more over the long term.

An organisation structure that can support this instant gratification will be one where the silos have shrunk. I’ve attempted to show this in the updated organisation below:

Cloud Mgt Org - after - small-1

IT Generalists and specialists with a propensity and willingness to cross-skill are moved into new Cloud Infrastructure and Service Operations teams. Over time these teams grow as your “traditional” silos atrophy.

You’ll still need experts in silos, whether in-house or out, to look after physical assets and organisational-wide compute, storage and network domains but the new cloud groups will be responsible for delivering IT and business projects.

Your silo-ed database specialists will have experience with big database technologies that aren’t that flexible and your storage and networking specialists will be experienced with enterprise-grade technologies that don’t necessarily have a cloud equivalency.

In many cases the perception will be that cloud equivalents for databases, networking and storage are unsophisticated and lacking in features. That perception is accurate, but it’s changing.

The new organisation should drive efficiency, consolidation and consistency making internal IT environments (internal cloud, hybrid cloud etc.) more competitive with external cloud platforms.

This new world requires upfront investment and a faith that if you build it, they will come. But they – the business – will also need some encouragement. And the best way of doing this is to:

  • Limit their platform options – Stop putting things on legacy platforms. Use your cloud options.
  • Let them “play” on the new system – Users can spin up their own sandpit environments with a light-touch approval process (have them auto-expire… the sandpit environments, not the users!)
  • Encourage development teams – Talk about the concepts inherent in cloud platform development.
  • Develop a cloud business architecture with agreed terminology and artefacts.

It’s not exactly bustin’ them freakin’ silos but it’s a start. Now we only need to deal with the organisational politics to make this all happen….

IT will go on forever

The economist William Stanley Jevons made an observation about coal in his 1865 book, “The Coal Question”. It became known as the Jevons paradox:

      increases in energy production efficiency lead to more, not less, consumption

In Jevons’s time the big worry was that England would run out of coal. It was hoped that more efficient coal use (eg. better steam engines) would lead to lower consumption and therefore England’s coal reserves would last a lot longer.

“write down my name …” (josef.stuefer via flickr)

Economics is very often counter-intuitive. Jevons argued the opposite would happen. If a doubling of fuel efficiency more than doubled the work demanded, then overall coal use would increase. If improved technology had a lesser effect, coal use would decrease, as expected.
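A quick worked example makes the arithmetic plain (numbers invented for illustration):

```python
def coal_use(work_demanded, efficiency):
    # Coal burned = work demanded / work extracted per ton of coal.
    return work_demanded / efficiency

baseline = coal_use(work_demanded=100, efficiency=1.0)  # 100 tons

# Efficiency doubles. If demand for work *more* than doubles...
rebound = coal_use(work_demanded=250, efficiency=2.0)   # 125 tons: MORE coal

# ...but if demand grows by less than the efficiency gain...
muted = coal_use(work_demanded=150, efficiency=2.0)     # 75 tons: less coal

print(rebound > baseline, muted < baseline)  # True True
```

Which of the two outcomes you get depends entirely on how elastic the demand for work is, and Jevons bet it was very elastic indeed.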

What does this mean for IT? Let’s consider the raw input costs of IT projects: Infrastructure, Software & People.


Infrastructure

… is of course getting cheaper and cheaper, while at the same time getting more powerful. Moore’s Law still holds and is driving greater processing power at lower cost. There are grumblings about the law’s demise – that chips are getting more expensive to develop – but big brains such as Andy Bechtolsheim assert that the law is still alive and well.

Networking speeds double every two years or so (Nielsen’s Law) and the value of any network increases with more connected devices (Metcalfe’s Law). Connecting to decent bandwidth is cheap but highly valuable and necessary. With the rise of mobile networking the same trends are occurring but now the network is available anywhere.

Manufacturing costs have plummeted. One just has to look at the cost of a Raspberry Pi. Better manufacturing automation, low margins and stiff competition are driving continual investment in lower production costs.

Then of course you have cloud computing, which is consolidating data centres into mega data centres and banking huge economies of scale. Did you know that every dollar of revenue to Amazon Web Services results in three or four lost dollars to established vendors? This is showing up in the results of IBM and Oracle.


Software

Cloud computing extends up into the software realm too, and this impacts those previously mentioned tier-1 vendors. COTS software is now SaaS.

Agile development is reducing the risk of development projects and this lowers cost further. A minimum viable product can be produced to prototype new platforms at very low cost.

Open Source has been underwriting software cost reductions for almost 20 years.

The age of trolling software rent seekers may be coming to an end. We live in hope.


People

Massive outsourcing and off-shoring has had some impact on stagnating wages in IT, thereby limiting – or perhaps burying – costs. (Phew, got that distasteful line out of the way.) Countering this, there is downward rigidity in wage costs or, as normal non-economists say, wages don’t go down much. The people cost can be one of the most expensive parts of any project.

That said, collectively people have become much more efficient. IT departments are more efficient through consolidation of data centres, standardisation, automation, orchestration, Green IT and commoditisation of particular platforms (eg. VOIP, Email, OS). IT operating budgets shrink or stagnate, but more is done.

Where IT departments are not efficient and have high transaction costs (eg. Deploying a DB costs $100,000 and takes 4 weeks) the department is being circumvented and a cloud solution deployed, even if the overall cost is higher over time.

The outcome

Overall costs in IT continue to drop for all the above reasons. In line with Jevons and England’s coal problem though, IT doesn’t bank the savings and use less computing resources. The opposite happens. IT consumes more. Why is that?

Projects we’d never have done in the past because they were too expensive, or too difficult to tie to economic transactions, now become viable (eg. engagement systems such as social marketing). Companies can pursue a wider range of projects. Smaller companies can develop capabilities previously only available to large organisations. And these new capabilities, once pursued and attained in a market, become a cost of doing business. They can also support the creation of further advanced capabilities. The extra efficiency generates more demand for IT “work”, not less.

Some costs and problems across the enterprise get worse, especially those that have to deal with the complexity of the entire ecosystem. For example: performance management, change transformation, security, SOA, enterprise architecture, orchestration, cloud architecture, networking, data stewardship, power and cooling, and application testing. These costs and problems are systemic though. They don’t stop new projects. In fact, over time they create their own requirements and therefore their own projects.

Where will this all end? Back to Jevons again: in his day it was expected to end with the exhaustion of coal. What resource will be exhausted first in IT? Electricity? Software skills? Our ability to manage complexity? Processing power? Physical data centre costs and space? I can’t see any end in sight yet folks… feed the beast!

11 problems with AWS – the lookup table

I thought it’d be worth summarising the whole blog series in one easy Google-able post, so here is a summary of the 11-article rant I went on.

(Image: Access to Cloud / Ladder to Heaven, FutUndBeidl via Flickr)

There is no particular order to this. It was a brain dump of all the obstacles I could see if an organisation decided to aggressively pursue an AWS-first strategy. Many of them apply across the whole cloud spectrum (is that a mixed metaphor?). I will get around to developing a road-map document to address these issues. If you're interested in a free draft of it, send me a message through your preferred social network in the right sidebar.

[table th=”1″]
Blog title,Issue,Description
11 problems with AWS – part 1,Data Sovereignty,Data in the cloud crosses many legal boundaries. Whose laws must you follow?
11 problems with AWS – part 2,Latency,You cannot speed up those little electrons. Those inter- and trans-continental traversals add up!
11 problems with AWS – part 3,Vendor Lock-in,All those value-adds hook you in like heroin.
11 problems with AWS – part 4,Cost Management,Keep throwing resources at it man! The monthly bill won't look that bad!
11 problems with AWS – part 5,Web Services,Distributed? This app is all over the place like a mad woman's custard!
11 problems with AWS – part 6,Developer Skill,How good are our developers? Do they get auto-scaling? Caching? Load balancing?
11 problems with AWS – part 7,Security,Where did you put my deepest secrets?
11 problems with AWS – part 8,Software Licensing,Am I allowed to run this in AWS? Or: why IBM is tanking.
11 problems with AWS – part 9,SLAs,99.95% is not 99.999%
11 problems with AWS – part 10,Cloud Skills,Pfft! Cloud skills! We did this 30 years ago on the mainframe!
11 problems with AWS – part 11,Other Financial Things,Talk to your accountant about tax and insurance… urgh!
[/table]


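On that SLA row: the gap between "four nines-ish" and "five nines" is bigger than it looks. Here's a quick back-of-envelope sketch (plain arithmetic, nothing AWS-specific) of the maximum annual downtime each availability figure permits:

```python
# Back-of-envelope: maximum downtime per year allowed by an availability SLA.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (ignoring leap years)

def max_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted at a given availability %."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for sla in (99.95, 99.99, 99.999):
    print(f"{sla}% -> {max_downtime_minutes(sla):.1f} minutes/year")
# 99.95% allows ~263 minutes (over four hours) a year; 99.999% barely five.
```

So "99.95% is not 99.999%" is an understatement: it's roughly fifty times the allowable downtime.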
Enjoy, share, and let me know what I've missed.

Software is eating infrastructure

Infrastructure people increasingly deal in software, not hardware. Software is eating the world.

Servers are "guests". Orchestration is no longer a nice-to-have but a requirement. Cloud management software. Application and log monitoring tools. Even storage vendors spruik their cloud values more than their hardware "creds" these days.

In the past, applications would run on one big old server in the corner. Every night someone would change a backup tape. Occasionally someone would walk up to the server and power-cycle it. Over time servers got cheaper and smaller, data centres consolidated and grew, and servers got remote management cards, then became virtualised. Infrastructure guys grew more and more distant from the hardware. Then the basic operations jobs got outsourced. It's no wonder we need software for everything. As storage and networks get commoditised, like servers were before them, the consumption of the profession by software will be complete.

The past was "big tin" with leading-edge hardware and unmatched reliability and power inside the box. The future is tiny disposable units of compute, storage and network that move across an ethereal fabric. These units have a life cycle potentially measured in minutes. The big ol' dinosaurs will have been replaced by the most elemental of hardware life forms. The management of this ever-growing and sprawling environment will be performed by increasing layers of software. There is no other way.
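To make "disposable units managed by layers of software" concrete, here's a minimal illustrative sketch in the style of a modern orchestrator's reconciliation loop (all names hypothetical, no real cloud API): the desired state lives in software, and short-lived units get created or destroyed until reality matches intent.

```python
import itertools

# Toy reconciliation loop, illustration only. Real orchestrators
# (and their APIs) are far richer; the point is the pattern.
_ids = itertools.count(1)  # stand-in for provisioning new units

def reconcile(running: set, desired_count: int) -> set:
    """Create or destroy disposable units until reality matches intent."""
    running = set(running)
    while len(running) < desired_count:   # scale out: spin up new units
        running.add(next(_ids))
    while len(running) > desired_count:   # scale in: units are cattle, not pets
        running.pop()
    return running

fleet = reconcile(set(), 3)   # demand spikes: grow to three units
fleet = reconcile(fleet, 1)   # demand drops: tear two down without ceremony
```

No one walks up to a box and power-cycles it here; the "server" is just an entry in a set that software adds and removes.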


An organisation could be thought of as a series of tea strainers, with infrastructure as the teacup.

Every issue unresolved or missed at the higher levels drips down to the next strainer, and there's always unexpected tea that gets through to the teacup. CPU and memory get thrown at a poorly sized application. Storage specialists fix your database problems. And "the network" fixes everything else.

Even with cheap hardware and ever increasing software layers, tea is still going to keep dripping through all those strainers (especially now that we’re dealing with software!).

The ongoing drama of aligning:

  • A company's broader culture and proficiency
  • A company's actual needs
  • The always shifting technology landscape

is timeless.

The personality and skills required are those of people who have lived the “pain” and happily catch all the missed bits in their “teacup”. Infrastructure thinking has been crystallised by late night call-outs and unreasonable demands by those less technically savvy. That’s why we’re quite pessimistic and “failure-focused” compared to everyone else. (I do personally build loads of redundancy into my life!)

Infrastructure will always be there, hidden in these layers of software, process, methods and patterns that keep a company's core business going. It is just going to get a lot messier.

You see, IT infrastructure isn't really about server "specs" and stuff. Sure, that's part of it, but mainly it's about keeping the business going, no matter what the technology looks like.
