
The Lifetime of a CMS Installation

(cross-posting from here)

CMS analyst Janus Boye has blogged about the expected lifetime of a CMS installation, i.e. how long an installed CMS can be expected to stay in production. His guess is a lifetime of 3 years. In the comments on that post, Janus and I got into a discussion about the accuracy of that guess, and he asked Day to publish real data on the topic.

I like this idea because publishing this data benefits our potential new customers: it gives them a reliable indicator (without any hand-waving or gut feelings) of the CMS's lifetime that can be used in business planning.

The data

The data I used is taken from Day's support contracts. Only customer data from outside of Europe was used (simply because it was available to me). This selection is likely to bias the results towards shorter lifetimes, as Day's oldest customers are based in Europe. The basic assumption is that the lifetime of the CMS is equivalent to the duration of the support contract. The end point used for each contract period is the date up to which the contract is paid for as of today.


You might argue that there could be customers that have a contract but do not actually use the product anymore, which could in fact be the case (I do not know of any). On the other hand, I am aware of customers that still use the product and have terminated their support contract. Therefore, in order to reduce selection bias I did not remove any data points due to this particular consideration.


Each customer was counted once for each product purchased, i.e. a customer that has two distinct support contracts for CRX and CQ was sampled twice. I discarded all OEM contracts because of their different nature (they would skew the result towards longer lifetimes). Finally, I also dropped a data point where the support contract was cancelled because the customer went out of business altogether.


I believe that this data set is sufficiently unbiased to provide meaningful results with respect to the question of the lifetime of a customer's CQ/CRX installation.

The Method

Luckily for Day, the data is what is called "right censored". That means that it is unknown how long an existing support contract will go on. In fact, the majority of the available data points are right censored.
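To make the notion of right censoring concrete, here is a minimal Python sketch of how support contracts could be encoded for survival analysis. The contract records, their field layout, and the function name `to_observation` are made up for illustration; they are not Day's data or the code used for this post. Each contract becomes a pair of a duration in days (from contract start to the paid-up-to date) and a flag saying whether the end was actually observed.

```python
from datetime import date

# Hypothetical contract records (NOT real data):
# (contract start, date the contract is paid up to, contract terminated?)
contracts = [
    (date(2001, 3, 1), date(2004, 2, 29), True),
    (date(2003, 6, 15), date(2009, 6, 14), False),
    (date(2005, 1, 10), date(2009, 1, 9), False),
]


def to_observation(start, paid_until, terminated):
    """Turn one contract into a (duration_in_days, event_observed) pair.

    event_observed is True when the contract has ended, and False when it
    is still running, i.e. the observation is right censored: the true
    lifetime is at least duration_in_days, but we do not know how much longer.
    """
    duration_days = (paid_until - start).days
    return duration_days, terminated


observations = [to_observation(*contract) for contract in contracts]

# Share of observations for which the end of the lifetime is not yet known.
censored_share = sum(1 for _, event in observations if not event) / len(observations)
print(observations, censored_share)
```

Simply dropping the censored pairs, or treating them as if they had ended, would bias the estimated lifetime downwards, which is why an estimator that handles censoring is needed.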

The scientific discipline that is concerned with analyzing data of this kind is called "survival analysis". One is interested in the survival function S(t), which gives the probability that a subject (here: a CMS installation) survives beyond time t. The survival function is a property of a random variable, i.e. it needs to be estimated (in the statistical sense of the word).


One well-known estimator for the survival function is the Kaplan-Meier estimator (which is non-parametric, i.e. there are no underlying assumptions about the distribution of the data). In a nutshell:


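The Kaplan-Meier estimate of the survival function, $\hat{S}(t)$, corresponds to the non-parametric MLE estimate of $S(t)$. The resulting estimate is a step function that has jumps at observed event times, $t_i$. In general, it is assumed the $t_i$ are ordered: $0 < t_1 < t_2 < \dots < t_D$. If the number of events at $t_i$ is $d_i$, and the number of individuals at risk (i.e., who have not experienced the event) at a time before $t_i$ is $Y_i$, then the Kaplan-Meier estimate of the survival function and its estimated variance are given by:

$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{Y_i}\right), \qquad \widehat{\operatorname{Var}}\big[\hat{S}(t)\big] = \hat{S}(t)^2 \sum_{t_i \le t} \frac{d_i}{Y_i\,(Y_i - d_i)}$$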
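For readers who prefer code to formulas, the estimator can be sketched in a few lines of plain Python. This is an illustrative sketch only: the results in this post were computed with R, the function name `kaplan_meier` and the toy data are invented here, and the variance uses Greenwood's formula, which is the standard choice but an assumption as far as this post is concerned.

```python
from collections import Counter


def kaplan_meier(observations):
    """Product-limit (Kaplan-Meier) estimate with Greenwood variance.

    observations: iterable of (duration, event_observed) pairs, where
    event_observed is False for right-censored observations.
    Returns a list of (t_i, S_hat(t_i), var_hat(t_i)) tuples, one per
    distinct event time, in increasing order of t_i.
    """
    observations = list(observations)
    events = Counter(t for t, observed in observations if observed)  # d_i
    exits = Counter(t for t, _ in observations)  # events plus censorings
    at_risk = len(observations)  # Y_i just before the earliest time
    survival = 1.0
    greenwood_sum = 0.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d > 0:
            survival *= 1.0 - d / at_risk
            # The Greenwood term is undefined when everyone at risk has the
            # event (survival drops to 0); the variance is then 0 anyway.
            if at_risk > d:
                greenwood_sum += d / (at_risk * (at_risk - d))
            curve.append((t, survival, survival ** 2 * greenwood_sum))
        # Subjects censored at t still count as at risk at t itself.
        at_risk -= exits[t]
    return curve


# Toy data (made up): durations in days, False marks a censored observation.
toy = [(2, True), (3, False), (5, True), (5, True), (7, False), (8, True)]
for t, s, var in kaplan_meier(toy):
    print(t, round(s, 4), round(var, 4))
```

Note how censored observations never cause a jump in the curve; they only shrink the number at risk for later event times.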

The quantity of interest is the mean survival time $\mu$ (and its respective estimate $\hat{\mu}_\tau$), which is given by:

$$\mu = \int_0^\infty S(t)\,dt, \qquad \hat{\mu}_\tau = \int_0^\tau \hat{S}(t)\,dt$$

Because $\hat{S}(t)$ may not converge to zero, the estimate may diverge. Therefore the integral is only taken up to a finite number $\tau$. A reasonable choice of $\tau$ is the largest observed or censored time.
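Since the estimated survival function is a step function, the truncated integral is just a sum of rectangle areas. The following Python sketch shows that computation under the assumption that the curve is given as a sorted list of (event time, survival estimate) pairs; the function name `restricted_mean` and the example values are hypothetical, and this is not the R routine used for the results in this post.

```python
def restricted_mean(curve, tau):
    """Area under a right-continuous survival step function on [0, tau].

    curve: list of (t_i, S_hat(t_i)) pairs sorted by t_i. The estimate is
    1 before the first event time and stays at S_hat(t_i) until the next
    event time.
    tau: upper integration limit, e.g. the largest observed or censored time.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next jump
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # last (possibly partial) rectangle
    return area


# Made-up step function: jumps at t = 2, 5 and 8.
example_curve = [(2, 5 / 6), (5, 5 / 12), (8, 0.0)]
print(restricted_mean(example_curve, tau=8))
```

If the curve never reaches zero, the result grows with the choice of tau, which is why the truncation point should always be reported along with the estimate.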

Results

Resisting a geek's urge to implement the estimator myself, I used the freely available R to calculate the results. Here is a plot of the Kaplan-Meier estimate for the survival function with 95% confidence bounds (time is in days):

And finally, the estimated value for the mean survival time, i.e. the estimated lifetime of a Day CMS installation is: 2453 days with a standard deviation of 154 days. That's about 6.7 years. Mind you, this result is likely to be lower than if the whole customer base had been analyzed.
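The conversion to years, and a rough uncertainty range, can be checked with a few lines of Python. The sketch below makes two assumptions that are not stated in the post: a year is taken as 365.25 days, and the reported 154 days is treated as the standard error of the mean, so that a normal-approximation 95% interval is the mean plus or minus 1.96 standard errors.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

mean_days = 2453  # estimated mean survival time from the post
se_days = 154     # reported spread, treated here as a standard error (assumption)

mean_years = mean_days / DAYS_PER_YEAR

# Normal-approximation 95% interval: mean +/- 1.96 * standard error.
low_days = mean_days - 1.96 * se_days
high_days = mean_days + 1.96 * se_days

print(f"mean: {mean_years:.1f} years")
print(f"95% interval: {low_days / DAYS_PER_YEAR:.1f} to {high_days / DAYS_PER_YEAR:.1f} years")
```

Under these assumptions even the lower end of the range lies well above the 3-year guess (about 1096 days) that started the discussion.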
