The Lifetime of a CMS Installation

(cross-posting from here)

CMS analyst Janus Boye has blogged about the expected lifetime of a CMS installation, i.e. for how long an installed CMS can be expected to be in production. His guess is a lifetime of 3 years. In the blog's comments Janus and I got into a discussion about the accuracy of that guess, in which he asked Day to publish actual data on this topic.

I like this idea because publishing this data provides a benefit to our potential new customers: a reliable indicator (without any hand-waving or gut feelings) of the CMS's lifetime that can be used in business plans.

The Data

The data I have used is taken from Day's support contracts. Only customer data from outside of Europe was used (simply because it was available to me). This selection is likely to bias the results towards shorter lifetimes, as Day's oldest customers are based in Europe. The basic assumption is that the lifetime of the CMS is equivalent to the duration of the support contract. The end point used for each contract period is the date up to which the contract is paid for as of today.


You might argue that there could be customers that have a contract but do not actually use the product anymore, which could in fact be the case (I do not know of any). On the other hand, I am aware of customers that still use the product and have terminated their support contract. Therefore, in order to reduce selection bias I did not remove any data points due to this particular consideration.


Each customer was counted once for each product purchased, i.e. a customer that has two distinct support contracts for CRX and CQ was sampled twice. I discarded all OEM contracts because of their different nature (they would skew the result towards longer lifetimes). Finally, I also dropped a data point where the support contract was cancelled because the customer went out of business altogether.
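The selection rules above can be sketched in a few lines of Python. This is only an illustration: the `Contract` record, its field names and the sample rows are made up for the example and are not Day's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    # Hypothetical record layout, invented for this sketch.
    customer: str
    product: str            # e.g. "CQ" or "CRX"
    region: str
    is_oem: bool
    out_of_business: bool   # cancelled because the customer ceased to exist


def select_sample(contracts):
    """Keep one data point per (customer, product) contract that passes the rules."""
    sample = []
    for c in contracts:
        if c.region == "Europe":   # only non-European data was available
            continue
        if c.is_oem:               # OEM contracts are of a different nature
            continue
        if c.out_of_business:      # dropped: says nothing about the CMS itself
            continue
        sample.append(c)
    return sample


# Made-up rows, for illustration only.
contracts = [
    Contract("Acme", "CQ", "North America", False, False),
    Contract("Acme", "CRX", "North America", False, False),  # same customer, second product
    Contract("Globex", "CQ", "Europe", False, False),
    Contract("Initech", "CRX", "Asia", True, False),
    Contract("Hooli", "CQ", "North America", False, True),
]
sample = select_sample(contracts)
print([(c.customer, c.product) for c in sample])
```

Note that the customer with two products contributes two data points, as described above.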


I believe that this data set is sufficiently unbiased to provide meaningful results with respect to the question of the lifetime of a customer's CQ/CRX installation.

The Method

Luckily for Day, the data is what is called "right censored". That means that it is unknown for how long an existing support contract will go on - actually the majority of the available data points are right censored.
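To make "right censored" concrete, here is a minimal sketch of how a contract could be turned into an observation for the analysis. The function name, its arguments and the dates are assumptions for illustration; the only rule taken from the text is that the end point is the date up to which the contract is paid.

```python
from datetime import date


def to_observation(start, paid_until, terminated):
    """Return (duration_in_days, event_observed) for one support contract.

    event_observed is True if the contract has ended, and False if it is
    still running, i.e. the observation is right censored: all we know is
    that the true lifetime is at least duration_in_days.
    """
    duration = (paid_until - start).days
    return duration, terminated


# Made-up contracts, for illustration only.
ended = to_observation(date(2005, 1, 1), date(2008, 1, 1), terminated=True)
running = to_observation(date(2004, 6, 1), date(2010, 6, 1), terminated=False)
print(ended, running)
```

The second observation does not say that the installation lived for that many days, only that it has lived at least that long so far.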

The scientific discipline that is concerned with analyzing data of this kind is called "survival analysis". One is interested in the survival function S(t), which gives the probability that the event of interest (here: the end of a support contract) has not yet occurred by time t. The survival function is a property of a random variable, i.e. it needs to be estimated (in the statistical sense of the word).


One well-known estimator for the survival function is the Kaplan-Meier estimator (which is non-parametric, i.e. there are no underlying assumptions about the distribution of the data). In a nutshell:


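The Kaplan-Meier estimate of the survival function, S_hat(t), corresponds to the non-parametric MLE estimate of S(t). The resulting estimate is a step function that has jumps at observed event times, t_i. In general, it is assumed the t_i are ordered: 0 < t_1 < t_2 < ... < ∞. If the number of individuals with an observed event time t_i is d_i, and the number of individuals at risk (ie, who have not experienced the event) at a time before t_i is Y_i, then the Kaplan-Meier estimate of the survival function and its estimated variance is given by:

$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{Y_i}\right), \qquad \widehat{\operatorname{Var}}\big[\hat{S}(t)\big] = \hat{S}(t)^2 \sum_{t_i \le t} \frac{d_i}{Y_i\,(Y_i - d_i)}$$

with \hat{S}(t) = 1 for t < t_1 (the variance expression is Greenwood's formula).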
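For readers who want to see the mechanics, here is a small pure-Python sketch of the estimator described in the quote: at each event time the running estimate is multiplied by (1 - d_i / Y_i), and the variance is estimated with Greenwood's formula. It is not the code used for this post, and the toy data is made up.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t_i, S_hat(t_i), var_hat(t_i))] for each distinct event time t_i.

    durations: observed times; observed: True for an event, False if censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    leaving = Counter(durations)      # events and censorings per time point
    at_risk = len(durations)          # Y_i: still under observation just before t_i
    s_hat = 1.0
    greenwood_sum = 0.0
    steps = []
    for t in sorted(leaving):
        d = events.get(t, 0)          # d_i: events at time t_i
        if d > 0:
            s_hat *= 1.0 - d / at_risk
            if at_risk > d:
                greenwood_sum += d / (at_risk * (at_risk - d))
                var = s_hat ** 2 * greenwood_sum
            else:
                var = float("nan")    # Greenwood's term is undefined once S_hat hits 0
            steps.append((t, s_hat, var))
        at_risk -= leaving[t]         # censored cases leave the risk set after t
    return steps


# Toy data: five contracts, two of them right censored.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
steps = kaplan_meier(durations, observed)
for t, s, var in steps:
    print(t, round(s, 3), round(var, 4))
```

Censored observations never produce a jump; they only shrink the number at risk for later event times.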

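The quantity of interest is the mean survival time (and its respective estimate) which is given by:

$$\mu = \int_0^\infty S(t)\,dt, \qquad \hat{\mu}_\tau = \int_0^\tau \hat{S}(t)\,dt$$

Because the estimate of S(t) may not reach zero (this happens when the longest observation is censored), the integral of the estimate may diverge. Therefore the integral is only taken up to a finite number τ. A reasonable choice of τ is the largest observed or censored time.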
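Since the estimate is a step function, the truncated integral is just a sum of rectangle areas. A minimal sketch follows, assuming the estimate is available as a sorted list of (time, value) pairs; the function name and the numbers are made up for illustration.

```python
def restricted_mean(steps, tau):
    """Integrate a survival step function from 0 up to tau.

    steps: [(t_i, S_hat(t_i))] sorted by time; S_hat is 1 before the first t_i
    and stays at its last value until tau.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in steps:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)   # rectangle up to the next jump
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)     # last rectangle, cut off at tau
    return area


# Made-up step function that never reaches zero; tau is the largest observed time.
toy_steps = [(2, 0.8), (3, 0.6), (5, 0.3)]
mean_estimate = restricted_mean(toy_steps, tau=8)
print(mean_estimate)
```

Without the cut-off at tau the last rectangle would extend to infinity, which is exactly the divergence mentioned above.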

Results

Resisting a geek's urge to implement the estimator myself, I used the freely available R to calculate the results. Here is a plot of the Kaplan-Meier estimate for the survival function with 95% confidence bounds (time is in days):

And finally, the estimated value for the mean survival time, i.e. the estimated lifetime of a Day CMS installation, is 2453 days with a standard deviation of 154 days. That's about 6.7 years. Mind you, this result is likely to be lower than if the whole customer base had been analyzed.
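The conversion to years is simple arithmetic. In the snippet below the only inputs are the two figures quoted in this post (2453 days and the 3-year guess); the 365.25 days per year is my assumption for the conversion.

```python
DAYS_PER_YEAR = 365.25   # assumption: average year length including leap years

mean_days = 2453         # estimated mean lifetime from this post
guess_years = 3          # Janus Boye's guess

mean_years = mean_days / DAYS_PER_YEAR
print(round(mean_years, 1))                  # lifetime in years
print(round(mean_years / guess_years, 1))    # ratio to the 3-year guess
```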
