Category: engineering

This one weird trick will improve your productivity and deliver social justice

Shorter me: economists should study business more, and in an ideal world, industrial sociology, before they try to do cognitive psychology

Peter Dorman at Econospeak takes issue with a Robert Frank piece about workplace safety, which has all the whoopee doo Econ-101 problems you’d expect. I think, though, that there is a really big issue Frank is wrong about that he shares with economists generally and that Dorman has missed. It is a problem of framing.

Frank (and economists generally) frame safety as something you buy (“safety devices”) not something you do (“a safe process of work”, as the UK Health & Safety at Work Act puts it). Safety is a product, not a process. The model in Frank’s head is that there’s a marginal cost of production curve, and you add an overhead cost of safety to it, shifting the curve up (aka an x-inefficiency).

In general, effective safety measures are usually something you do, and scattering costly “devices” around an unchanged process is a classic failure mode. Not least because they might instil a false sense of safety and lead people to take risks. Consider the Shower Jobby and his “cycle superhighways”, aka “some blue paint slapped on an urban motorway”. This video is a great visual illustration of the point. I had no idea it was so bad.

In this case, adding some “safety devices” to an unsafe process has not only failed to make it safer, it seems to have rendered it more dangerous because the participants – cyclists, drivers, and Transport for London – think it is safer.

Of course, economists do actually have a framework to analyse this point! And they’re usually very keen to expound it!

In the process view, though, it becomes clear that greater safety is not necessarily a cost.

Accidents cost money, in the same way that quality failures cost money. At the very least, in the most cynical 19th century Yorkshire mill-owner’s view, they cause downtime, quality problems, and damage to expensive equipment. In a less cynical and more general sense, accidents are just one of the sources of excessive variability in the production process, like late change requests, tools whose tolerances are too large, or a virus outbreak among the Windows boxen. If accidents are happening, this is a symptom of problems with the process.

Reworking production processes to eliminate the sources of variability is precisely what industrial managers are meant to do all day.

For some reason, if you do this to reduce rework or machining waste, that’s awesome, but if you do it to reduce accidents, that’s a cost imposed by stupidheads – even if you do it with only cynicism in your heart, in order to eliminate the downtime and expense of hosing the body parts out of the conveyor belt. You see the power of framing.

Correctly considered, accidents are another source of unwanted process variability and therefore anything that reduces them is an opportunity for improvement. The model in your head now should be one with two marginal cost curves of different gradients, one where accidents are happening and one where they aren’t.
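The two-curve model can be sketched in a few lines. This is a toy illustration only – every number in it is invented – but it shows the point: accidents act like a rework term that steepens the marginal cost curve, so reducing them is a process improvement, not an overhead.

```python
# Toy model: two marginal cost curves, one with accidents and one without.
# All parameter values are invented for illustration.

def marginal_cost(q, base=10.0, slope=0.05, accident_rate=0.0, rework_cost=40.0):
    """Marginal cost at output q. Accidents add downtime/rework per unit,
    and their cost grows with throughput (more pressure, more variability)."""
    return base + slope * q + accident_rate * rework_cost * (1 + q / 100)

q = 100
safe_process = marginal_cost(q)                        # stable process
unsafe_process = marginal_cost(q, accident_rate=0.05)  # 5% of units hit by an incident
assert unsafe_process > safe_process
```

The gradient difference is the whole argument: at any output level, the accident-prone process is paying a variability tax.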

This is actually a separate question from whether the cost of accidents is dumped on individuals or the state, or whether the perpetrator pays. However, if the perp pays, they are more likely to worry about them, which may mean that safety regulation or pressure from union representatives can lead to efficiency gains.

As I said earlier on, and as Peter Dorman says, the annoying thing is that the behavioural economics stuff could actually be useful here. Depending on whether you frame safety as an add-on gadget, or as an aspect of a well-tuned production system-of-systems, you'll either practise it or you won't! Also, if you have to practise it because it's the law, you might be nudged – I believe this is the term, Your Honour – into adopting a process view and benefiting from it. And if you do that, you're probably more likely to actually achieve safety than if you see it as a bolt-on minimal concession to show the English, as they say in Brazil.

Dorman also points out that behavioural economics has a lot to say about both managers’ and workers’ perceptions of risk. People find a lot of ways to deny and minimise dangers. And this is especially the case if they don’t believe anything can be done about it, or if they identify safety issues with social groups they perceive as hostile or just different.

But thanks to the power of framing, Frank can’t say anything about this. I think what’s happening here is that the process view challenges the role of employers as all-powerful within the firm. What the last 667 words have been basically saying is that constraint can be a source of creativity. We recognise this in all sorts of ways. When The Economist says that such and such a union workforce is sleepy and whatever, they’re saying that they need to be constrained to the objectives of efficiency. But for some reason they rarely think this about management.

This is the sort of thing I was driving at with the call centre series. Management and workers, but especially management, have got to a local maximum that’s basically pathological. Because improvement is framed as either a cost, or else a selection from the too-hard basket, nobody does anything.

As Chris Dillow once said, cognitive biases have a lot in common with ideology.

Our economic future: Other.

So, reading the LSE Growth Commission report. There's the usual stuff about infrastructure, and they want both an infrastructure bank and an infrastructure planning commission (I remember that!). They also diagnose crappy provision for small and medium-sized firms' financing needs, hence a KfW-analogue. So far, so radical consensus.

There’s also a weird fetish for academies; apparently we need them because British schools aren’t doing well enough by the most disadvantaged kids. Well, it wasn’t that long ago that academies/specialist schools/whatever were there to stretch the sharp-elbowed middle classes’ gifted and talented kids and therefore to keep them in the system. It’s like Paul Krugman’s joke – how many European finance ministers does it take to change a lightbulb? Austerity! – just with academies.

If the most disadvantaged kids’ problems are really so ingrained that we need to tear everyone else’s schools up, you might think it would be a better idea to stop disadvantaging quite so many of them.

I snark, but actually there is a good, new idea in the report – as well as GDP and inflation, median household income should be reported and treated as a policy target. I’d sign up for that, on condition that it is calculated in real terms, using RPI or something similar rather than CPI aka “inflation with all the important stuff like food, housing, and energy taken out”.
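The choice of deflator is the whole fight here, and it's a one-line calculation. A minimal sketch – all the index values and the income figure below are made up for illustration, not real ONS data:

```python
# Sketch: reporting median household income in real terms.
# All figures are hypothetical; the point is that the choice of
# deflator (RPI vs CPI) changes the headline number.

def real_income(nominal, index_now, index_base):
    """Deflate a nominal figure back to base-year prices."""
    return nominal * index_base / index_now

nominal_income = 23000.0              # hypothetical nominal median income, GBP
rpi_base, rpi_now = 214.8, 242.1      # hypothetical RPI index values
cpi_base, cpi_now = 109.7, 121.1      # hypothetical CPI index values

real_by_rpi = real_income(nominal_income, rpi_now, rpi_base)
real_by_cpi = real_income(nominal_income, cpi_now, cpi_base)
# RPI typically runs higher than CPI, so the RPI-deflated figure is lower –
# i.e. the target is harder to hit, which is rather the point:
assert real_by_rpi < real_by_cpi
```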

Meanwhile, IPPR Juncture (oooh, IPPR, look at him, probably buys rocket salad on the internets and has a beeper) has a really interesting piece about 1970s industrial policy from Alan Bailey, the head of industrial policy in the Treasury at the time.

Two points here. The first is that Bailey notes and regrets that the policy framework didn’t really care about the service sector.

The fact that the services sector was a large and growing part of the economy was briefly recognised, but the remainder of the white paper, and the sectoral analysis, concentrated on manufacturing industry; the needs of services were to be handled separately (if at all).

As I said in this post on the NESTA innovation report, why is there no Council on Industrial Service Design?

The second is this:

I recall that the best-performing sector under ‘General Engineering’ was ‘mechanical machinery not elsewhere specified’, the statisticians’ residual ragbag

Bailey recalls this as an example of the statistics, or possibly the assessment process, being flawed.

But I think there may have been something else going on – what if the statistics were right, and a lot of productivity was concentrated in firms whose outputs were purely intermediate, often very specialised, but not easily allocated to a sector defined by the end user?

This basically describes the idea of the “industrial commons” or “clustering”, and it strikes me that the British engineering firms that survived Tory Macroeconomics Experiments 1 and 2 are very much like that. There aren’t so many that have a well-known final product; there are a lot that make recondite intermediate products and do rather well.

Jakob Whitfield’s awesome blog has a case in point. When Frank Whittle was looking for technical partners to work on the jet engine, he found them all over the UK; Firth Vickers in Sheffield forged the main turbine, High Duty Alloys Ltd. of Slough the compressor, and although most of the people he asked thought it was impossible, a Scottish firm he found at the British Industries Fair took on making the combustion chamber.

Later, when the project became a national priority, all kinds of other firms were involved via the Gas Turbine Coordinating Committee. Ricardo, still going today, provided engineering consulting on control systems. My favourite, though, is the Leicester Shoe Machinery Company, which helped Power Jets invent several new machine tools to productionise various bits of the engine and lent them engineers to advise on the problems of mass-producing them.

This stuff is important. I guess it’s the charismatic megafauna vs. smelly jungle thing again; the West Midlands lost Triumph and Rover, but it kept Ricardo and a huge range of subcontractors. Depressingly, though, I suspect that it might be hard to operationalise this as policy in a way that wouldn’t tend to concentrate investment in the West Midlands and the South-East.

not at all defanged

Remember that thumbsucker I did on the Great Firewall? Well, here’s some data, via this post (thanks, Jamie). It seems that Fang Binxing, China’s Chief Bellhead, boss of the Beijing University of Post & Telecoms, and king of the great firewall, really is in trouble due to his special relationship with Bo Xilai. He briefly came up on the web to threaten to sue a Japanese newspaper which thinks he was detained for investigation. Then, the former head of Google in China (who obviously isn’t neutral in this) prodded him, and he denied having the power to block the offending story.

The FT, meanwhile, thinks Zhou Yongkang, the head of the security establishment, is on the way out. That shouldn't be overstated, because he's due to retire anyway, but he has been doing a rubber chicken circuit of second-division official appearances, and his key responsibilities have been taken over by others.

Fang is supposedly being replaced by Yan Wangjia, CEO of Beijing Venustech, who was responsible for engineering the Great Firewall. Her company’s Web site is convincing on that score. Here’s the announcement that they got the contract to provide China Mobile with a 10 gigabit DPI system:

Recently, Venustech successfully won the bid for centralized firewall procurement project of China Mobile in 2009 with its 10G high-end models of Venusense UTM, thus becoming the first company of its kind to supply high-end security gateway to telecom operators.

It is said this centralized firewall procurement project is the world’s largest single project of high-end 10G security gateway procurement ever implemented, drawing together most of world-renowned communication equipment vendors and information security vendors such as Huawei and Juniper. Through the rigorous test by China Mobile, Venusense UTM stood out, making Venustech the only Chinese information security vendor in this bid.

Looking around, it sounds like they are the hardware vendor of the Great Firewall, specialising in firewall, intrusion detection, and deep-packet inspection kit for the governmental, educational, and enterprise sectors “and of course the carriers”. Well, who else needs a 10Gbps and horizontally scaling DPI box but a carrier? Note the careful afterthought there. Also, note that they’re the only people in the world who don’t think Cisco is a leading network equipment vendor.

slow-motion procurement failure

Quietly, the Eurofighter project seems to be running into trouble. First of all, Dassault got the Indian contract and the Indians claim that Rafale is dramatically cheaper. Further, they weren’t impressed by the amount of stuff that is planned to come in future upgrades, whose delivery is still not certain. These upgrades are becoming a problem, as the UK, Germany, and Italy aren’t in agreement about their schedule or about which ones they want. Also, a Swiss evaluation report was leaked that is extremely damning towards the Gripen and somewhat less so to Eurofighter.

This is going to have big consequences for European military-industrial politics. So is the latest wobble on F-35.

The politics of call centres, part two: sources of failure

So, why did we get here? Back in the mists of time, in the US Bell System, there used to be something called a Business Office, by contrast to a Central Office (i.e. what we call a BT Local Exchange in the UK), whose features and functions were set down in numerous Bell System Practice documents. Basically, it was a site where the phone company took calls from the public, either for its own account or on behalf of a third party. Its practices were defined by Bell System standardisation, and its industrial relations were defined by the agreement between AT&T and the unions, which specified the pay and conditions for the various trades and workplace types inside the monster telco. If something was a Business Office according to the book, the union agreement covering those offices would apply.

In the Reaganite 80s, after the Bell System was broken up, someone realised that it would be possible to get rid of the union rules if they could re-define the site as something else. Not only could they change the rules, but they could move the site physically to a right-to-work state or even outside the USA. This is, it turns out, the origin of the phrase “call centre”.

In the UK, of course, call centres proliferated in parallel with utility privatisation and financial deregulation. A major element in the business case for privatisation was getting rid of all those electricity showrooms and BT local offices and centralising customer service functions into call centres. At the same time, of course, privatisation created the demand for customer service in that it was suddenly possible to change provider and therefore to generate a shit-load of admin. Banks were keen to get rid of their branches and to serve the hugely expanding credit card market. At another level, IT helpdesks made their appearance.

On the other hand, hard though it is to imagine it now, there was a broader vision of technology that expected it all to be provided centrally – in the cloud, if you will – down phone lines controlled by your favourite telco, or by the French Government, or perhaps Rupert Murdoch. This is one of the futures that didn’t happen, of course, because PCs and the web happened instead, but you can bet I spent a lot of time listening to people as late as the mid-2000s still talking about multimedia services (and there are those who argue this is what stiffed Symbian). But we do get a sneak-preview of the digital future that Serious People wanted us to have, every time we have to ring the call centre. In many ways, call centres are the Anti-Web.

In Britain, starting in the 1990s, they were also part of the package of urban regeneration in the North. Along with your iconic eurobox apartments and AutoCAD-shaped arts centre, yup, you could expect to find a couple of gigantic decorated sheds full of striplighting and the precariat. Hey, he’s like a stocky, Yorkshire Owen Hatherley. After all, it was fairly widely accepted that even if you pressed the button marked Arts and the money rolled in, there was a limit to the supply of yuppies and there had to be some jobs in there as well.

You would be amazed at the degree of boosterism certain Yorkshire councils developed on this score, although you didn’t need top futurist Popcorn Whatsname to work out that booming submarine cable capacity would pretty quickly make offshoring an option. Still, if Bradford didn’t make half-arsed attempts to jump on every bandwagon going, leaving it cluttered with vaguely Sicilian failed boondoggles, it wouldn’t be Bradford.

Anyway, I think I’ve made a case that this is an institution whose history has been pathological right from the start. It embodies a fantasy of managing a service industry in the way the US automakers were doing at the same time – and failing, catastrophically.

The politics of call centres, part one

What is it that makes call centres so uniquely awful as social institutions? This is something I’ve often touched on at Telco 2.0, and also something that’s been unusually salient in my life recently – I moved house, and therefore had to interact with getting on for a dozen of the things, several repeatedly. (Vodafone and Thames Water were the best, npower and Virgin Media the worst.) But this isn’t just going to be a consumer whine. In an economy that is over 70% services, the combination of service design, technology, and social relations that makes these things so awful is something we need to understand.

For example, why does E.ON (the electricity company, the UK arm of the German utility of the same name) want you to tell their IVR what class you are before they do anything else? This may sound paranoid, but when I called them, the first question I had to answer was whether I owned my home or was a tenant. What on earth did they want to know that for?

Call centres provide a horrible experience to the user. They are famously awful workplaces. And they are also hideously inefficient – at some sites, failure demand (that is, calls generated by a prior failure to serve) accounts for over 50% of total inbound calls. Manufacturing industry has long recognised that rework is the greatest enemy of productivity, taking up disproportionate amounts of time and resources and inevitably never quite fixing the problems.
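The arithmetic of failure demand is worth spelling out, because it compounds: every failed call breeds another call, which can itself fail. In steady state, if a fraction f of all inbound calls is failure demand, total volume is base demand amplified by 1/(1 − f) – so the 50% figure means the centre is handling double the calls it needs to.

```python
# If a fraction f of all inbound calls is failure demand (calls caused by an
# earlier failed call), then total = base + f * total, so in steady state
# total call volume is base demand amplified by 1 / (1 - f).

def total_calls(base_demand, failure_fraction):
    assert 0 <= failure_fraction < 1
    return base_demand / (1 - failure_fraction)

assert total_calls(1000, 0.5) == 2000.0   # 50% failure demand doubles the load
assert total_calls(1000, 0.0) == 1000.0   # a centre that fixes things first time
```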

So why are they so awful? Well, I’ll get to that in the next post. Before we can answer that, we need to think about how they are so awful. I’ve made a list of anti-patterns – common or standard practices that embody error – that make me angry.

Our first anti-pattern is queueing. Call centres essentially all work on the basis of oversubscription and queueing. On the assumption that some percentage of calls will go away, they save on staff by queueing calls. This is not the only way to deal with peaks in demand, though – for example, rather than holding calls, there is no good technical reason why you couldn’t instead have a call-back architecture, scheduling a call back sometime in the future.
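A call-back architecture is not exotic. A minimal sketch of the scheduling logic – class name, parameters, and the flat average-handle-time assumption are all mine, for illustration:

```python
import heapq
import time

# Sketch of a call-back architecture: instead of holding callers in a queue,
# take the number and promise an outbound call at the next free agent slot.
# Assumes a flat average handle time for simplicity.
class CallbackScheduler:
    def __init__(self, agents, avg_handle_time_s=300):
        self.aht = avg_handle_time_s
        # min-heap of times at which each agent is next free
        self.slots = [time.time()] * agents
        heapq.heapify(self.slots)

    def schedule(self, caller_number):
        """Reserve the earliest free agent slot and return the promised time."""
        free_at = heapq.heappop(self.slots)
        start = max(free_at, time.time())
        heapq.heappush(self.slots, start + self.aht)
        return caller_number, start
```

The caller hangs up and gets on with their life; the pressure to "get after the queue" disappears because there is no queue, only a schedule.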

Waiting on hold is interesting because it represents an imposition on the user – because telephony is a hot medium in McLuhan’s terminology, your attention is demanded while you sit pointlessly in the queue. In essence, you’re providing unpaid labour. Worse, companies are always tempted to impose on you while you wait – playing music on hold (does anybody actually like this?), or worse, nagging you about using the web site. We will see later on that this is especially pointless and stupid.

And the existence of the queue is important in the social relations of the workplace. If there are people queueing, it is obviously essential to get to them as soon as possible, which means there is a permanent pressure to speed up the line. Many centres use the queue as an operational KPI. It is also quality-destroying, in that both workers' and managers' attention is always focused on the next call and how to get off the current call in order to get after the queue.

A related issue is polling. That is to say, repeatedly checking on something, rather than being informed pro-actively when it changes. This is of course implicit in the queueing model. It represents a waste of time for everyone involved.

Repetition is one of the most annoying of the anti-patterns, and it is caused by statelessness. It is always assumed that this interaction has never happened before, will never happen again, and is purely atomised. They don’t know what happened in the last call, or even earlier in the call if it has been transferred. As a result, you have to provide your mother’s maiden name and your account number, again, and they have to retype it, again. The decontextualised nature of interaction with a call centre is one of the worst things about it.

Pretty much every phone system these days uses SIP internally, so there is no excuse for not setting a header with a unique identifier that could be used to look up data in all the systems involved, and indeed given out as a ticket number to the user in case they need to call again, or – why not – used to share the record of the call.
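The mechanics are trivial. A sketch of the idea – the `X-Call-Reference` header name and the surrounding systems are invented for illustration, not any particular vendor's API:

```python
import uuid

# Sketch: mint one correlation ID per contact and carry it everywhere --
# as a custom SIP header, a CRM lookup key, and a ticket number read out
# to the caller. Header name and downstream systems are hypothetical.

def new_call_reference():
    """Short unique reference, fit to read out over the phone."""
    return uuid.uuid4().hex[:12]

def add_correlation_header(sip_headers, ref):
    """Return a copy of the SIP headers carrying the correlation ID."""
    headers = dict(sip_headers)
    headers["X-Call-Reference"] = ref   # hypothetical custom header
    return headers

ref = new_call_reference()
hdrs = add_correlation_header({"From": "sip:caller@example.com"}, ref)
assert hdrs["X-Call-Reference"] == ref
```

Every system the call touches – the IVR, the agent desktop, the transfer target – looks the reference up instead of asking for your mother's maiden name again.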

That point leads us to another very important one. Asymmetric legibility characterises call centres, and it's dreadful. Within, management tries to maintain a panopticon glare at the staff. Without, the user faces an unmapped territory, in which the paths are deliberately obscure and the details the centre holds on you are kept secret. Call centres know a lot about you, but won't say; their managers endlessly spy on the galley slaves; you're not allowed to know how the system works.

So no wonder we get failure demand, in which people keep coming back because it was so awful last time. A few companies get this, and use first-call resolution (the percentage of cases that are closed first time) as a KPI rather than call rates, but you’d be surprised. Obviously, first-call resolution has a whole string of social implications – it requires re-skilling of the workforce and devolution of authority to them. No wonder it’s rare.
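The crucial point about first-call resolution as a metric is that it is measured per case, not per call – which is exactly why it points the opposite way from a call-rate KPI. A minimal sketch:

```python
# First-call resolution: the fraction of cases closed on the first contact.
# Measured per case, not per call -- a case that took three calls counts
# once as a failure, not three times as "activity".

def first_call_resolution(cases):
    """cases: list of call counts per case; resolved first time == 1 call."""
    return sum(1 for calls in cases if calls == 1) / len(cases)

# Three cases solved first time, one took three calls, one took two:
assert first_call_resolution([1, 1, 1, 3, 2]) == 0.6
```

A centre optimising for call rate would count the example above as seven units of productive work; an FCR centre counts it as 60% success and two failures to chase down.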

Now, while we were in the queue, the robot voice kept telling us to bugger off and try the Web site. But this is futile. Inappropriate automation and human/machine confusion bedevil call centres. If you could solve your problem by filling in a web form, you probably would have done. The fact you’re in the queue is evidence that your request is complicated, that something has gone wrong, or generally that human intervention is required.

However, exactly this flexibility and devolution of authority is what call centres try to design out of their processes and impose on their employees. The product is not valued, therefore it is awful. The job is not valued by the employer, and therefore, it is awful. And, I would add, it is not valued by society at large and therefore, nobody cares.

So, there’s the how. Now for the why.

The RQ-170 hack and the drone bubble

The fact that a majority of this year’s graduates from USAF basic pilot training are assigned to drone squadrons has got quite a bit of play in the blogosphere. Here, via Jamie Kenny, John Robb (who may still be burying money for fear of Obama or may not) argues that the reason they still do an initial flight training course is so that the pilot-heavy USAF hierarchy can maintain its hold on the institution. He instead wants to recruit South Korean gamers, in his usual faintly trendy dad way. Jamie adds the snark and suggests setting up a call centre in Salford.

On the other hand, before Christmas, the Iranians caught an RQ-170 intelligence/reconnaissance drone. Although the RQ-170 is reportedly meant to be at least partly stealthy, numerous reports suggest that the CIA was using it among other things to get live video of suspected nuclear sites. This seems to be a very common use case for drones, which usually have a long endurance in the air and can be risked remaining over the target for hours on end, if the surveillance doesn’t have to be covert.

Obviously, live video means that a radio transmitter has to be active 100% of the time. It’s also been reported that one of the RQ-170’s main sensors is a synthetic-aperture radar. Just as obviously, using radar involves transmitting lots of radio energy.

It is possible to make a radio transmitter less obvious, for example by saving up information and sending it in infrequent bursts, and by making the transmissions as directional as possible, which also requires less power and reduces the zone in which it is possible to detect the transmission. However, the nature of the message governs its form. Live video can’t be burst-transmitted because it wouldn’t be live. Similarly, real-time control signalling for the drone itself has to be instant, although engineering telemetry and the like could be saved and sent later, or only sent on request. And the need to keep a directional antenna pointing precisely at the satellite sets limits on the drone’s manoeuvring. None of this really works for a mapping radar, though, which by definition needs to sweep a radio beam across its field of view.

Even if it was difficult to acquire it on radar, then, it would have been very possible to detect and track the RQ-170 passively, by listening to its radio emissions. And it would have been much easier to get a radar detection with the advantage of knowing where to look.

There has been a lot of speculation about how they then attacked it. The most likely scenario suggests that they jammed the command link, forcing the drone to follow a pre-programmed routine for what to do if the link is lost. It might, for example, be required to circle a given location and wait for instructions, or even to set a course for somewhere near home, hold, and wait for the ground station to acquire them in line-of-sight mode.

Either way, it would use GPS to find its way, and it seems likely that the Iranians broadcast a fake GPS signal for it. Clive “Scary Commenter” Robinson explains how to go about spoofing GPS in some detail in Bruce Schneier’s comments, and points out that the hardware involved is cheap and available.

Although the military version would require you to break the encryption in order to prepare your own GPS signal, it's possible that the Iranians either jammed it and forced the drone to fall back on the civilian GPS signal, and spoofed that, or else picked up the real signal at the location they wanted to spoof and re-broadcast it somewhere else, an attack known as "meaconing" during the second world war, when the RAF Y-Service did it to German radio navigation. We would now call it a replay attack with a fairly small time window. (In fact, it's still called meaconing.) Because GPS is based on timing, there would be a limit to how far off course they could put it this way without either producing impossible data or messages that failed the crypto validation, but this is a question of degree.
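The scale of the timing argument is easy to check. Because GPS receivers turn signal delay into distance at the speed of light, even a tiny replay delay moves the apparent position a long way – which is both why the attack works and why large offsets risk producing impossible fixes. A back-of-envelope sketch:

```python
# Rough scale of a GPS replay ("meaconing") attack: re-broadcasting the real
# signal with an extra delay d inflates the apparent range to every satellite
# by c * d, since receivers convert timing directly into distance.

C = 299_792_458.0  # speed of light, m/s

def range_error_m(delay_s):
    return C * delay_s

# One microsecond of replay delay is already ~300 m of pseudorange error:
assert abs(range_error_m(1e-6) - 299.792458) < 1e-6
# A full millisecond would be ~300 km -- well into "impossible data" territory.
```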

It’s been suggested that Russian hackers have a valid exploit of the RSA cipher, although the credibility of this suggestion is unknown.

The last link is from Charlie Stross, who basically outlined a conceptual GPS-spoofing attack in my old Enetation comments back in 2006, as a way of subverting Alistair Darling’s national road-pricing scheme.

Anyway, whether they cracked the RSA key or forced a roll-back to the cleartext GPS signal or replayed the real GPS signal from somewhere else, I think we can all agree it was a pretty neat trick. But what is the upshot? In the next post, I’m going to have a go at that…

Can you hear me now?

Well, here’s a contribution to the debate over the riots. The Thin Blue Trots’…sorry…Police Federation report has been leaked.

Among the failings highlighted by the federation, which represents 136,000 officers, were chronic problems, particularly in London with the hi-tech digital Airwave radio network. Its failings were one reason why officers were “always approximately half an hour behind the rioters”. This partly explained, it said, why officers kept arriving at areas from where the disorder had moved on.

The Airwave network was supposed to improve the way emergency services in London responded to a crisis after damning criticism for communication failures following the 7 July bombings in 2005.

It is being relied upon to ensure that police officers will be able to communicate with each other from anywhere in Britain when the Olympics come to London next summer. The federation wants a review into why the multibillion-pound system collapsed, leaving officers to rely on their own phones.

“Officers on the ground and in command resorted, in the majority, to the use of personal mobile phones to co-ordinate a response,” says the report.

It sounds like BB Messenger over UMTS beats shouting into a TETRA voice radio, as it should being about 10 years more recent. Not *this* crap again!

There’s surely an interesting story about how the UK managed to fail to procure a decent tactical radio for either its army or its civilian emergency services in the 1990s and 2000s. Both the big projects – the civilian (mostly) one that ended up as Airwave and the military one that became BOWMAN – were hideously troubled, enormously over budget, and very, very late. Neither product has been a great success in service. And it was a bad time for slow procurement, as rapid technological progress (from 9.6Kbps circuit-switched data on GSM in 1998 to 7.2Mbps HSPA in 2008, from Ericsson T61s in 2000 to iPhones in 2008) meant that a few years would leave you far behind the curve.

And it’s the UK, for fuck’s sake. We do radio. At the same time, Vodafone and a host of M4-corridor spin-offs were radio-planning the world. Logica’s telecoms division, now Acision, did its messaging centres. ARM and CSR and Cambridge Wireless were designing the chips. Vodafone itself, of course, was a spinoff from Racal, the company that sold army radios for export because nobody would have imported the official ones in a fit. BBC Research’s experience in making sure odd places in Yorkshire got Match of the Day all right went into it more than you might think.

Presumably that says something about our social priorities in the Major/Blair era? That at least industrially, for once we were concentrating on peaceful purposes (but also having wars all over the place)? Or that we weren’t concentrating on anything much industrially, and instead exporting services and software? Or that something went catastrophically wrong with the civil service’s procurement capability in the 1990s?

It’s the kind of story Erik Lund would spin into something convincing.

If you’re out of luck and out of work, we could send you to the western mountains of Libya

The Libyan rebels are making progress, as well as robots. Some of them are reported to be within 40 miles of Tripoli, those being the ones who the French have been secretly arming, including with a number of light tanks. Now that’s what I call protecting civilians.

They are also about to take over the GSM network in western Libya like they did in the east. How do I know? I’m subscribed to the Telecom Tigers group on LinkedIn and so I get job adverts like these two.

ZTE BSC Job: URGENT send cv at [e-mail] for the job position or fw to your friends : Expert Telecom Engineer ZTE BSC.Location:Lybia,Western Area,1300USD/day,start immediate

URGENT send cv at [e-mail] for the job position or fw to your friends : ERICSSON MGW/BSS/BSC 2G/RAN Implementation Senior Expert Engineer.Location:Lybia,Gherian,Western Mountains,1300-1500 USD/day

In fact, one of the ads explicitly says that the job is in the rebel zone and the other is clear enough. What the rebels are planning to do is clear from the job descriptions:

must be able to install a ZTE latest generation BSC – platform to be integrated with 3rd party switching platform,solid knowledge of ZTE BSC build out and commissioning to connect up to 200 existing 2G/3G sites

To put it another way, they want to unhook the existing BTSs – the base stations – from Libyana and link them to a core system of their own, and in order to do this they need to install some Chinese-made Base Station Controllers (BSCs – the intermediary between the radio base stations and the central SS7 switch in GSM).

Here’s the blurb for the Ericsson post:

Responsible for commissioning and integrating an Ericsson 2G BSS network (2048-TRX Ericsson BSC plus Ericsson BTSs) in a multi-vendor environment. Will be responsible for taking the lead and ownership of all BSS commissioning and integration, leading the local team of BSS engineers, and managing the team through to completion of integration.

Experience of Ericsson MGW implementation, and integration of MGW with BSS, is highly desirable. Experience of optical transmission over A-interface.

Compilation, creation and coordination of BSC Datafill. This will include creating, generating, seeking and gathering of all Datafill components (Transport, RF Frequencies, neighbor relations, handovers, Switch parameters, ABIS mapping, etc.) based on experience and from examination of existing network configuration and data. Loading of Datafill into the BSC to facilitate BTS integration.

Working with the MSC specialists to integrate the BSC with the MSC. Providing integration support to BTS field teams; providing configuration and commissioning support to the BSC field team.

So they’ve got some Ericsson BSCs, the base stations are Ericsson too, and an MSC (Mobile Switching Centre, the core voice switch) has been found from somewhere – interesting that they don’t say who made it. That’ll be the “3rd party switching platform” referred to in the first job. They’re doing VoIP at some point, though, because they need a media gateway (MGW) to translate between traditional SS7 and SIP. They need engineers to integrate it all and to work out what the various configurations should be by studying what Gadhafi’s guys left. (It’s actually fairly typical that a mobile network consists of four or so different manufacturers’ kit, which keeps a lot of people in pies dealing with the inevitable implementation quirks.)
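The chain being assembled can be sketched as a toy model. The vendors and the ~200-site figure come from the job ads quoted above; the role descriptions are my own gloss, and this is obviously illustration rather than provisioning code:

```python
# Toy model of the GSM chain the rebels are assembling, as described:
# BTS (radio) -> BSC (controller) -> MSC (core switch), with an MGW
# bridging circuit-switched voice onto IP. Vendor names are from the
# job ads; everything else here is illustrative.

chain = [
    ("BTS", "Ericsson", "radio base station, talks to the handsets"),
    ("BSC", "Ericsson/ZTE", "controller for up to ~200 base stations"),
    ("MSC", "3rd party", "core voice switch, SS7 signalling"),
    ("MGW", "Ericsson", "bridges TDM/SS7 voice onto IP/SIP"),
]

for element, vendor, role in chain:
    print(f"{element:<4} ({vendor}): {role}")
```

The multi-vendor mix in that list is exactly the “implementation quirks” work the ads are recruiting for.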

The successful candidate will also have some soft skills, too:

Willing to work flexible hours, excellent interpersonal skills and the ability to work under pressure in a challenging, diverse and dynamic environment with a variety of people and cultures.

You can say that again. Apparently, security is provided for anyone who’s up for it; the day rate doesn’t include full board and expenses, which are promised on top.

They already have at least one candidate.

Scaling and scoping the NYT paywall

Amusingly for a comment on scalability, I couldn’t post this on D^2’s thread because Blogger was in a state. Anyway, it’s well into the category of “comments that really ought to be posts”, so here goes. Various people are wondering how the New York Times managed to spend $50m on setting up their paywall. D^2 reckons they’re overstating, for basically cynical reasons. I think it’s more fundamental than that.

The complexity of the rules makes it sound like a telco billing system more than anything else – all about rating and charging lots and lots of events in close to real-time based on a hugely complicated rate-card. You’d be amazed how many software companies are sustained by this issue. It’s expensive. The NYT is counting pages served to members (easy) and nonmembers (hard), differentiating between referral sources, and counting different pages differently. Further, it’s got to do it quickly. Latency from the US West Coast (their worst-case scenario) to nytimes.com is currently about 80 milliseconds. User-interface research suggests that people perceive a response as instant at 100ms. Web surfing is a fairly latency-tolerant application, but when you think that the server itself takes some time to fetch the page and the data rate in the last mile will restrict how quickly it can be served, there’s a very limited budget of time for the paywall to do its stuff without annoying the hell out of everyone.
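To put a rough number on that budget, using the 80ms and 100ms figures above (and ignoring page fetch and last-mile serialisation, which only eat further into it):

```python
# Back-of-envelope latency budget for the paywall check.
# Figures from the text: ~80 ms US West Coast to nytimes.com,
# ~100 ms perceived-as-instant threshold from UI research.
# Page-fetch and last-mile time would come out of what's left.

PERCEIVED_INSTANT_MS = 100  # threshold for "instant" response
NETWORK_RTT_MS = 80         # West Coast worst case

budget_ms = PERCEIVED_INSTANT_MS - NETWORK_RTT_MS
print(f"Left for page fetch + paywall logic: {budget_ms} ms")
```

Twenty milliseconds, before the origin server has done anything at all – which is why the rating logic can’t sit lazily in the request path.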

Although the number of transactions won’t be as savage, doing real-time rating for the whole NYT website is going to be a significant scalability challenge. Alexa reckons 1.45% of global Web users hit nytimes.com, for example. For comparison, Salesforce.com is 0.4% and that’s already a huge engineering challenge (because it’s much more complicated behind the scenes). There are apparently 1.6bn “Internet users” – I don’t know how that’s defined – so that implies that the system must scale to 268 transactions/second (or about 86,400 times the daily reach of my blog!).
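The arithmetic behind that figure, using the numbers from the text:

```python
# Reproducing the back-of-envelope transaction rate from the text:
# 1.45% of 1.6bn Internet users hitting the site per day, averaged
# evenly over the day. (Evenly is generous; see peak/mean below.)

INTERNET_USERS = 1.6e9     # "1.6bn Internet users", however defined
NYT_DAILY_REACH = 0.0145   # Alexa: 1.45% of global Web users
SECONDS_PER_DAY = 86_400

daily_visitors = INTERNET_USERS * NYT_DAILY_REACH   # ~23.2 million/day
avg_tps = daily_visitors / SECONDS_PER_DAY
print(f"{avg_tps:.1f} transactions/second on average")
```

That gets you the ~268/s in the text – as an average, which is the wrong number to provision for.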

A lot of those will be search engines, Internet wildlife, etc, but you still have to tell them to fuck off, and therefore it’s part of your scale & scope calculations. That’s about a tenth of HSBC’s online payments processing in 2007, IIRC, or a twentieth of a typical GSM Home Location Register. (The usual rule of thumb for those is 5 kilotransactions/second.) But – and it’s the original big but – you need to provision for the peak. Peak usage, not average usage, determines scale and cost. Even if your traffic distribution was weirdly well-behaved and followed a normal distribution, you’d encounter an over-95th-percentile event one day in every 20. And network traffic doesn’t – it’s usually more, ahem, leptokurtic. So we’ve got to multiply that by their peak/mean ratio.
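A crude simulation makes the point. The distributions and parameters here are invented purely for illustration (nobody outside the NYT knows their real traffic shape); the point is how much worse the peak/mean ratio gets once the tail fattens:

```python
# Illustrative only: compare peak/mean ratios for 1000 simulated days
# of traffic drawn from a well-behaved (normal) distribution versus a
# heavy-tailed (lognormal) one. All parameters are made up; the
# lognormal's mu is chosen so both have mean ~= MEAN_TPS.
import random

random.seed(42)
DAYS = 1000
MEAN_TPS = 268            # the average rate worked out above
SIGMA = 0.75              # lognormal shape parameter (invented)

normal_days = [random.gauss(MEAN_TPS, MEAN_TPS * 0.1) for _ in range(DAYS)]
heavy_days = [random.lognormvariate(-SIGMA**2 / 2, SIGMA) * MEAN_TPS
              for _ in range(DAYS)]

ratios = {}
for name, days in [("normal", normal_days), ("heavy-tailed", heavy_days)]:
    ratios[name] = max(days) / (sum(days) / len(days))
    print(f"{name:>12}: peak/mean ratio = {ratios[name]:.2f}")
```

With the tame distribution you provision maybe a third above average; with the fat-tailed one the worst day is several multiples of the mean, and that multiple is what you pay for.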

And it’s a single point of failure, so it has to be robust (or at least fail to a default-open state but not too often). I for one can’t wait for the High Scalability article on it.

So it’s basically similar in scalability, complexity, and availability to a decent-sized MVNO’s billing infrastructure, and you’d be delighted to get away with change from £20m for that.