Category Archives: incidents

IT Lacks Engineering Discipline and Rigor

Every week we read the news about new, spectacular security breaches. This has been going on for years, and sometimes I wonder if there are any organizations left that have not been breached.

Why are breaches occurring at such a clip? After decades of experience in IT and data security, I believe I have at least a part of the answer. But first, I want to shift our focus to a different discipline, that of civil engineering.

Civil engineers design and build bridges, buildings, tunnels, dams, and many other structures. They have college degrees, and they hold a license known as Professional Engineer. In their design work, they carefully examine every component, calculate the forces that will act upon it, and size it to withstand those forces, with a generous margin for error to cover unexpected circumstances. Their designs undergo reviews before their plans can be called complete. Inspectors carefully examine and approve plans, and they examine every phase of site preparation and construction. The finished product is inspected before it may be used. Any defect found along the way, from drawings to final inspection, results in a halt in the project and changes in design or implementation.

The result: remarkably reliable and long-lasting structures that, when maintained properly, provide decades of dependable use. This practice has been in use for a century or two and has held up under scrutiny. We rarely hear of failures of bridges, dams, and so on, because the system of qualifying and licensing designers and builders, together with design and construction inspections, works. It’s about quality and reliability, and it shows.

Information technology is nothing like civil engineering. Very few organizations employ formal design with design review, or inspect components during the development of networks, systems, and applications. The result: systems that lack proper functionality, resilience, and security. I will explore this further.

When organizations set out to implement new IT systems – whether networks, operating systems, database management systems, or applications – they do so with little formality of design, and rarely with any level of design or implementation review. The result is “brittle” IT systems that barely work. In over thirty years in IT, this is the norm that I have observed in over a dozen organizations in several industries, including banking and financial services.

In case you think I’m pontificating from my ivory tower, I’m among the guilty here. Most of my IT career has been in organizations with some ITIL processes like change management, but utterly lacking in the level of engineering rigor seen in civil engineering and other engineering disciplines. Is it any wonder, then, that we hear news of IT project failures and breaches?

Some of you will argue that IT does not require the same level of discipline as civil or aeronautical engineering, mostly because lives are not directly on the line as they are with bridges and airplanes. Fine. But be prepared to accept losses in productivity due to code defects, unscheduled downtime, and security breaches. If security and reliability are not a part of the design, then the resulting product will be secure and reliable only by accident, not on purpose.

In air travel and data security, there are no guarantees of absolute safety

The recent tragic GermanWings crash has illustrated an important point: even the best designed safety systems can be defeated in scenarios where a trusted individual decides to go rogue.

In the case of the GermanWings crash, the co-pilot was able to lock the pilot out of the cockpit. The cockpit door locking mechanism is designed to enable a trusted individual inside the cockpit to prevent an unwanted person from entering.

Such safeguards exist in security mechanisms in information systems. However, these safeguards only work when those at the controls are competent. If they go rogue, there is little, if anything, that can be done to slow or stop their actions. Any administrator with responsibilities and privileges for maintaining software, operating systems, databases, or networks has near-absolute control over those objects. If they decide to go rogue, at best the security mechanisms will record their malevolent actions, just as the cockpit voice recorder documented the pilot’s attempts to re-enter the cockpit, as well as the co-pilot’s breathing, indicating he was still alive.

Remember that technology – even protective controls – cannot know the intent of the operator. Technology, the amplifier of a person’s will, blindly obeys.

The security breaches continue

As of Tuesday, September 2, 2014, Home Depot was the latest merchant to announce a potential security breach.

These days, such an announcement usually means intruders have stolen credit card numbers from the merchant’s POS (point of sale) systems. The details have yet to be revealed.

If there is any silver lining for Home Depot, it’s the likelihood that another large merchant will probably soon announce its own breach.  But one thing that’s going to be interesting with Home Depot is how they handle the breach, and whether their CEO, CIO, and CISO/CSO (if they have a CISO/CSO) manage to keep their jobs. Recall that Target’s CEO and CIO lost their jobs over the late 2013 Target breach.

Merchants are in trouble. Aging technologies, some related to the continued use of magnetic stripe credit cards, are making it easier for intruders to steal credit card numbers from merchant POS systems. Chip-and-PIN cards are coming (they’ve been in Europe for years), but they will not make breaches like this a thing of the past. Organized crime groups, which have made a lot of money from recent break-ins, are developing more advanced technologies like the memory scraping malware that was allegedly used in the Target breach. You can be sure that criminal organizations will keep improving their malware.

A promising development is the practice of encrypting card numbers in the hardware of the card reader, instead of in the POS system software. But even this is not wholly secure: companies that manufacture this hardware will themselves be attacked by intruders hoping to steal the secrets of this encryption and exploit them. In case this sounds like science fiction, remember the RSA breach, which was very similar.

The cat-and-mouse game continues.

Why wait for a security breach to improve security?

Neiman Marcus is the victim of a security breach. Neiman Marcus provided a statement to journalist Brian Krebs:

Neiman Marcus was informed by our credit card processor in mid-December of potentially unauthorised payment card activity that occurred following customer purchases at our Neiman Marcus Group stores.

We informed federal law enforcement agencies and are working actively with the U.S. Secret Service, the payment brands, our credit card processor, a leading investigations, intelligence and risk management firm, and a leading forensic firm to investigate the situation. On January 1st, the forensics firm discovered evidence that the company was the victim of a criminal cyber-security intrusion and that some customers’ cards were possibly compromised as a result.

We have begun to contain the intrusion and have taken significant steps to further enhance information security.

The security of our customers’ information is always a priority and we sincerely regret any inconvenience. We are taking steps, where possible, to notify customers whose cards we know were used fraudulently after making a purchase at our store.

I want to focus on one of Neiman Marcus’ statements:

We have … taken significant steps to further enhance information security.

Why do companies wait for a disaster to occur before making improvements that could have prevented the incident – saving the organization and its customers untold hours of lost productivity? Had Neiman Marcus taken these steps earlier,  the breach might not have occurred.  Or so we think.

Why do organizations wait until a security incident occurs before taking more aggressive steps to protect information?

  1. They don’t think it will happen to them. Often, an organization eyes a peer that suffered a breach and thinks, “Their security and operations are sloppy, and they had it coming.” But alas, those in an organization who think their own security and operations are not sloppy are probably not familiar with their security and operations. In most organizations, security and systems are just barely good enough to get by. That’s human nature.
  2. Security costs too much. To them I say, “If you think prevention is expensive, have you priced incident response lately?”
  3. We’ll fix things later. Sure – only if someone is holding it over your head (like a payment processor pushing a merchant or service provider towards PCI compliance). That particular form of “later” never comes. Kicking the can down the road doesn’t solve the problem.

It is human nature to believe that another’s misfortunes can’t happen to us. Until they do.

Why there will always be security breaches

At the time of this writing, the Target breach is in the news, and the magnitude of the Target breach has jumped from 40 million to as high as 110 million.

More recently, we’re hearing about a breach at Neiman Marcus.

Of course, another retailer will be the next victim.  It is not so important to know who that will be, but why.

Retailers are like herds of gazelles on the African plain, and cybercriminals are the lions who devour them.

As lions stalk their prey, sometimes they choose their victim early and target them. At other times, lions run into the herd and find a target of opportunity: one that is a little slower than the rest, or one that makes a mistake and becomes more vulnerable. The slow, sick ones are easy targets, but the healthy, fatter ones are more rewarding targets.

As long as there are lions and gazelles, there will always be victims.

As long as there are retailers that store, process, or transmit valuable data, there will always be cybercriminals that attempt to steal that data.

Rest in peace: officers Renninger, Griswold, Owens and Richards

Update 12/12/2010: Donate to Lakewood Police Independent Guild to benefit the families of the four slain officers

Today, four Lakewood WA police officers were assassinated in cold blood while conducting police business in a local coffee house. This happened very close to where I live, less than a month after a Seattle police officer was gunned down.

In my work I collaborate with and support law enforcement. I appreciate what they do for us.

These four officers leave nine children behind. This aspect makes this especially tragic.

References:

Tacoma News Tribune

Seattle Times

The Zune failure and Microsoft

Many articles in the press have chronicled the failure of Microsoft’s first-generation 30GB Zune MP3 players. They all simply froze on December 31, 2008. The remedy: owners had to leave the device powered on until the battery completely discharged, and then wait until after 12:00 GMT on 1/1/09 before it could be used again. Total downtime – around 24 hours.
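The press coverage attributes the freeze to leap-year handling. As a rough illustration only, here is a minimal Python sketch of how a loop that converts a running day count into a year can hang on the 366th day of a leap year. This is not Microsoft's code: the function names, the 1980 origin year, and the iteration cap (added so the example terminates) are all assumptions made for the illustration.

```python
from datetime import date

ORIGIN_YEAR = 1980  # assumed epoch for the day counter (hypothetical)


def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_number(d: date) -> int:
    """Days since the assumed epoch, counting January 1 of ORIGIN_YEAR as day 1."""
    return (d - date(ORIGIN_YEAR, 1, 1)).days + 1


def year_from_days_buggy(days: int, max_iterations: int = 10_000):
    """Return (year, day_of_year), or None if the loop never makes progress.

    The iteration cap exists only so this sketch terminates; a real
    device without one would simply spin.
    """
    year = ORIGIN_YEAR
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            return None  # stuck: treated here as "the device froze"
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # When days == 366 nothing changes, so the loop repeats forever.
        else:
            days -= 365
            year += 1
    return year, days


def year_from_days_fixed(days: int):
    """Same conversion, comparing against the actual length of each year."""
    year = ORIGIN_YEAR
    while True:
        length = 366 if is_leap_year(year) else 365
        if days <= length:
            return year, days
        days -= length
        year += 1
```

With these definitions, `year_from_days_buggy(day_number(date(2008, 12, 31)))` never resolves a year, while the same call for January 1, 2009 does, which matches the roughly 24-hour outage described above. A design review that walked the boundary cases (day 365, 366, and 367 of a leap year) would likely have caught a defect of this kind.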

Microsoft is yearning to expand its market space into embedded systems in automobiles, military systems, and other areas. Am I being overly fearful of the consequences of a Microsoft whose products are even more deeply embedded into the machinery of our lives?  Today is one of those days when I am distrustful of technology as a path for an easier life.

Articles:

Leap year Zune glitch persists for some (CNN)

Original Zune confounded by leap year, shuts down (Seattle Times)

Zune support page explanation

Zune insider blog