Western Canada Water Conference and Exhibit, Saskatoon, Saskatchewan, September 20 – 23, 2011
SPREADSHEET RISK - WHAT IS GOOD PRACTICE?
P. Coleman1
1. AECOM Water
ABSTRACT
Spreadsheet software provides an engineer with a powerful platform to prepare design calculations. However, this is not without risk. Software professionals, partly in response to tighter financial regulations, argue that many spreadsheets have errors in them. This paper (1) raises the awareness of spreadsheet error risk, (2) reviews the role of spreadsheets in design and (3) recommends best practice.
INTRODUCTION
EuSpRIG (European Spreadsheet Risks Interest Group) maintains a list of publicly reported errors (http://www.eusprig.org). The following are examples (EUSPRIG 2009):

The Nevada City budget spreadsheet apparently worked correctly until sometime in late December 2005 when, City finance director Ron Chandler says, it developed a problem, causing the 2006 budget to show a $5 million deficit in the water and sewer fund. Chandler said that it took him most of the day Wednesday to fix the problem. While he was working on it he found some other errors in the spreadsheet that needed to be corrected. "Once it's a PDF it can't change," Chandler said.

The Center for Regional Strategies recently confirmed that a researcher's errant cut-and-paste from a spreadsheet caused one measure of the region's level of educational attainment to appear a lot worse than it is. The Center for Regional Strategies, a self-described "think-and-do tank" housed at Virginia Tech, reported that a dismal 11 percent of the region's population older than 25 had bachelor's degrees or higher. That number should have been 20 percent. Stuart Mease, a spokesman for the Center for Regional Strategies, said: "It was just a simple cut-and-paste error. I don't know how it happened, but it did. We apologize for our mistake and want to correct it."
These and other examples reinforce what the last 15 years of spreadsheet risk research observed (Panko 2008):

Research on spreadsheet errors began over fifteen years ago. During that time, there has been ample evidence demonstrating that spreadsheet errors are common and nontrivial. Quite simply, spreadsheet error rates are comparable to error rates in other human cognitive activities and are caused by fundamental limitations in human cognition, not mere sloppiness. Nor does ordinary ‘being careful’ eliminate errors or reduce them to acceptable levels.

Despite these findings, spreadsheets are still being poorly used (Chadwick and Sue 2001):

The use of spreadsheets in business is a little like Christmas for children. They are too excited to get on with the game to read or think about the 'rules' which are generally boring and not sexy.

This said, spreadsheets are not only integral to the function and operation of the global financial system; they are also the preferred platform for engineering calculations. Herein lies the risk – a powerful tool poorly used without adequate steps to prevent errors. The objectives of this paper are to (1) raise awareness of spreadsheet error risk, (2) review the role of spreadsheets in design and (3) recommend best practice.
SPREADSHEET RISK
There are three reasons why the use of spreadsheets puts an engineering company at risk from design errors:
1. Spreadsheets are a powerful programming platform that is primarily used by non-programmers (Chadwick and Sue 2001)
2. 90% or more of spreadsheets have errors (Bewig 2008)
3. Lack of guidance on what spreadsheet “best practice” is (Grossman 2002)
What non-programmers need to know about building spreadsheets
Computer programming is not just about writing code (Hunt and Thomas 2000):

Programming is a craft. At its simplest, it comes down to getting a computer to do what you want it to do … You try to capture elusive requirements and find a way of expressing them so that a mere machine can do them justice. You try to document your work so that others can understand it, and you try to engineer your work so that others can build on it.

This is why the first two of the forty six tips provided in the Pragmatic Programmer (Hunt and Thomas 2000) are Care About Your Craft and Think! About Your Work. The authors argue that there is no point in developing software unless you are committed to doing it well and think carefully about how you lay out your work. These and other disciplines are second nature to an experienced programmer. This is not the case for many spreadsheet users. Chadwick cites a quote from a KPMG audit report prepared by J. Kavanagh (Chadwick 2003):

‘End users are putting their companies at risk by setting up spreadsheets without realizing that this demands the discipline of traditional programming. Our [KPMG] findings are disturbing, but they are not really surprising, as 78% of models had no formal quality assurance to ensure they were built to specified requirements and were fit for purpose’ [Kavanagh J. 1997]

Therefore, to reduce risk, companies need to establish clear guidelines on how spreadsheets are constructed, checked and documented (Chadwick and Sue 2001):

A major impediment to implementing adequate disciplines, of course, is that few spreadsheet developers have spreadsheeting in their job descriptions at all, and very few do spreadsheet development as their main task. In addition, because spreadsheet development is so dispersed, the implementation of policies has to be left to individual department managers. While organizations might identify critical spreadsheets and only impose hard disciplines on them …, this would still mean that many corporate decisions would continue to be made on the basis of questionable analyses.
What engineers need to understand about the prevalence of errors in spreadsheets
Raymond R. Panko (University of Hawaii) published his seminal paper, What We Know About Spreadsheet Errors, in 1998. In this paper, he summarized a number of studies that identified a high rate of errors among spreadsheets being used for financial reporting. He updated this paper in 2008 and concluded (Panko 2008):

All in all, the research done to date in spreadsheet development presents a very disturbing picture. Every study that has attempted to measure errors, without exception, has found them at rates that would be unacceptable in any organization. These error rates, furthermore, are completely consistent with error rates found in other human activities. With such high cell error rates, most large spreadsheets will have multiple errors, and even relatively small "scratch pad" spreadsheets will have a significant probability of error. Despite the evidence, individual developers and organizations appear to be in a state of denial. They do not regularly implement even fairly simple controls to reduce errors, much less such bitter pills as comprehensive code inspection. One corporate officer probably summarized the situation by saying that he agreed with the error rate numbers but felt that comprehensive code inspection is simply impractical. In other words, he was saying that the company should continue to base critical decisions on bad numbers.

A major impediment to implementing adequate disciplines, of course, is that few spreadsheet developers have spreadsheeting in their job descriptions at all, and very few do spreadsheet development as their main task. In addition, because spreadsheet development is so dispersed, the implementation of policies has to be left to individual department managers. While organizations might identify critical spreadsheets and only impose hard disciplines on them …, this would still mean that many corporate decisions would continue to be made on the basis of questionable analyses.

Following this paper, EuSpRIG (European Spreadsheet Risks Interest Group) was founded in March 1999 as a collaboration between spreadsheet researchers at the University of Greenwich, the University of Wales Institute Cardiff and HM Customs & Excise. Its mission was to bring together academics, professional bodies and industry practitioners throughout Europe to address the ever-increasing problem of spreadsheet integrity. To date, EuSpRIG has identified five risk categories:
Human Error – To err is human, hence the majority (>90%) of spreadsheets contain errors.
Fraud – Because of the ease with which program code and data are mixed, spreadsheets are the perfect environment for perpetrating fraud.
Overconfidence – Because spreadsheet users do not go looking for errors, they don’t find any or many. Spreadsheet users are therefore overconfident in their use of spreadsheets.
Interpretation – Translation of a business problem into the spreadsheet domain can lead to a position where decision makers may act in the belief that decisions can be made with confidence on the output from the spreadsheet despite evidence to the contrary.
Archiving – The case of failed Jamaican commercial banks demonstrates how poor archiving (i.e., version control) can lead to weaknesses in spreadsheet control that contribute to operational risk.
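Panko's remark that high cell error rates make errors in large spreadsheets almost inevitable can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that every formula cell is wrong independently with the same probability; the 1% rate used in the example is a hypothetical figure chosen to show the trend, not a value taken from the cited studies. Under that assumption the probability of at least one error in n cells is 1 - (1 - e)^n.

```python
def prob_at_least_one_error(cell_error_rate: float, n_cells: int) -> float:
    """Probability that a sheet with n_cells formula cells has >= 1 error.

    Assumes (for illustration only) that each cell is wrong independently
    with the same probability cell_error_rate.
    """
    if not 0.0 <= cell_error_rate <= 1.0:
        raise ValueError("cell_error_rate must be between 0 and 1")
    if n_cells < 0:
        raise ValueError("n_cells must not be negative")
    return 1.0 - (1.0 - cell_error_rate) ** n_cells


if __name__ == "__main__":
    # Hypothetical 1% cell error rate, chosen only to show the trend.
    for n_cells in (10, 100, 1000):
        p = prob_at_least_one_error(0.01, n_cells)
        print(f"{n_cells:5d} formula cells -> P(at least one error) = {p:.3f}")
```

Even with a modest assumed cell error rate, the probability climbs quickly with the number of formula cells, which is consistent with the finding that most large spreadsheets contain errors.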
Why is guidance on spreadsheet “best practice” important?
Most engineers accept that there is a right way to lay out design calculations. This right way is often enshrined in a Quality Procedure. The Quality Procedure usually provides instructions on (1) layout, (2) checking, (3) version control and (4) archiving or filing.
Spreadsheet Best Practice documents provide the same type of instructions for the same reasons – first to prevent errors and second to catch and correct errors when they occur. The search for codified best practice is pointless and to be avoided. Instead, there is a consensus among many practitioners as to what are desirable and undesirable spreadsheet characteristics. Therefore, if the user constructs and documents the spreadsheet in a logical manner so that it can be checked, then the application of simple rules as to what should or should not be done will improve the quality of the work that is produced. There are numerous examples of this type of guidance in the literature, e.g., (Bewig 2008) and (O’Beirne 2005). Research shows that best practices can result in dramatically better results than naïve practices (Grossman 2002). Individual performance increased by a factor of 10 and team performance by 3 to 5 times.
DESIGN CALCULATION OR TECHNICAL SOFTWARE
Calculations are (AECOM 2011a):

Computations that transform data inputs into a result that will be used in the work. Calculations include those generated by mathematical or physical methods to determine a result.

Prior to computers, this definition was clear in the sense that calculations were done with a calculator and were written on paper. When computers started being used, technical software consisted of programs where the user input data and the program printed out a result (e.g., process simulator). The user could not change the code that formed the basis of the program. Therefore, once a program was “validated”, its use was authorized until a new version was released. The advent of modifiable technical software (e.g., spreadsheets) blurred the distinction between a design calculation and a piece of technical software. If a spreadsheet was used to produce the equivalent of a paper and pen calculation, then the spreadsheet should be printed off and the paper copy checked in the same way as a paper and pen calculation (AECOM 2011a):

Spreadsheet calculations shall be documented and organized so that formulae used in the spreadsheet can be checked for accuracy of incorporation into the spreadsheet, using a calculator or other method. After validation of the spreadsheet calculations the spreadsheet shall be protected to prevent inadvertent modification of the embedded formulae.

Spreadsheets are capable of much more than this. For example,
A spreadsheet can contain macros. In this case, the macros must be validated and documented first as Technical Software. Only when this is done can the spreadsheet be evaluated as a Calculation.
A spreadsheet can be built once and then used numerous times. In this case, the template would be evaluated as Technical Software. If the sheet is protected, the Calculation reviewer only reviews the inputs and outputs, trusting that the rest of the sheet is correct.
A spreadsheet can be linked to another spreadsheet or another program. However, linking a design calculation to another spreadsheet or program is discouraged because the accuracy of the Calculation is dependent on the integrity of the links in the spreadsheet.
SPREADSHEETS AND VERSION CONTROL
A calculation (1) is prepared and then checked (2). If there are errors, the calculation is corrected. This calculation may go through a number of iterations within a discipline (e.g., process) before it is released (3) to the design team.
Figure 1: Version Control During Development
The header on standard calculation paper contains the name of the originator (author), reviewer (checker) and the date. In some systems, the calculation is assigned a unique number that will follow it through the project.
Figure 2: Calculation Paper Header
Once a new version of the spreadsheet is approved, all previous versions contained in the filing or document management system would be marked “superseded”. If another member of the design team wanted to refer to the calculation, they should always retrieve it from the file or document management system.
This is more difficult to manage if the file is electronic. There is no easy means to determine if the file is the most current because the same file can (a) be on several systems and (b) be modified. The simplest means to manage this is to (a) agree a file naming convention with the date and revision and (b) file it on the server in a read only directory. The current version is always the electronic version stored on the server. It is common practice to ensure that each page of the calculation has in its footer “DOCUMENT UNCONTROLLED WHEN PRINTED” to remind the design team that the current version is the version on the server.

Once a calculation has been developed, it may go through a second process where it is assigned a revision number and is entered into the design basis. Not all calculations form part of the design basis although all calculations may be subject to scrutiny under electronic discovery. The design basis is the collection of material that the design team will refer to when progressing the design. These documents and only these documents are released to subconsultants, partners and clients.
Figure 3: Entering a calculation into the design basis
The calculation is checked and the project manager verifies that the reviewer’s comments have been acted on. The revised calculation is assigned a revision number and entered into the design basis. Spreadsheets present another set of challenges in that many quality assurance systems require that the calculations be printed off and signed and initialed. In this case, the paper, not the electronic copy, is the record document. This said, the electronic version must be kept safe because the next version will be built on it. For this reason, most companies have one version of the calculation cover and quality assurance page for all calculations whether electronic or paper.
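The file naming convention with date and revision suggested above can be made mechanical so that "which file is current" is answered the same way by everyone. The following Python sketch assumes a hypothetical pattern, `<calc-id>_<YYYY-MM-DD>_rev<NN>.xlsx`; the pattern, the function names and the example calculation number are illustrative assumptions, not a convention given in this paper.

```python
import re
from datetime import date
from typing import Iterable, NamedTuple, Optional

# Hypothetical convention: CALC-001_2011-09-20_rev02.xlsx
_PATTERN = re.compile(
    r"^(?P<calc_id>[A-Za-z0-9-]+)_(?P<date>\d{4}-\d{2}-\d{2})_rev(?P<rev>\d{2})\.xlsx$"
)


class CalcFile(NamedTuple):
    calc_id: str
    issued: date
    revision: int


def build_name(calc_id: str, issued: date, revision: int) -> str:
    """Compose a file name that follows the assumed convention."""
    return f"{calc_id}_{issued.isoformat()}_rev{revision:02d}.xlsx"


def parse_name(name: str) -> Optional[CalcFile]:
    """Return the parts of a conforming name, or None if it does not conform."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    try:
        issued = date.fromisoformat(match["date"])
    except ValueError:  # e.g. month 13
        return None
    return CalcFile(match["calc_id"], issued, int(match["rev"]))


def current_version(names: Iterable[str], calc_id: str) -> Optional[str]:
    """Pick the highest revision (then latest date) for one calculation."""
    candidates = []
    for name in names:
        parsed = parse_name(name)
        if parsed is not None and parsed.calc_id == calc_id:
            candidates.append((parsed.revision, parsed.issued, name))
    return max(candidates)[2] if candidates else None
```

Names that do not follow the convention are ignored rather than guessed at, so an ad hoc copy saved elsewhere is never mistaken for the current version.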
Figure 4: Calculation Cover Sheet
The cover sheet typically adds four new pieces of information:
Revision number
Why the calculation was issued to the design team
Who checked the calculation and approved its release
Who, in particular, it is issued to
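The four cover sheet items, together with the rule that earlier versions are marked “superseded” once a new one is approved, can be pictured as a small record structure. This is a minimal Python sketch for illustration; the class and field names are assumptions and do not describe any particular document management system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoverSheetEntry:
    """One cover sheet revision line (field names are illustrative)."""
    revision: int
    reason_for_issue: str
    checked_by: str
    approved_by: str
    issued_to: List[str]
    superseded: bool = False


@dataclass
class CalculationRegister:
    """All issued revisions of a single calculation."""
    calc_number: str
    entries: List[CoverSheetEntry] = field(default_factory=list)

    def issue(self, entry: CoverSheetEntry) -> None:
        """Add a new revision and mark every earlier one as superseded."""
        if self.entries and entry.revision <= self.entries[-1].revision:
            raise ValueError("revision numbers must increase")
        for earlier in self.entries:
            earlier.superseded = True
        self.entries.append(entry)

    def current(self) -> CoverSheetEntry:
        """Return the only revision that is not superseded."""
        if not self.entries:
            raise LookupError("no revision has been issued")
        return self.entries[-1]
```

Keeping the superseded flag on the record, rather than relying on file dates, mirrors the paper-based practice described above.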
It is reasonable to release paper calculations outside the company but it is not without risk to release electronic calculations. Electronic calculations can be (a) modified, (b) misused or (c) contain intellectual property that belongs to the company. Therefore, it is not uncommon that spreadsheet cover sheets include disclaimers like the following:

To address risk of misuse: “The user recognizes that electronic files stored hereon are subject to undetectable alteration and are provided for informational purposes only. The user assumes all risk associated with any unauthorized alteration or reuse, or misuse.”

To protect intellectual property: “The design shown on the document is the property of XX Limited and is not to be used, copied, communicated or disclosed, in whole or in part, except in accordance with a contract, license or agreement in writing from XX Ltd.”
It is clear from the above discussion that both Quality Management Systems and Legal Departments must regularly review how software is used during design. Spreadsheets are just one example where new technology brought new capabilities as well as new risks.
SPREADSHEETS AND PROFESSIONAL PRACTICE
In 2011, the Professional Engineers of Ontario published Guideline: Professional Engineers Using Software Based Engineering Tools (PEO 2011). This document was preceded by the 1993 document Guideline: The Use of Computer Software Tools by Professional Engineers and the Development of Computer Software Affecting Public Safety and Welfare (PEO 1993). These documents reinforce that the user must be qualified, the software validated and results verified.
Figure 5: Use of Software in Engineering Practice
[Figure 5 elements: Qualified User, Verified Results, Validated Software]
In other words,

Engineers must have a suitable knowledge of the engineering principles involved in the work being conducted, and are responsible for the appropriate application of these principles. Engineers are responsible for verifying that results obtained by using software are accurate and acceptable.
The Engineer should be able to:
1. Demonstrate that the right software is being used
2. Explain how the software pertains to the engineering task being executed
3. Prove the user is trained and is provided with adequate support
4. Demonstrate that they have assumed the responsibility for the software’s output
5. Show how new versions of the software are validated and tested before they are used
6. Trace the input data and output to a particular software run
7. Provide evidence that quality assurance procedures are in place and that they are being followed
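Item 6, tracing input data and output to a particular software run, is easier to demonstrate if each run leaves a small record behind. One possible approach, sketched here in Python as an assumption rather than anything prescribed by the guideline, is to store the software name and version alongside SHA-256 fingerprints of the inputs and outputs. The function names and record layout are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict


def fingerprint(data: Dict[str, Any]) -> str:
    """SHA-256 of a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_record(software: str, version: str,
               inputs: Dict[str, Any], outputs: Dict[str, Any]) -> Dict[str, str]:
    """Build a traceability record for one software run (layout is illustrative)."""
    return {
        "software": software,
        "version": version,
        "run_at": datetime.now(timezone.utc).isoformat(),
        "input_sha256": fingerprint(inputs),
        "output_sha256": fingerprint(outputs),
    }
```

If the inputs or outputs filed with a calculation are later altered, their fingerprints no longer match the stored record, which gives a reviewer a simple consistency check.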
Software Validation Example: Upgrade from Excel 2003 to Excel 2007
Item 5 warrants further discussion in light of the upgrade from Excel 2003 to Excel 2007. Excel 2007 was a significant upgrade over the previous versions of Excel. Excel 2007 increased the available columns from 256 to 16,384, the available rows from 65,536 to 1,048,576 and formula nesting from 7 to 64. Excel also introduced a new file format. Microsoft warns the user that (Microsoft 2011):

… For example, previous versions of Excel support a maximum of 65,536 rows in a worksheet. Office Excel 2007 removes this limitation. Upon opening a worksheet with 100,000 rows of data, previous versions of Excel must truncate rows beyond 65,536. … While working in Office Excel 2007, if a user attempts to paste content that is not supported by previous versions (for example, a chart or diagram created in Office Excel 2007), compatibility mode downgrades the content to a form that is recognizable by previous versions. …

Given the differences in the two versions, Good Practice would be to:
save old sheets to the new format and then check the calculations and validate the macros
develop new sheets in Excel 2007 and do not save back to previous file formats
if saving back to a previous version, be prepared to re-check the spreadsheet
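The limits quoted above lend themselves to a simple pre-check before a sheet is saved back to the older format. The following is a minimal Python sketch and not a Microsoft tool: the numeric limits are those stated in the text, while the function name and the idea of passing in the sheet's row, column and nesting counts are assumptions made for illustration.

```python
from typing import Dict, List

# Limits as quoted in the text for Excel 2003 and Excel 2007.
EXCEL_2003_LIMITS: Dict[str, int] = {"rows": 65_536, "columns": 256, "nesting": 7}
EXCEL_2007_LIMITS: Dict[str, int] = {"rows": 1_048_576, "columns": 16_384, "nesting": 64}


def limits_exceeded(rows: int, columns: int, nesting: int,
                    limits: Dict[str, int]) -> List[str]:
    """List every limit that the sheet would exceed in the target version."""
    used = {"rows": rows, "columns": columns, "nesting": nesting}
    return [
        f"{name}: {used[name]} exceeds limit of {limit}"
        for name, limit in limits.items()
        if used[name] > limit
    ]


if __name__ == "__main__":
    # The 100,000-row worksheet from the Microsoft example.
    for problem in limits_exceeded(100_000, 10, 3, EXCEL_2003_LIMITS):
        print("Re-check needed before saving back:", problem)
```

An empty list only means the sheet fits within the quoted size limits; it says nothing about content that compatibility mode may downgrade, so the re-check recommended above still applies.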
SPREADSHEETS AND ELECTRONIC DISCOVERY
The Ontario Rules of Civil Procedure (Ontario 2011) (1.03) define “document” as including “data and information in electronic form,” which in turn includes data and information “created, recorded, transmitted or stored in digital form or in other intangible form by electronic, magnetic or optical means or by any other means that has capabilities for creation, recording, transmission or storage similar to those means.” Rule 30.02(1) requires that “Every document relevant to any matter in issue in an action that is or has been in the possession, control or power of a party to the action shall be disclosed” in discoveries. Thus, if a party to a dispute has an electronic document, anywhere in their possession, it is potentially subject to discovery. Discovery is usually the pretrial disclosure of pertinent facts or documents by one or both parties to a legal action or proceeding. If a litigant is seeking to establish that the defendant is liable for an error or omission in its professional services, it will look for documents relevant to that error or omission, which could include every version of a drawing or specification a defendant has, any shop drawing that they may have reviewed or any request for information they may have obtained. This definitely would include any design sheets or simulation files in their possession.
ERRORS
Panko and Aurigemma classified spreadsheet errors into two categories (Panko and Aurigemma 2010): Culpable Violations and Blameless Errors. A Culpable Violation covers errors related to the user not following company policies (e.g., quality assurance) or, worse, fraud. These are not the types of errors to be discussed in this section. A Blameless Error is an unintended error which may be either quantitative or qualitative (see Table 1). The difference between a spreadsheet with a qualitative error versus a quantitative error is that a sheet with a qualitative error may give, in some instances, the correct result. Therefore, qualitative errors are latent errors (i.e., errors that do not make the current spreadsheet wrong). A common qualitative error is when a parameter (e.g., flocculation well diameter) is calculated in one place but is hard coded into another formula. The error only manifests itself when the sheet calculates a new retention time, changing the flocculation well diameter. Quantitative errors are incorrect formulas or data cells that cause the sheet to give the wrong answer. These may be planning or execution errors. A planning (logic) error is caused by an error in spreadsheet layout or logic while an execution (mechanical) error is caused by a slip or a lapse when the sheet is being created.
Table 1: Classification of Blameless Errors
Type                      Description                                                Note
Quantitative, Mechanical  Pointing to the wrong cell or mistyping a number           Easiest to detect
Quantitative, Logic       Wrong formula because of an error in reasoning             Harder to detect
Quantitative              Something is left out                                      Difficult to detect
Qualitative, Latent       An error that only manifests itself when an input changes  Very difficult to detect
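The latent error described above, a flocculation well diameter that is calculated in one place but hard coded in another formula, can be reproduced outside a spreadsheet. The Python sketch below is illustrative only: the sizing formula, the units, the input values and the hard-coded 5.05 m are all assumptions chosen to show the behaviour, not design criteria from this paper.

```python
import math


def well_diameter(flow_m3_per_min: float, retention_min: float, depth_m: float) -> float:
    """Diameter of a cylindrical well sized for a given retention time (assumed formula)."""
    volume_m3 = flow_m3_per_min * retention_min
    area_m2 = volume_m3 / depth_m
    return math.sqrt(4.0 * area_m2 / math.pi)


def well_area_linked(diameter_m: float) -> float:
    """Downstream formula that refers to the calculated diameter."""
    return math.pi * diameter_m ** 2 / 4.0


def well_area_hard_coded(diameter_m: float) -> float:
    """Same formula with the diameter typed in as 5.05 m: the latent error."""
    return math.pi * 5.05 ** 2 / 4.0  # ignores diameter_m on purpose


if __name__ == "__main__":
    for retention in (20.0, 30.0):
        d = well_diameter(4.0, retention, 4.0)
        print(f"retention {retention:.0f} min: diameter {d:.2f} m, "
              f"linked area {well_area_linked(d):.1f} m2, "
              f"hard-coded area {well_area_hard_coded(d):.1f} m2")
```

With the original retention time the two formulas agree closely, so a check of the current sheet would pass; the discrepancy only appears once the retention time, and therefore the diameter, changes.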
The errors most commonly made by qualified professionals are quantitative unintentional errors. Commonly made errors include:
1. Pointing to the wrong cell
2. Changing some but not all of a series of copied cells
3. Incomplete ranges
4. Temporary fixes (formula changed to value)
5. Confusion between relative and absolute references
6. Incorrect units
7. Function arguments in the wrong order
Unintentional errors are not limited to spreadsheets. An unintentional error caused the collapse of the Iron Workers Memorial Bridge and the death of 19 workers in 1958. In this case, the engineer unintentionally read from the wrong column in a design table. This is analogous to reading from the wrong cell in a spreadsheet.
BEST PRACTICE
At one level, best practice is to document, test/check and control spreadsheets. There are a number of references that discuss these three items, many of which can be found through the EUSPRIG website (www.eusprig.org). A few of the most important principles are listed below.
Principle 1: Design your spreadsheet so that it can be checked!
A spreadsheet developer knows that the spreadsheet will need to be checked if the results are to be used by others on the design team. The onus is on the spreadsheet author to ensure that the sheet can be easily checked. A poorly designed spreadsheet can take 5 to 10 times as long to check as a properly designed sheet. This has a direct impact on the cost to the project. Dermot Balson wrote (Balson 2010):

This particular guideline (to make spreadsheets easy to check and maintain) was fairly easy to sell, because everyone had had the experience of opening up incomprehensible old spreadsheets, and it was hard to argue against making life easier for your colleagues, especially when you might have to check their spreadsheets!
Principle 2: Use version control
Version control tracks who prepared/modified the document, who checked it, who authorized its release to the design team, why it was released and the dates of each of these actions. This information is contained in the Calculation Cover sheet in the document. Version control also provides a means to establish which version of the spreadsheet is the most current version. The author may opt to either keep a record as a separate sheet in the spreadsheet as to what changed from the previous version or insert a note where cells were changed.
Anyone who relies on a result from a spreadsheet must be able to answer the question “Is this the most current version?”
Principle 3: Document the spreadsheet
The spreadsheet should document what the objective of the calculations is, why these calculations are being done, what their scope is, what codes etc. apply, and what might change the applicability of the calculations (e.g., further data from the Client). Dunn recommends that (Dunn 2010):
"If there is the slightest doubt about how the spreadsheet treats each section of the analysis, or if the spreadsheet uses one of a number of possible assumptions about how a particular item or set of items is calculated, this should be documented; the spreadsheet will generally be more readable if there is also an appropriate level of descriptive documentation as an aid to finding the way around."
Principle 4: Be able to check the printed copy of the spreadsheet using a calculator
If a spreadsheet is a design calculation, then it should be possible to check it using a calculator when it is printed off. If this is not feasible, then the spreadsheet either (a) is laid out wrongly or (b) should be viewed as technical software (and should be checked as such).
Principle 5: Must be stand alone
A spreadsheet should be designed to stand alone. If the sheet cites an outside reference, then a copy of the reference should either (1) be pasted into the sheet (e.g., a table from a textbook), or (2) be stored with the calculations on the server. The author recently checked a set of design calculations and found that the reference cited by the calculation's author did not contain the design criteria the calculation claimed it did. This error (and loss of face) could have been avoided if the calculation's author had included the pertinent pages with their spreadsheet.
Principle 6: Do not link to another spreadsheet unless there is no other choice
The Allied Irish Bank fraud cost the bank over US$700 million. The fraudster simply replaced the source spreadsheet that fed the bank spreadsheet with one of his own making. This fraud highlights some of the issues when one spreadsheet reads in data from another spreadsheet. If there is an error in the source sheet, the error will propagate through all the sheets that it feeds into. If there is a circular reference, Excel will not detect it. If the link is lost because one of the sheets is moved, then it will not update properly.
Principle 7: All numbers and sheets should be visible
Quality assurance rests on the premise of transparency – nothing is hidden from the reviewer or user. Hiding cells or sheets will increase the risk of an error occurring. This is what occurred when Barclays Capital purchased assets from Lehman Brothers. On Tuesday, 5 November 2008 (Feechan 2008), lawyers for Barclays Capital appeared before the US Bankruptcy Court in New York to try and extricate the company from taking on Lehman Brothers liabilities accidentally included in a PDF copy made of an asset spreadsheet. The clerk resized the rows to make it easier to read before he printed the spreadsheet to a PDF file. What the clerk did not know was that the sheet contained hidden rows and columns. When he resized the rows, these became visible and were printed to the PDF file. This added 179 contracts that were not to be included in the sale.
Principle 8: Numbers entered only once (and grouped together)
Data and cells should be grouped together in a logical manner so that data is entered once and data/formulae are grouped by function. Dunn (Dunn 2010) provides the following advice:
"Regions of the spreadsheet designed for data input, calculation and the presentation of output to be separated, and it should be readily apparent to the reader where these regions lie. Inputs and calculations to be modular in design, each section representing a logically discrete portion of the total computation. The subject matter of each section should be clear, and it should be obvious to the reader as he scrolls through which section he is looking at any time. Related formulae should be in physical proximity; and the spreadsheet should read from left to right and from top to bottom. In the calculations and inputs, related topics should where possible be recorded adjacent to each other; inputs should be recorded in the same order as are the calculations which draw on them."
Both BioWin and GPS-X, process simulation software, provide examples of how to organize data for wastewater treatment plant design. BioWin organizes parameters under five categories: Kinetic, Stoichiometric, Settling, Biofilm and Other. Similarly, data on process units are organized into three categories: Dimensions or Physical Properties, Flow Split and Operation. Mass balance spreadsheets should use an approach similar to this.
Principle 9: Indicate/protect where cells contain input or formulas
Cells typically contain text (e.g., titles, notes), formulae, links and data. Data may include constants, arguments for functions and input. At minimum, the sheet should clearly mark where data is input into a cell by a user. The literature does not agree on the issue of colour, partly because of the availability of colour printers. That said, colour can be an easy way to identify cells that are inputs and cells which read in data from another sheet in a workbook. Formulae are normally identified by placing an explanation next to the pertinent cells.
Principle 10: Include units and unit conversions
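As a minimal illustration of this principle, the Python sketch below keeps each conversion factor as a named value that carries its units, rather than as a bare number inside a formula. The two gallon factors are the standard definitions; the function name, the unit labels and the 100,000 gal volume are hypothetical.

```python
# Conversion factors kept in one visible place, each labelled with its units.
LITRES_PER_UK_GALLON = 4.54609       # L per imperial (UK) gallon
LITRES_PER_US_GALLON = 3.785411784   # L per US liquid gallon

FACTORS_TO_LITRES = {
    "L": 1.0,
    "gal_UK": LITRES_PER_UK_GALLON,
    "gal_US": LITRES_PER_US_GALLON,
}


def to_litres(volume: float, unit: str) -> float:
    """Convert a volume to litres; an unknown unit label is treated as an error."""
    if unit not in FACTORS_TO_LITRES:
        raise ValueError(f"unknown volume unit: {unit!r}")
    return volume * FACTORS_TO_LITRES[unit]


# Hypothetical design volume, specified in UK gallons.
required_litres = to_litres(100_000, "gal_UK")
# The same number read as US gallons describes a smaller tank.
misread_litres = to_litres(100_000, "gal_US")

ratio = required_litres / misread_litres
print(f"required / misread = {ratio:.3f}")
```

The imperial gallon is about 20% larger than the US gallon, so a number whose unit label silently changes alters the design volume by a comparable amount.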
Unit conversion is a common source of error and should never be embedded in a formula. Unlike MathCAD, all data in Excel is unitless. It is up to the user to track the units and provide the unit conversions. This is particularly true when moving between metric and US customary units. There was a case in Canada where a tank was undersized by 20% because the designer used UK gallons which somewhere in the design process became US gallons.
Principle 11: Beware of functions whose behaviour depends on one of their arguments
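The hazard can be illustrated outside Excel. In the Python sketch below, a hypothetical `year_fraction` function takes its day-count `basis` as a keyword-only argument with no default, so every caller has to state the convention. Only two conventions are sketched, and the 30/360 branch is simplified (it omits end-of-February adjustments); it is not a reimplementation of any Excel function.

```python
from datetime import date


def year_fraction(start: date, end: date, *, basis: str) -> float:
    """Fraction of a year between two dates under an explicitly chosen basis."""
    if basis == "30/360":
        # Simplified US 30/360: every month counts as 30 days.
        d1 = min(start.day, 30)
        d2 = 30 if (end.day == 31 and d1 == 30) else end.day
        days = ((end.year - start.year) * 360
                + (end.month - start.month) * 30
                + (d2 - d1))
        return days / 360
    if basis == "actual/365":
        # Actual calendar days over a fixed 365-day year.
        return (end - start).days / 365
    raise ValueError(f"unknown basis: {basis!r}")


# Hypothetical dates; the basis is stated at every call.
start, end = date(2011, 1, 31), date(2011, 3, 31)
f_30_360 = year_fraction(start, end, basis="30/360")
f_act_365 = year_fraction(start, end, basis="actual/365")
print(f_30_360, f_act_365)
```

For the same pair of dates the two conventions give 60/360 and 59/365, which is why the choice should be a visible input, not a silent default.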
Many functions in Excel have arguments that change their behaviour. If the argument is not specified, the function will default to a preset behaviour. YEARFRAC is an Excel function that calculates the fraction of the year represented by the number of whole days between two dates. The function has three arguments: start date, end date and basis. The default value for the basis argument is a 360-day year (the US 30/360 convention). Good practice is that the user always specifies the argument and the argument appears as an input in the spreadsheet.
Principle 12: Use of rounding functions should be explicit
Excel includes a number of rounding functions (e.g., ROUND, FLOOR, CEILING). When they are used, the raw value should be displayed adjacent to the rounded value. There is a danger that the rounding may not be appropriate for all inputs (i.e., a latent error).
Principle 13: Understand the different effects operations have on absolute and relative references
The literature disagrees about the use of absolute references. The important point is that absolute and relative cell references behave differently when a formula is copied and pasted: relative references shift with the destination cell, while absolute references do not.
Principle 14: Trap errors
There are a number of different techniques for trapping errors (O'Beirne 2009):
Use the built-in Excel error checking
Expected range: check a result to see if it is in an expected range (e.g., >0)
Cross foot: sum rows and columns and check if they are equal
Balance: a mass balance should balance (e.g., COD in – COD out = Methane produced expressed as COD)
Percentages and normalized ratios: they should add up to 100% or 1
Room for expansion: start and end sums at blank cells (to allow room for expansion)
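Several of these checks translate directly into code. The Python sketch below applies an expected-range check, a cross foot, a COD mass balance and a fractions-sum check, and returns a list of problems so that it does not stop at the first one. The function name, argument names, tolerance and example numbers are all hypothetical.

```python
import math


def trap_errors(cod_in, cod_out, methane_cod, fractions,
                row_totals, col_totals, rel_tol=0.01):
    """Return a list of problems found; an empty list means every check passed.

    All loads are assumed to be in the same units (e.g. kg COD/d).
    """
    problems = []

    # Expected range: loads should not be negative.
    for name, value in (("COD in", cod_in), ("COD out", cod_out),
                        ("methane as COD", methane_cod)):
        if value < 0:
            problems.append(f"range: {name} is negative ({value})")

    # Cross foot: grand total from row totals must match that from column totals.
    if not math.isclose(sum(row_totals), sum(col_totals), rel_tol=rel_tol):
        problems.append("cross foot: row totals and column totals disagree")

    # Balance: COD in - COD out = methane produced (as COD), within tolerance.
    if not math.isclose(cod_in - cod_out, methane_cod, rel_tol=rel_tol):
        problems.append("balance: COD removed does not equal methane COD")

    # Normalized ratios: fractions should add up to 1.
    if not math.isclose(sum(fractions), 1.0, rel_tol=rel_tol):
        problems.append("fractions: values do not sum to 1")

    return problems


# Hypothetical numbers: the first case balances, the second does not.
ok = trap_errors(1000.0, 300.0, 700.0, [0.5, 0.3, 0.2], [60.0, 40.0], [70.0, 30.0])
bad = trap_errors(1000.0, 300.0, 500.0, [0.5, 0.3, 0.2], [60.0, 40.0], [70.0, 30.0])
print(ok)
print(bad)
```

Reporting every failed check at once mirrors a spreadsheet in which each check cell is visible next to the values it guards.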
Principle 15: Iterations
Identify areas and provide logic where Goal Seek or Solver are used to obtain a value.
STRUCTURED TRAINING
The primary challenge of training is that most individuals feel they do not need it (Chadwick and Sue 2001). Therefore, training should focus on the safe use of spreadsheets in an engineering environment, similar to the Q-Validus Ltd SpreadsheetSafe course (http://www.spreadsheetsafe.com/) or O’Beirne’s Spreadsheet Check and Control textbook (O’Beirne 2005).
TOOLS
Auditing tools expand on the features in Excel to analyze and audit spreadsheets. Many are designed primarily for tracking and checking financial spreadsheets. A listing of these can be found on the EUSPRIG website. There are a few tools that are applicable to checking design calculations. One is Spreadsheet Detective, published by Southern Cross Software (www.SpreadsheetDetective.com). It reduces the time it takes to check a spreadsheet because it shades cells (identifying the types of cells), lists all formulae, creates a dependency tree for a single cell and identifies links.
EIGHT PRINCIPLES OF SPREADSHEET ENGINEERING
To summarize, there are eight principles critical to spreadsheet engineering:
1. Follow best practices for spreadsheet design and version control as they improve performance and reduce risk.
2. Plan knowing how a spreadsheet will be used by others (e.g., will they use it or will they just use the results).
3. Know what is required of the spreadsheet (e.g., will it be given to a Client or appended to a report).
4. Predict future use and adjust design to account for this (e.g., can the spreadsheet be re-used on another project).
5. Follow good software design practices when building the spreadsheet (e.g., grouping inputs into one part of the sheet).
6. Account for situation-dependent Best Practice requirements (e.g., when can you break the rules).
7. Design your spreadsheet knowing that someone else may modify or check your spreadsheet.
8. Take the time to do it right the first time.
CONCLUSIONS
A spreadsheet is a design calculation, possibly even a piece of technical software. Therefore, Best Practice dictates that spreadsheets be produced and controlled as per a Quality and Document Management System. It takes a lot of effort to develop and maintain sound, proper, and effective spreadsheet practices. The spreadsheet's very ease of use encourages sloppy habits, and even seasoned engineers can find themselves falling into bad habits. At its worst, spreadsheet sloppiness, reflected in poor design, difficult manipulation, and lack of documentation, leads to the question – does a company have effective control over its designs? This is a question that should never need to be asked of a design professional.
REFERENCES
AECOM (2011a). QUALITY MANAGEMENT SYSTEM – IMPLEMENTING PROCEDURES Technical Procedures. Preparation and Review of Calculations. Procedure 4-3.
Balson, D. (2010). Changing User Attitudes to Reduce Spreadsheet Risk. Proceedings of EuSpRIG 2010 Conference. “Practical steps to protect organisations from out-of-control spreadsheets”
Bewig, P. L. (2008). "How do you know your spreadsheet is right? Principles, Techniques and Practice of Spreadsheet Style." from http://www.eusprig.org/hdykysir.pdf.
Chadwick, D. (2003). "Stop That Subversive Spreadsheet!" IFIP, Integrity and Internal Control in Information Systems, Vol 24, pp. 205-211, Kluwer, 2003.
Chadwick, D. and R. E. Sue (2001). "Teaching spreadsheet development using peer audit and self-audit methods for reducing error." European Spreadsheet Risks Int. Grp. 2001 95-105 ISBN: 1 86166 179 7.
Dunn, A. (2010). "Spreadsheets - the Good, the Bad and the Downright Ugly." Proc. European Spreadsheet Risks Int. Grp. (EuSpRIG) 2010 157-164 ISBN 978-1-905404-50-6.
EUSPRIG (2009). "EuSpRIG Horror Stories." Retrieved July 6, 2011, from http://www.eusprig.org/horror-stories.htm.
Feechan, G. (2008). "Hidden spreadsheet rows hit Barclays with toxic Lehman contracts." from http://www.notjustnumbers.co.uk/2008/11/hidden-spreadsheet-rows-hitbarclays.html.
Grossman, T. A. (2002). Spreadsheet Engineering: A Research Framework. Proc. European Spreadsheet Risks Int. Grp. 2002 23-34 ISBN 1 86166 182 7.
Hunt, A. and D. Thomas (2000). The Pragmatic Programmer: From Journeyman to Master. Boston, Addison-Wesley.
Microsoft (2011). "Compatibility mode in the 2007 Office system (Updated 2011-05-26)." from http://technet.microsoft.com/en-us/library/cc178998(office.12,printer).aspx.
O'Beirne, P. (2009). Checks and Controls in Spreadsheets. Proc. European Spreadsheet Risks Int. Grp. (EuSpRIG) 2009.
O’Beirne, P. (2005). Spreadsheet Check and Control. 47 Key Practices to detect and prevent errors. Systems Publishing, Wexford, Ireland.
Ontario (2011). "Courts of Justice Act. R.R.O. 1990, REGULATION 194. RULES OF CIVIL PROCEDURE." from http://www.elaws.gov.on.ca/html/regs/english/elaws_regs_900194_e.htm.
Panko, R. R. (2008). "What we know about spreadsheet errors." Originally published in the Journal of End User Computing’s Special Issue on Scaling Up End User Development. Volume 10(2), pp. 15-21. Revised version (2008) available at http://panko.shidler.hawaii.edu/SSR/Mypapers/whatknow.htm.
Panko, R. R. and S. Aurigemma (2010). "Revising the Panko-Halverson Taxonomy of Spreadsheet Errors." Decision Support Systems 49(2): 235-244.
PEO (1993). Guideline: The Use of Computer Software Tools by Professional Engineers and the Development of Computer Software Affecting Public Safety and Welfare, Professional Engineers of Ontario.
PEO (2011). Guideline: Professional Engineers Using Software Based Engineering Tools, Professional Engineers of Ontario.