PHP Brilliance: Advanced Coding Mojo
Thunder Raven-Stoker
© 2014 - 2016 Thunder Raven-Stoker
Tweet This Book! Please help Thunder Raven-Stoker by spreading the word about this book on Twitter! The suggested hashtag for this book is #phpbrilliance. Find out what other people are saying about the book by clicking on this link to search for this hashtag on Twitter: https://twitter.com/search?q=#phpbrilliance
I would like to dedicate this book to the loving memory of my Father, a man who was very much an engineer in the traditional sense. He was also a man who enjoyed ribbing me with the idea that “a programmer doesn’t make owt! He just pushes buttons all day.” This one’s for you, Dad, with all its incumbent button-pushing.
Contents

The bit at the front
  Preface
  Changelog
Prelude
  More Pub Time
Foundations
  Object Oriented Thinking
  The Four Central Tenets
  Encapsulating our ideas.
  Code, but in the abstract sense.
  Inheriting vast wealth is not always good.
  Prodding the polymorph.
  Talking points.
  Brain Check
Extending our Object Oriented brain
  Progressive progression, objectively.
  More pub time through interfaces.
  Putting a name on our spaces.
  Expressing good traits.
  Finding Closure.
  Talking points.
  Brain Check
Standing on Principles
  Building on bedrock
  Ghostbusters
  Favour Interfaces.
  And favour Composition too.
  Tell, Don’t Ask
  Instantiaphobia
  Do shoot the messenger
  Don’t Talk To Strangers
  Talking Points
  Brain Check
Moving on to SOLIDs
  Introducing the SOLID principles
  Placeholders
  The Single Responsibility Principle
  The Open / Closed Principle
  The Liskov Substitution Principle
  The Interface Segregation Principle
  The Dependency Inversion Principle
  Talking Points
  Brain Check
Applying Software Architectures
  Introducing Architectures
  To MVC or not MVC
  Service Oriented Architecture
  API Oriented Architecture
  The Architectural Fortress
  Talking Points
  Brain Check
DON’T PANIC!
Appendices
  PHP7
  Up next.
The bit at the front
Preface

Thank you for picking up this book. Our journey starts here, just you and me. We’ll be navigating sections and paragraphs, following a treacherous trail. Some have gone before and sadly haven’t succeeded in reaching the final destination. But you, there’s something different about you. I’m not sure what it is yet, but once we’ve embarked on this journey together, I’m sure I’ll be able to put my finger on it. Let’s go.
Who this book is for.

In a nutshell, I’m pitching this book at good developers who are keen to shine, to develop an almost supernatural ability to foresee and avoid future bug scenarios, to be awesome. To be brilliant.

This isn’t a beginner’s guide to PHP, nor is it intended to be a book for mid-level developers. The intended audience that I’m writing for is one that mainly comprises developers at the Senior end of the scale, though I will freely admit that the term Senior Developer means wildly different things to different people and across different organisations. As such, I’m attempting to gear the content towards an anticipated reader who is already familiar with the language and has been using it in an object oriented fashion for a number of years.

How many years? Again, this is an exceedingly difficult thing to quantify. One developer who has spent five years only repeatedly installing WordPress and has not engaged in any personal study won’t be as far along the curve as another developer who has spent the last six months dedicating themselves to accreting the mastery and art of managing components via Composer.

In any event the reader, that is, you, will have a certain amount of commercial or commercial-like experience already under your belt. Quite likely, you’ll have tried and tested a few of the MVC frameworks that litter the PHP landscape, perhaps even have settled on one as being your personal favourite.
You’ll know some of the mantras and perhaps even fervently follow them; always programming to an interface and favouring composition are certainly two qualities firmly embedded in the mind of a developer who exhibits PHP Brilliance. In other words, this book is for you if you’re the kind of developer who left the basics behind a good while ago, who has developed a certain fluency with the language and who generally only consults the PHP manual to check whether it’s needle then haystack or haystack then needle.
Why did I write this book?

Now that’s a very good question and one that comes with a significant number of answers. That’s not me hedging, by the way; there were, and still are, several key motivators behind my taking up this potentially insane project. I’ll set them out below but please do bear in mind that the order is arbitrary - each one carries pretty much the same weight as all of the others.

There’s a massive clue to my first motivator in the preceding section. What on earth is a Senior Developer anyway? The term has an exceedingly fluid definition and, in large part, is highly subjective. A Senior Developer in Company A may only have achieved that rank through time served, whereas a Senior Developer in Company B has gotten there through a demonstrable ability with the language. Even so, the Senior Developer in Company B might only be comparable in ability with a Midlevel Developer from Company C because Company C has an evil bastard of a CTO heading up the tech team. Irrespective of the scale of ranks that individual companies apply to their teams, it’s certainly the case here in the UK that recruiters have much lower standards for who gets put forward for Senior roles. At one point, you only needed to put MVC and Zend on your curriculum vitae / resume in order to get submitted as a candidate for a senior dev position.

Now don’t get me wrong, I’m not suggesting that this book is going to be the new yardstick for what a senior developer is or isn’t. I might be ambitious but perhaps not that ambitious about what can be achieved with this body of work. What I am aiming for though, with regard to this particular motivator, is a way to find a common base for what I believe a senior developer ought to know.

Another motivator, if you like, is perhaps more of a reaction to the sheer volume of poor quality resources that are available online, in books and even through some educational institutions. This is a quiet rebellion! It pains me to think that a newcomer to our wonderful community of PHP developers can go online, search for tutorials on object oriented PHP and end up filling their tender, virginal noggins with all sorts of bad and, in some cases, downright dangerous advice. Setting aside the conspiracy theories about search engines and their page rank algorithms, no-one is yet curating the internet and whilst that is so, there does seem to be some value in collating the good bits. Now I could just build myself a web site and link to all the good stuff but…

There are a lot of very good tutorials out there on the internet. Lots of just the right information. This is great, but they don’t often include the warnings or highlight the pitfalls of the very things that they’re teaching you about. Whilst they tell you what you can do, they don’t always tell you whether you should. They certainly don’t always include the pitfalls and gotchas and things to avoid on this topic or that. This is intentionally a key feature of the book that’s unfolding before you; I want to highlight those very pitfalls and problems that can crop up in each particular topic rather than just tell you that you can do this or you can do that.

A case in point comes early on with the chapter on Abstraction. There is certainly no shortage of tutorials that will tell you that abstraction promotes code reuse and helps avoid code duplication. This is wrong but I’ll save the explanation for the relevant chapter. So in some regards, you might consider this book to be a way to encourage the unlearning of bad habits that many of us have picked up along the way as well as a way to highlight the good stuff.

Lastly, there’s an ulterior motive here, an almost selfish reason for this book.
Over the last few years, I’ve picked over hundreds and hundreds of CVs, examined thousands of lines of sample code and spent hours upon hours interviewing candidates for various positions on the projects that I’ve had a hand in. What a dream it would be were those candidates able to discuss this book with me in an interview. What do they agree with? What don’t they agree with? Why not? I know I said that I didn’t intend for this book to be a yardstick for what a senior developer is or isn’t, but it certainly could be one that would help me gauge what sort of level a candidate is at, irrespective of any superlatives that the recruiter might have used when embarking on a drive to find new talent. With all of this in mind…
What is this book about?

In a word, brilliance. PHP Brilliance is a meta-yardstick that I’ve come to use myself. I say “meta-” since experience tells me it’s going to be a moving target. Five years ago, it would have featured the MVC design pattern quite strongly. Now, the world of PHP MVC frameworks has matured to the point that it’s standing on the threshold of its twilight years. In five years’ time, such frameworks may not even be mentioned, except in the historical sense.

PHP Brilliance then is something that I have grown to consider to be a certain standard of programming knowledge as it applies to the PHP developer today. In its scope, it takes in the core fundamentals of object oriented programming in PHP. It spans numerous principles and methodologies. It covers the pros and cons of various architectural considerations and it examines and extols good working practices.
Publishing, the leanpub way.

I’m a big fan of the lean process model. I’m sure that much will become apparent when we get into the good practices section of the book, and I love what those good folks over at leanpub.com have done to take the concept and apply it to the process of publishing a book. If you’re not familiar with the lean publishing manifesto¹, as put out by leanpub.com, I’m quoting it here. Please do visit the web site for a fuller explanation.

Lean Publishing is the act of publishing an in-progress book using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do.
Ironically, and maybe even hypocritically, I’ve either applied or am intending to apply many software development processes to reach the final goal of a finished book. To begin with, at the time of writing I’ve completed the entire first draft of what I think should be in this book. This isn’t lean, which would naturally abhor such waste.

¹https://leanpub.com/manifesto
This is waterfall, a process that should bring most developers out in hives. To avoid embarrassing acts of self-flagellation, I’m going to term this process “marshalling together all of my anticipated source materials”. It’s my excuse and I’m sticking to it.

Traditionally, of course, book publishing has been entirely a waterfall-like process. Authors dream up a project that they hope at least some portion of the human population will want to read. Publishers buy into this idea (hopefully!). The finished manuscript, guided by an editor, is sent to proof-readers, reviewers and technical editors before being sent to the printers. Hopefully during this time, the marketing and sales machine has been fired up in order to drum up interest before the various distribution channels get fed the stock that they’ve hopefully been eagerly awaiting. There’s a huge gap between the original proposal and the final product, which can be anywhere from 6 to 12 months or more.

For PHP Brilliance, the next step is the first step towards publishing. Before the end of March 2015, the first instalment will be out and available to buy. At this point, I shall start a sprint-like structure where I shall undertake to release updated versions of the book every two weeks. I’m not really adopting sprints though; I’m not even sure I could persuade my cats to partake in a daily standup no matter how many treats I attempt to bribe them with. The two-weekly iterations are, for me, a small way to apply discipline to the process of editing, revising and releasing the book and a big way for me to put up my side of the bargain.

Anyone buying into the publishing of this book will be paying for the completed book before the completed book is done. Through the adoption of a two week sprint cycle, I’m hoping to build confidence in the iterative process of adding value to this particular product. Lean processes are all about delivering value and eliminating waste.
Lean is a cyclical process and where it wins out over Waterfall is the feedback stage. Lean actively encourages feedback from the earliest development stages in order to direct the future progress towards delivering the most value for the end user, whilst simultaneously eliminating as much waste as possible. Traditional publishing gives you the finished product and then collects feedback, at which point the feedback comes too late to guide the content of the product. Lean publishing allows the end user to influence and direct what the final product becomes.

This is where you come in. I absolutely, definitely, passionately want your feedback. Your reviews, your comments and especially your criticisms. If you can write to me with details of what you think is good, what you think is bad, what you think is missing, that would just be so many kinds of awesome. More detail here, less detail there, dial back on the jokes for pity’s sake. If you have the time and inclination to get involved in the feedback stage of the lean process, I would love to hear from you. You can write to me here:
[email protected], ideally putting “PHPBrilliance” in the subject line so that I can filter effectively. If you’re not able to get in on the feedback cycle, well then I hope you enjoy what you’re about to read and find the content to be helpful and valuable.
Who does this guy think he is?

For starters, I love programming. I wrote my first “program” back in 1983 at the age of 12 (so now you know how old I am!) though I would be hard-pressed to call that programming since it was written in Sinclair BASIC on a 48k ZX Spectrum. I’ve had a love of computers and what you can do with them ever since. Which in turn meshed nicely with the increasing availability and popularity of the Internet in the 90s. Sure, some of it has diminished in popularity, especially since the web arrived to send things like WAIS and gopher scurrying back into the dusty halls of academia.

Like many at this time, I learnt to develop web applications using Perl and the CGI. Then PHP 3.x arrived on the scene in 1998 and that kinda changed everything. Already fluent in Java and Perl at that time, I switched (cue tears of misery from Perl and Java advocates!). PHP syntax was so much more comprehensible than Perl’s. Results came far more quickly with PHP than with Java (you don’t have to recompile a PHP application after fixing a typo). It was at that point that web application development became a career choice as well as a hobby. Just in time to catch the first dot com boom. I’ve been doing it ever since.

Aside from being a right tech-head, I also like my beer. And archery. My piano and my playstation. And the vampire genre in all its wondrous and bloody glory. I love my cats too even though they keep trying to get in the way of my writing. And I love the crazy, filthy, chaotic mess that is London, which is why I live here.

Thunder
March 2015
Changelog

One of my readers pointed out to me recently that it isn’t always easy to know what has changed since the last time I pushed an update. As a result, I’ve added this page right at the start, which links the new and updated content since the last release. I hope it helps!

• Jan 22nd 2016 – Updated content
  * Part One Brain Check
  * Part Two Brain Check
  * Part Three Talking Points
  * Part Three Brain Check
Prelude

“Somebody in this camp ain’t what he appears to be. Right now that may be one or two of us. By spring, it could be all of us.” - MacReady
More Pub Time

Before we begin the book proper, there is a key principle that we should look at first. At the time of writing, it is highly unlikely that you have ever heard of this principle before now.

I would like to say that it comes to us as the result of some super secret research conducted over many years by a shady, off-the-books, clandestine government agency and that it somehow fell into my hands. I would very much like to say that, in the interest of the greater good, I am now leaking the details of that principle into the public domain so that it might be read, understood and digested for the benefit of all mankind. I can do no such thing though. Sadly, the reality of the situation is much more mundane. I made it up, specifically to act as the unifying theme for the content that appears within these pages.
Introducing the “More Pub Time” principle

If I were to express the principle in a manner that is both short and to the point, it might be expressed thus:

By pre-emptively acquiring the right knowledge, a programmer may accelerate their own personal development towards mastery and the production of robust, high quality software applications. In turn, common bug-prone scenarios may be avoided and highly valuable pub time preserved.
- The More Pub Time Principle²
I’m not a big fan of slogans though, so let us proceed to break that down into something that makes a great deal more sense.

²https://morepubtime.com
What is the right kind of knowledge?

The principle as stated up there leaves itself wide open to being accused of baldly stating the obvious. If a developer has all of the right kind of knowledge in their head, then it should naturally follow that the quality of their application code will be of a high standard. But what is the right kind of knowledge?

To answer that question, we need to qualify a few things first, including something of a definition for evaluating code quality itself. Fortunately for us, we can stand upon the shoulders of giants and examine the work that has already been done in this arena. That work commonly separates software quality into two kinds:

• The functional quality reflects how well a particular application serves its intended purpose.
• The structural quality reflects on the non-functional requirements and how well the application code supports the delivery of the application’s functional requirements.

Fitness for purpose, as functional quality is commonly known, isn’t something that we will be considering here; building an online word processor that can compete with a desktop version will have very different functional requirements to trying to build out a social network specifically for Alaskan tropical fish enthusiasts. The functional quality of any given software project will be judged by criteria that are very specific to that project. In other words, does it do what it is supposed to?

However, we certainly can examine the non-functional requirements of any given software project in order to judge the quality of the software and its constituent code. Typically, this is done by assessing four key factors.
Reliability

Can we trust our application code to do the right thing? It might sound a bit daft on the face of it but this is something that we absolutely must be able to do. Our users will rely on our application’s code to deliver the functionality that they need, else they will stop being our users and go look elsewhere.
The companies that we work for will rely on our application’s code to be able to deliver the necessary functionality either to increase sales, reduce costs or any other way that might improve the bottom line. Failure to provide for this kind of reliance might result in the company going bust or us getting fired. The keyword here is failure. The reliability of an application, or an individual portion of code inside an application, isn’t just down to how well the code responds when given the right data in a low stress environment - coding just for the happy path does not provide us with the reliability that we crave! No, the reliability of a piece of code or an application as a whole comes down to how well that code responds to errors, failures and defects. If a caching layer is down due to hardware or networking issues, will our application choke and die? Or will it handle the situation gracefully? If a user provides us with garbage in an HTML form, either innocently or maliciously, does our code validate and sanitise appropriately, chucking up the right kind of error message whilst rejecting the content of that form? Modelling and assessing for reliability comes in many different flavours, far too many to discuss here, but they all have a common theme: Does the application do the right thing when conditions are favourable and does it still do something acceptable when conditions are unfavourable? If the application chokes when it is fed with garbage data, if our web site falls over once the database’s connection pool becomes saturated, then our quality in terms of reliability could be said to be very low. As we could be said to be out of a job.
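To make the caching question above a little more concrete, here is a minimal sketch of code that survives a dead cache by falling back to the slower source of truth. Everything in it is an assumption made purely for illustration: the `Cache` interface, the two stand-in implementations and the `fetchProfile()` function are hypothetical names, not part of any real library.

```php
<?php
declare(strict_types=1);

// Hypothetical cache abstraction, invented for this sketch.
interface Cache
{
    /** @throws RuntimeException when the cache backend is unreachable */
    public function get(string $key): ?string;
}

// Simulates a caching layer that is down due to hardware or network trouble.
final class UnreachableCache implements Cache
{
    public function get(string $key): ?string
    {
        throw new RuntimeException('cache backend unreachable');
    }
}

// Simulates a healthy cache backed by a plain array.
final class ArrayCache implements Cache
{
    /** @var array<string, string> */
    private $items;

    public function __construct(array $items = [])
    {
        $this->items = $items;
    }

    public function get(string $key): ?string
    {
        return $this->items[$key] ?? null;
    }
}

/**
 * Returns the cached value when possible and otherwise falls back to the
 * slower source of truth, so a cache outage costs speed, not correctness.
 */
function fetchProfile(Cache $cache, callable $loadFromSource, string $userId): string
{
    try {
        $cached = $cache->get("profile:$userId");
        if ($cached !== null) {
            return $cached;
        }
    } catch (RuntimeException $e) {
        // Unhappy path: note the failure and carry on without the cache.
        error_log('Cache unavailable: ' . $e->getMessage());
    }

    return $loadFromSource($userId);
}
```

Calling `fetchProfile(new UnreachableCache(), $loader, '42')` still returns whatever `$loader` produces; the user gets a slower page rather than an error page. Whether a silent fallback is acceptable depends on the application, so treat this as one possible answer rather than the answer.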
Maintainability

I’ve ranked these four quality considerations in order of importance, which is why maintainability comes in second. Reliability is right there at the top of the list since we need our code to do what it is supposed to, and respond appropriately when it can’t. However, maintainability is a most definite second on the list when considered in order of importance and for very good reason. Whilst the other three members of this list express their focus on a given snapshot of the codebase, frozen in time at any given point, maintainability focusses on both the frozen snapshot and a time-ranged consideration.
This is important for a number of reasons. In an online world, and especially in an online world of applications written for the web using languages such as PHP, the applications themselves very rarely reach a state of completion. Unlike their counterparts in the desktop world, there isn’t a point in an online application’s lifetime where the business declares that the product is done and it’s time to get the DVDs pressed and the product on shop shelves. This is especially true for development teams that use agile methodologies and lean processes, tagging and releasing incremental changes over time in order to improve the product offering for the targeted end user.

This is where the time-ranged consideration comes in. Code that is maintainable is code that is resilient to the inevitable changes that come along. It is the opposite of code that is brittle and/or fragile. It is code that is sufficiently malleable that it will bend and flex appropriately to accommodate the seemingly endless flood of change requests emanating from the business team without hiccup, downtime or the loss of that most valuable commodity: Pub time.
Security

Security is perhaps the most commonly misunderstood aspect of software quality. Naturally as developers we are keen to protect our application’s data from the more malicious elements of the outside world but security has a much broader sphere of influence beyond the common, though no less critical, act of guarding against known attack vectors. As far as security is concerned, it becomes necessary to consider the topic in much broader terms.

One thing that is common to all applications is that they collect, manipulate and store data. It doesn’t matter whether it’s a game, a social network, a corporate intranet or a blogging site, it’s the data that is king in every case. As far as the security aspect of software quality is concerned then, we must widen the definition beyond preventing hack attacks to one that recognises the preservation of data integrity is paramount. If the data within our application is “mostly ok” then we’re in trouble since that implies that some of it is bad. We need all of the data in our application to be good data at all times; we need it to be where it is supposed to be and not end up in places where it’s not supposed to be.
As a result, assessing the security aspect of software quality entails a lot more than making sure we are protecting ourselves against SQL injection attacks, or hashing user passwords properly. It entails making sure we have the right kind of validation in place and that we’ve put that validation code in the right location. It entails making sure that a multi-stage data manipulation process can be rolled back to its original state should one of those stages fail. It entails a lot of things but the final outcome needs to be this: All of the right data in all of the right places and none of it anywhere else. Making sure that we’re ready for the user that has ;DROP TABLE users;@hotmail.com as an email address is one thing, but if a member of our team creates an algorithm that accidentally sets every user’s first name to the four character string “John” then we’re already sailing down crap creek anyway and the paddle is nowhere to be seen.
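The idea of a multi-stage process that can be rolled back, with validation sitting in a deliberate place, can be sketched as follows. This is an in-memory illustration only: `UserStore` and the two stage closures are hypothetical names invented for the example, and an application backed by a real database would more likely lean on that database’s own transactions.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory store standing in for a real database table.
final class UserStore
{
    /** @var array<int, array{name: string, email: string}> */
    private $users;

    public function __construct(array $users)
    {
        $this->users = $users;
    }

    public function all(): array
    {
        return $this->users;
    }

    /**
     * Runs every stage against a working copy and swaps the copy in only if
     * all of them succeed. Any exception leaves the stored data untouched.
     */
    public function transform(callable ...$stages): void
    {
        $working = $this->users; // arrays are copied by value in PHP

        foreach ($stages as $stage) {
            $working = $stage($working); // a throwing stage aborts the loop
        }

        $this->users = $working; // the "commit" happens once, at the very end
    }
}

// Stage 1: normalise e-mail addresses.
$normaliseEmails = function (array $users): array {
    foreach ($users as $id => $user) {
        $users[$id]['email'] = strtolower(trim($user['email']));
    }
    return $users;
};

// Stage 2: validation that refuses to let bad data through.
$rejectInvalidEmails = function (array $users): array {
    foreach ($users as $id => $user) {
        if (filter_var($user['email'], FILTER_VALIDATE_EMAIL) === false) {
            throw new DomainException("User $id has an invalid e-mail address");
        }
    }
    return $users;
};
```

If `$store->transform($normaliseEmails, $rejectInvalidEmails)` throws part-way through, `$store->all()` still returns exactly what it held before the call: all of the right data in all of the right places, or no change at all.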
Efficiency

I’ve known developers in the past who have placed an inordinately high degree of importance on creating highly efficient, fast running, low memory usage code. Ones who have declared that code can only be beautiful when it’s fast. Thankfully, most of us already know better.

In the days, months and years before the explosion of cloud computing and virtualisation, the performance of an application could often be improved by an activity known as “throwing more tin at it”. Adding another server to the rack or installing fast hard disks or more RAM was commonly cheaper than paying a developer’s salary for the time required to optimise inefficient code. Nowadays, we can still “throw more tin at it” but in a virtual sense by spinning up another instance.

Nevertheless, creating code that is performant and efficient is still very important if we hope to preserve our all-important pub time. Code that is already reliable, secure and maintainable can always be optimised for speed and efficiency. But code that is written for speed first can rarely be optimised for reliability, security and maintainability after the fact.

That’s not to say that speed and efficiency aren’t important of course. An online application that takes thirty-seven minutes to turn a request into a response had better be delivering absolutely critical information at the end of the cycle if it has even the slightest hope of retaining users. The reality for the vast majority of online applications is that the request-response cycle needs to complete in a matter of seconds, not minutes.

Fortunately for us in the online world, there are a number of techniques that we can employ in order to improve the perceived performance of an application and therefore improve the user experience. Rather than despatching that account activation email as part of the sign-up procedure, we can queue it for an independent process such as a cron job or a daemon to take care of. Rather than parsing and processing an uploaded CSV as part of the request-response cycle, we can move the file to a watched directory and allow the status of the file’s processing to be reported back to the front end by a series of subsequent, asynchronous calls.

Of course, we can optimise the actual performance and efficiency of our application code and not just the perceived performance and efficiency. The developer that performs a SELECT * FROM users and then loops through the entire recordset in memory just to find the row for the user that has just logged in probably needs his or her butt kicked first and foremost before being taught how to write a WHERE clause. Returning a single row in this situation is clearly much more efficient than pulling out the entire table.

Despite being at the bottom of this list, efficiency is still a very important factor in a list of four factors for assessing the quality of an application and its code.

Software quality is a huge topic in its own right, with hundreds of books and academic papers published on the subject. Here, I’ve only just scratched the surface and only very lightly at that. Nevertheless, I hope I’ve been able to at least impart the vaguest impression of what is meant by quality code being reliable, maintainable, secure and efficient.
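The trick of queuing the account activation email rather than sending it during sign-up might look something like the sketch below. The names `ActivationEmailQueue`, `signUp()` and `drainQueue()` are hypothetical, and `SplQueue` is used only to keep the example self-contained; a real system would want something durable, such as a database table or a message broker, so that queued jobs survive between the request and the background process.

```php
<?php
declare(strict_types=1);

// Hypothetical job queue; in-memory only, purely for illustration.
final class ActivationEmailQueue
{
    /** @var SplQueue */
    private $jobs;

    public function __construct()
    {
        $this->jobs = new SplQueue();
    }

    public function push(string $emailAddress): void
    {
        $this->jobs->enqueue($emailAddress);
    }

    public function isEmpty(): bool
    {
        return $this->jobs->isEmpty();
    }

    public function pop(): string
    {
        return $this->jobs->dequeue();
    }
}

// Runs inside the request-response cycle: record the work and return at once.
function signUp(string $emailAddress, ActivationEmailQueue $queue): string
{
    // ... persisting the new account would happen here ...
    $queue->push($emailAddress);

    return 'Thanks for signing up! Check your inbox shortly.';
}

// Runs later, outside the request, for example from a cron job or a daemon.
// $send is whatever actually delivers mail; it is injected so that the
// sketch does not pretend to know which mailer is in use.
function drainQueue(ActivationEmailQueue $queue, callable $send): int
{
    $sent = 0;
    while (!$queue->isEmpty()) {
        $send($queue->pop());
        $sent++;
    }

    return $sent;
}
```

The user-facing `signUp()` call now does nothing slower than an enqueue, while the cost of talking to a mail server is paid by `drainQueue()` on its own schedule. The actual work has not become any cheaper; only the perceived performance has improved.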
Which brings us closer to considering what the right kind of knowledge might be. Before we do that though, we should take a moment to consider the wrong kind of knowledge.
Hello World!

The learning process is well documented. Not just in programming circles but in any kind of trade or craft, a practitioner will go through a number of distinctive phases. At the start comes the novice, picking up the fundamentals and learning the first principles. It might be horribly presumptuous of me to say this but I’d be prepared to wager that a great many of us began this journey by writing “Hello World” on the screen, all thanks to Brian Kernighan way back in 1973.

Advancing beyond the novice stage is the journeyman, building upon the fundamentals by acquiring additional knowledge and combining it with experience in order to hone their skills and further their personal development before readying themselves for the final stage, that of the master craftsman.

Traditionally, the title of master craftsman would be bestowed upon a practitioner by the guild that oversees their particular trade or craft. When ready, the journeyman would prepare and submit a piece of work as part of the application process and it would be the members of the guild who would judge said piece of work and establish whether they deemed it to be worthy or not.

In the world of online application creation we aren’t so fortunate as to have such a structured approach to personal development; one cannot readily apprentice oneself to an established master craftsman. As such, our journeyman stage is less a joy-filled skip through a sun-drenched meadow and much more a stagger through a perilous swamp filled with traps and pitfalls. It’s also huge, thanks to the Internet.

The journeyman developer is literally inundated with opportunities to advance their knowledge and abilities thanks to the plethora of materials available to him or her. There are books aplenty, there are courses to follow both online and in more traditional bricks-and-mortar learning establishments, there are articles and blog posts that are way too numerous to count.
Which is precisely where the danger lies; the largely unrestrained and unregulated publication of tutorials and training videos presents a very real danger to the journeyman coder. With such an unfathomably large volume of information available, how does the journeyman coder ever hope to be able to sift the good quality information from the bad, especially when the bad keeps coming at an ever-increasing rate?
As a case in point, as recently as December 2015 I encountered two exceedingly poor quality tutorials linked to in developer forums that I frequent. The first showed how to “easily populate variables using the extract() function”, which is a particularly dangerous “trick” to teach without making the reader fully aware of the dangers involved. However, even that pales into insignificance when compared to the second tutorial, which illustrated how to use AJAX to send search terms from an HTML form to a PHP backend that would return the search results back to the front end after querying the database. On the face of it, this seems an entirely reasonable thing to post a tutorial about and as a topic, it has certainly been covered any number of times before. However, in this particular tutorial, the example code illustrated the creation of a SQL query by iterating over the $_POST superglobal and using the values directly and without any form of validation or sanitisation.

For any time-served developer this should be a horrifying state of affairs; that a tutorial posted in 2015 pays no heed to preventing SQL injection attacks, thus proving the proverb “A little knowledge is a dangerous thing”. Yet it is easy to forget that new coders take their very first steps onto a coding career path every single day. Any such hapless coder copying and pasting this tutorial code into their own endeavours would become vulnerable to those same SQL injection attacks, at least until such time that they acquire the right kind of knowledge that would let them guard against these attacks.

Which brings us nicely back around to considering just what does constitute the right knowledge. Regrettably, at the time of writing there isn’t one definitive answer to that question. It is also rather dubious that there ever could be. One of the very great joys that a career in software development brings along with it is the fact that the learning process never ends.
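Returning briefly to that second tutorial, here is a rough sketch of how a search query can be assembled from request data without ever placing a submitted value into the SQL string. Everything named here is an assumption for the sake of illustration: the books table, the SEARCHABLE_COLUMNS whitelist, the buildSearchQuery() helper and the in-memory SQLite connection (which presumes pdo_sqlite is available). It is one way of guarding against injection, not a complete input-handling strategy.

```php
<?php
// Hypothetical whitelist of searchable columns; any other key is ignored.
const SEARCHABLE_COLUMNS = ['title', 'author'];

/**
 * Builds a parameterised search query from untrusted input.
 * Returns the SQL string followed by the values to bind, in order.
 */
function buildSearchQuery(array $input): array
{
    $clauses = [];
    $params = [];
    foreach (SEARCHABLE_COLUMNS as $column) {
        if (isset($input[$column]) && is_string($input[$column]) && $input[$column] !== '') {
            // Column names come from our own whitelist, never from the request.
            $clauses[] = $column . ' LIKE ?';
            // Values are bound as parameters, never concatenated into the SQL.
            $params[] = '%' . $input[$column] . '%';
        }
    }
    $sql = 'SELECT title, author FROM books';
    if ($clauses !== []) {
        $sql .= ' WHERE ' . implode(' AND ', $clauses);
    }
    return [$sql, $params];
}

// In-memory SQLite and a made-up books table keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE books (title TEXT, author TEXT)');
$pdo->exec("INSERT INTO books VALUES ('PHP Brilliance', 'Thunder Raven-Stoker')");

// $untrusted stands in for $_POST, complete with an injection attempt.
$untrusted = ['title' => "x' OR '1'='1", 'sort' => 'title; DROP TABLE books'];
list($sql, $params) = buildSearchQuery($untrusted);
$stmt = $pdo->prepare($sql);
$stmt->execute($params);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

The injection attempt is treated as nothing more than an odd-looking search term and the unexpected sort key is simply ignored. Iterating over our own whitelist, rather than over whatever keys the request happens to contain, is the design choice that matters here.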
No matter how advanced a developer becomes in their mastery of the topic, there is always more to learn. On the one hand, this is because the topic itself is so vast. On the other, it is because the technology progresses relentlessly. Complete mastery of application software development is entirely unobtainable for these two distinct reasons. At any given moment, there is already too much to fit into a single human brain. And today’s latest technique has every chance of being superseded tomorrow.
Despite this, it is my strongly held belief that there is certainly a catalogue of knowledge that every Senior Developer and above needs to have tucked under their belt. Such a catalogue of knowledge can cover a variety of topics, from very specific morsels such as always ensuring that you escape, validate and sanitise your inputs appropriately to much broader concepts such as the key software development principles, rules, laws and even design patterns.

But why? The reason itself can quite simply be stated as this: to build better software. More specifically, to build better software expressly in terms of its structural quality so that it is reliable, maintainable, secure and efficient.

Fantastic! So now that we’ve nailed down the definitions of software quality and the right knowledge needed to achieve software quality…
What the heck is Pub Time?

If we are to preserve our invaluable pub time by pre-emptively learning the right stuff in order to achieve high quality software, we need to know: what the chuffing heck is pub time? This is the more jocular aspect of The More Pub Time Principle, which is precisely why it’ll never be accepted by academicians as a valid software principle, but in brief it goes like this:

Pub time is that time expended in any manner more conducive to a happier and more fulfilling life than staying behind in the office hunting down and fixing software bugs that were entirely avoidable in the first place.

For me, that quite often entails a trip to the pub with my friends to have a catch-up over a drink but it isn’t, by definition, limited to going to the pub. Pub time can entail taking your beloved out for a romantic meal. Or it can mean getting home in time to read bedtime stories to the little ones. Or going to watch a play. Or hitting the gym for a workout. Or firing up your games console in order to blow the heads off of alien invaders.
In fact, engaging in any activity that improves the quality of your life and its enjoyment can be considered as “pub time”, in comparison to late nights wasted hunting down and fixing perfectly avoidable software defects.
Conclusion

The journeyman stage of a software developer’s career is somewhat ill-defined, and this is certainly all the more true for the developer working in the online world. Much of the web is built upon open source languages such as PHP, Ruby and Python to name but three. Each of these languages has a dedicated following and is supported by a fantastic community and as such, there is a tremendous wealth of learning resources available both online and off.

One thing that the Internet will never be able to achieve though is the elimination of the journeyman stage. It simply is not possible for a learner, of any craft or topic, to jump straight from novice to master without having spent time in the journeyman phase. This isn’t a limitation of technology that might be addressed in the future; it’s a limitation of the human mind. Knowledge acquisition is done through the process of learning, leaving us at the mercy of our current biological limitations.

What we absolutely can do though is shorten the path that the journeyman takes. When faced with the boggy swamp that separates the land of the novice from the land of the masters, we can pick out a sure, swift and safe path to follow - one that takes us to all the good places to visit and leads us safely past the traps and the pitfalls. This is entirely what The More Pub Time Principle aims to do in providing its rather nebulous definition:

By pre-emptively acquiring the right knowledge, a programmer may accelerate their own personal development towards mastery and the production of robust, high quality software applications. In turn, common bug-prone scenarios may be avoided and highly valuable pub time preserved.
There are already a number of initiatives currently in play intending to help the journeyman developer find the faster, safer route through the quagmire. One rather excellent resource is the PHP: The Right Way³ project, which I would highly recommend to PHP developers at every level. Another resource is this book.

The goal of PHP Brilliance as a book is to put all of the right kinds of information in front of you: from architectural concerns such as MVC vs Service Oriented Architecture vs Microservices to a thorough examination of the SOLID principles, from looking at abstraction and inheritance in a new light to considering the Frankenstein method of utilising closures. The goal of PHP Brilliance as a state of mind is to ensure that you are equipped to lead the team in creating beautiful code that is reliable, maintainable, secure and efficient.

Life is short so please do take this advice to heart: Preserve your all-too-precious pub time, and do enjoy the journey along the way. Last but by no means least please remember; drink responsibly.

³http://www.phptherightway.com/
Foundations

“I dunno what the hell’s in there, but it’s weird and pissed off, whatever it is.” - Clark
Object Oriented Thinking

Before we can get onto the really meaty stuff, it’s important for us to take a pause and check on our understanding of what object oriented programming is. Even though this isn’t a book for those first starting out in their development career, it becomes all too easy to form a fixed opinion of OO stuff when we first come across the idea as developers. Further to this, there’s a plethora of online tutorials that all take pretty much the same approach to explaining it and, as such, these tend to propagate some rather fixed thinking in this area.

If we’re going to build our palaces and castles in beautiful and elegant PHP code, we must turn our attention first to what we are intending to build upon. In order to raise glorious edifices of logical magnificence that not only blend into the world wide web’s skyline but are actually part of what defines that skyline, we must focus on the groundwork. Our applications need to not only suffer the slings and arrows of outrageous fortune but also stand firm against the whims and change requests of our business teams and product managers. These change requests are the earthquakes and floods that our application development must withstand. We know that they’re going to come. We can brace ourselves effectively against the flood. Just so long as our foundations are rock solid.

So let’s go back to the basics and examine what we already know. All too frequently, a tutorial will take the notion of an object as being a representation of a real world “thing”, and the developer is supposed to hang on to this notion as the author goes on to explain how real world things have a particular set of characteristics and attributes that go on to define what the thing is and what it does. The benefit of this approach is that the examples given are already familiar to the reader and as such allow him or her to connect the concepts with current knowledge and experience.
Anyone setting out to learn object oriented PHP will know what a car is. Or that a dog is a type of animal. For anyone approaching object oriented development from a procedural background, something that is certainly prevalent in the PHP arena, this relationship between code and real-world objects can help the developer reach that “penny drop” moment sooner; that point where he or she will suddenly “get it”.

The danger here though, and it’s a pitfall that many of us have fallen into, is that the developer starts to cling on to this idea of linking the objects they create with real world examples. The next project that they take on, they’ll start hunting down the “nouns” in the project brief and planning their objects around them. This here is a User, that over there is a Product. It’s a perfectly valid start to the process of identifying and designing the objects that will be the key players in our new application. But it is only a start. Unfortunately, that is commonly where the tutorials end.

If you’re going to be guiding and mentoring the more junior members of your team, you’re going to see some quite iffy code along the way. Just to make sure that we’re on the same page, so to speak, I would like to set out the path that we’re going to take in order to reach object oriented thinking. It starts with a shiny new junior, a likeable chap that we’ll call Joe. The route that Joe has taken through the PHP learning landscape in order to arrive at our office OO ready is not an uncommon one. I’d like to say that it’s entirely fictional but that wouldn’t be quite true. You see, Joe’s path was actually the path that I took albeit with some hearty doses of artistic license added here and there. I have no shame.

From the outset, Joe learnt to script in PHP; building out the pages of the sites that he built with one script for each. Here an index.php, there an aboutus.php. Things such as database access and variable assignments could be done at the top of the file, then down below the page itself is built up with HTML and peppered with inline PHP constructs.
At the top, the program logic, at the bottom, the output.

After a while, Joe’s realised that he’s duplicating significant amounts of code across his scripts. This is the point where he starts breaking chunks out into separate files; header, footer, routines for accessing the database, others for building HTML tables. This of course is all in accordance with the tutorials that he has been following regarding the use of the include and require statements. Before long, he’s creating libraries of commonly used functions which he can port from one project to another. Big old PHP files with names such as database.php, html.php and other such collections of useful functions gathered together in a single file.

What happens then when Joe starts reading up about object oriented development? He’s introduced to a Car class that not only has properties for things like the wheels and the engine, but also methods (functions!) for when the car needs to do things like move(), turn() and stop(). Joe thinks it’d be a great idea to wrap his carefully crafted library of functions in class statements.

This now is the point where Joe could take one of two paths. Does he instantiate a Database object in order to use those transplanted library methods? Or does he add the static keyword to the function declarations so that he can call the class methods statically? Well, instantiating the object doesn’t really look to be terribly useful. Let’s go with the static. Now Joe’s coding regularly features things like this:
    include("Config.php");
    include("Database.php");
    $conn = Database::connect($dbname, $dbuser, $dbpass);
Youch.

Eventually our developer will make the transition from wrapping his function libraries in class statements to identifying the nouns in his system and building objects around those. This is very much in line with the tutorials that he has followed. What we see next from Joe is the predominant but natural outcome of those tutorials and their habit of finishing a topic early.

Joe’s classes have become enormous. The classes at the centre of the application, whether this be a User class or a Product class, are truly huge, spanning thousands of lines of code and with methods so long the start and end of them cannot be viewed onscreen at the same time. Scroll, scroll, scroll.

What’s going on here? The developer has fallen back on his procedural code knowledge once more in order to code up object methods that, rather than performing a single function, run through an entire process from start to finish. Perhaps the most obvious example of this is an object method, probably located in a class called ‘User’ and most likely named something along the lines of ‘create’ or ‘register’. I’ll grant you that for many web applications, the user registration process can be a convoluted one, performing validation against a number of submitted form fields, creating a user record along with the login credentials, possibly also storing an address and linking the newly created user to it as well as hooking up any number of configuration settings.

What has happened here is that the developer has taken a procedural process and simply transplanted it into an object method. What used to live as a single PHP page for receiving and processing a user registration has now been transplanted wholesale into a single method in the User class. Not a literal copy and paste operation you understand, but a selective extraction and remodelling of the code to squish it in between those opening and closing braces.

Joe starts by validating the parameters that were passed in to the register() method. If that’s all fine and dandy, he moves onto performing the database ops necessary to get the data into storage and extract the ids. This part may result in just one query being run, or it could be many; the basic user details could be accompanied by a row of default preference settings in one table, a physical home address in another. If that all proceeds ok, Joe then sends out the welcome-cum-verify email before finally returning a true or false back to the original invoker of the method.

In just that one method, we have a minimum of three fracture points - places where the process can fail: the validation stage, the database stage and the email sending stage. This leads to a brittle design that can fall over in a number of ways and be difficult to maintain at the same time.

Now is a good time to introduce a key principle to object oriented thinking.
I don’t recall where I first encountered this one but it has stayed with me ever since. It goes like this:

An object should either know things or do things, but never both.

So many of the applications that we build are going to have a User model object. If such a model object represents what we know about a particular user, and we know that user registration is a process, then it naturally follows that our User model class cannot have a register() method. The more that you think about this, the more it makes sense. After all, why should all of our instances of the User class be lugging around a method whose purpose is to create the user record in the first place? When would such an instance have need of the register() method again?

If you were to take that knows things or does things principle and apply it to the model layer of your most recent application, how many model classes would it suggest that you change? How many of the entities in your model both know the details of the thing that they represent (i.e. hold the data for it) and provide ways to manipulate that data beyond the act of setting and getting it?

In most cases, the primary residents of our model layer will be objects that represent the data that lives inside our application. In this sense, these are the objects that know things. For instance, suppose we have an application that’s going to be handling lots of User instances. We ought to be confident that each instance knows the name, date of birth and email address of the User that it represents. In all likelihood, a full blown application will have User models that hold a lot more detail than that but this will serve us as a good starting point for the time being.

None of these instances should be holding methods that go beyond managing the individual pieces of data that they represent. The methods of our model classes should be entirely introspective. Setters and getters are naturally of this ilk but what about the methods that we can identify as being processors?

What do I mean by processors? Processors are methods that do things. A method that validates user input is a processor. A method that triggers the sending of an email is a processor. In almost every case, unless a processor is specifically introspective, it can be moved out into a new object that’s designed to handle, to encapsulate that process.

For the registration procedure, ideally what we are looking for is a whole range of objects all collaborating in the user account creation process. Each object will have a tightly defined area of focus, performing a single task and performing it well.
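As a minimal sketch of what a pure “knower” might look like, here is a User class holding the three pieces of data mentioned above. The method names are my own assumptions, and getAgeOn() is included only as an example of a method that is introspective without being a plain getter; note the deliberate absence of anything resembling register().

```php
<?php
// A "knower": it holds what we know about one user and nothing more.
class User
{
    private $name;
    private $dateOfBirth;
    private $email;

    public function __construct(string $name, DateTimeImmutable $dateOfBirth, string $email)
    {
        $this->name = $name;
        $this->dateOfBirth = $dateOfBirth;
        $this->email = $email;
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getEmail(): string
    {
        return $this->email;
    }

    public function setEmail(string $email)
    {
        $this->email = $email;
    }

    public function getDateOfBirth(): DateTimeImmutable
    {
        return $this->dateOfBirth;
    }

    // Introspective: derived purely from data the object already holds.
    public function getAgeOn(DateTimeImmutable $date): int
    {
        return $this->dateOfBirth->diff($date)->y;
    }
}

$user = new User('Joe', new DateTimeImmutable('1990-05-17'), 'joe@example.com');
$user->setEmail('joe.bloggs@example.com');
```

Every method here either reports on or adjusts the object’s own data. Nothing reaches out to a database, a mail server or a form submission.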
Having each tiny piece operating as a part of the whole is our goal here. We’re looking for a range of validator objects responsible for checking each part: a password validator can confirm that the offered password has the right number and range of characters, while a date validator can confirm that a submitted date of birth is in the right format and, perhaps importantly, is within the correct range (over 18s only?).

When we take this approach, we’re neatly separating the logic that performs validation away from the logic that performs record creation. Continuing in an ideal fashion, our process for record creation should be nicely squirrelled away and separate from the objects that represent those data records in the first place.

One thing that you may very well notice in this book is how often I’ll draw your attention to apparently circular references. The chapter on inheritance refers you forward to the chapter on the Liskov Substitution Principle. The chapter on the Liskov Substitution Principle refers you back to the chapter on Inheritance and also sideways to the part on favouring composition over inheritance. There are so many ways in which one topic will either rely on or reinforce another that the boundaries start to blur. Wherever the crossovers occur, I will endeavour to point them out to you.
Now that you’ve just read that aside (you did, didn’t you?), I’m going to make my first mention of the Single Responsibility Principle. The Single Responsibility Principle is, to my mind, the absolute single most important one of the five principles that go to make up the set of SOLID principles. It also pleases me greatly that it’s the first one in the set. Familiarity with the SRP can only help to reinforce the idea that our objects should either know things or do things, but never both. If we have objects in our system that know things and do things at the same time, it’s a reasonably safe bet that we’re already violating the Single Responsibility Principle. When we get to that chapter, I hope to make it clear as to why this will be.

Returning to Joe then, we know that his tutorials taught him to build his objects based around the nouns of his system. We also know that those selfsame beginner tutorials didn’t tell him when to stop adding methods to his objects. The good news is though that we’re now in a much better position to enlighten him as to when he’s putting too much into a single class.

Regrettably it’s not so easy to draw a line between the knowing and the doing. Adding processors to an object that’s only supposed to be knowing things is all too easy to do. Worse still, it usually begins with the tiniest little thing and before you know it, the slow but inexorable creep towards bloated classes has begun. How then are you supposed to watch out for this, outside of an all-out code audit? Taking a finger in the air approach, you should start to feel uncomfortable whenever any of the following signs appear in an object method.
Conditional statements such as an if statement or a switch appear in a method, and those conditionals are not used in performing validation but are selecting different logic paths to follow based on an incoming parameter. Try to restrict the use of ‘if’ statements to validation only. In the event that you’re creating branched processes in your code because of the value of a particular property, you’re almost certainly going to be better served by creating an independent object for each branched process and utilising something like the Strategy pattern, or the Chain of Responsibility pattern in order to handle the processing.

You can’t see the start and end of a particular method at the same time. If a single method occupies more than a single screenful in your editor of choice, you have a problem. Look carefully at those methods to see if you can’t at least break them down - the chances are good that they’re doing more than one task. As a general rule of thumb, I’d suggest ensuring that your methods contain no more than twenty lines of active code.

There are lots of comments inside your methods. Nicely documented code is a good thing, but if you find a method that feels the need to explain every step it’s taking, it’s either taking too many steps or the author thinks you’re a numpty. The best methods are nice and short, with easy to follow code and a terse but helpful explanation of the intent in the docblock above it.

These are the types of things that we need to be looking out for when we’re reviewing the code of our more junior team members, and indeed the code that we produce ourselves. Detecting code smells is a knack that comes with both knowledge and experience but just by being aware of these three things, you’re already well on your way. Nevertheless, code smells are certainly rife in PHP.
Somehow it just seems to be something that we in the PHP community have grown up with, although of course there are plenty of examples to be found in other languages too. Even so, they are certainly something that we need to guard against. Much of the advice in the upcoming chapters is geared towards not so much how to avoid code smells, but how to take the right approach to creating an application whose objects don’t stink.

Martin Fowler has an excellent bliki post⁴ on code smells which succinctly explains what they are but in doing so he makes mention of anaemic objects that might benefit from having behaviours added to them. I’m not in full agreement with the brevity of this post since I really am very keen to put forward the notion that any single class, and the objects that are instantiated from it, should have a laser-focussed intent and purpose.

⁴http://martinfowler.com/bliki/CodeSmell.html

If you’re interested, you can find a list of the more common code smells online, but I would actually be keen to suggest that you look them up after we’re done with the first part of the book. Reading about them afterwards is much more likely to reinforce what you will have already read at that point.

Anyway, let’s bring this back on topic. If we were to continue along the path of tightening the focus of our imaginary User class, what questions should we be asking? We’ve already considered the possibility of removing the register() method since we’ve determined that it doesn’t belong within instances of our User class. How about password handling? This is perhaps the second most prevalent wrong thing to be found within a user model. For sure, we may want to accept and hold the hashed value of a user’s password but do we actually want to incorporate the hashing mechanism within the user class itself? The immediate answer seems to be yes, since it’s something that we’ll be doing only in conjunction with the user’s own data.

Nevertheless, we need to consider all of the things that we might want to do with passwords. For starters, we’ll need to be able to accept a password from the user when they register for an account, which then needs to be hashed appropriately. Obviously, we will also need to be able to check a password when they log in, generating a hash of the password that they’ve given us and checking it against the hash that we’ve already stored. Already, we have two processors for the most basic of operations. Experience tells us that we are also going to have to provide some sort of password reset mechanism, since some of our users are likely to fall squarely into the can’t remember passwords for toffee camp. Do we also need to implement a lock-out mechanism after three failed login attempts?
Answering these questions leads us to the conclusion that actually, building password handling logic into our User class is maybe not such a good idea after all. Instead, we can wrap up all of these password related methods inside a new PasswordManager class, instances of which can either be injected into our User instances at creation time, or lazily loaded on request dependent upon our appetite for tight/loose coupling between a user instance and a password management processor. Simply by hiving off two very common processes, that of registration and password management, we’ve not only improved the focus of the User class dramatically, we have also created two additional classes with tightly defined areas of responsibility.
That in itself is as awesome as a large and tasty pint of ice cold beer. Well, maybe almost as awesome - nothing really comes close to a large and tasty pint of ice cold beer. Ever. So where does this leave us now? All being well, we have progressed from the stage where the most basic tutorials leave off. We are now a little better equipped to guard against thinking of our primary classes as silos for the ever expanding lists of processor methods that our application appears to need. This is rather the key point that I want to make at this stage. All too often, it’s terribly easy to get stuck with the real world nouns idea when thinking about the objects that will come into play within our applications, when what we really need to develop is an ability to think of application objects in an abstract sense. There isn’t a tangible real world equivalent for a PasswordManager but if we can successfully keep that notion of objects only being able to know things or do things at the forefront, we’re a long way down the trail of instinctively knowing what should go where.
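To give that PasswordManager a little shape, here is one possible sketch built on PHP’s own password_hash(), password_verify() and password_needs_rehash() functions (available from PHP 5.5). The method names hash(), verify() and needsRehash() are assumptions of mine rather than anything prescribed, and the reset and lock-out mechanisms mentioned earlier are left out to keep things short.

```php
<?php
// A "doer": it knows nothing about any particular user, only how to
// hash and check passwords.
class PasswordManager
{
    private $algorithm;
    private $options;

    public function __construct($algorithm = PASSWORD_DEFAULT, array $options = [])
    {
        $this->algorithm = $algorithm;
        $this->options = $options;
    }

    // Turns a plain text password into a hash that is safe to store.
    public function hash(string $plainText): string
    {
        return password_hash($plainText, $this->algorithm, $this->options);
    }

    // Checks a login attempt against the hash that we stored earlier.
    public function verify(string $plainText, string $storedHash): bool
    {
        return password_verify($plainText, $storedHash);
    }

    // Reports whether a stored hash predates our current settings.
    public function needsRehash(string $storedHash): bool
    {
        return password_needs_rehash($storedHash, $this->algorithm, $this->options);
    }
}

$manager = new PasswordManager();
$hash = $manager->hash('correct horse battery');
$loggedIn = $manager->verify('correct horse battery', $hash);
```

The User instance need only hold on to the resulting hash. Should the hashing algorithm ever need to change, there is exactly one class to visit.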
Summary For our first chapter, I’ve rather concentrated on the idea of grouping our application’s objects into two distinct camps; the knowers and the doers. This is very much the key theme that I would like to introduce at this point. In a very general sense, our knowers are likely to be the principle citizens that reside in our model layer. They hold and represent the data that lies at the heart of our application. These are the users, the products, the orders and the invoices. They present an interface which is designed to set, retrieve, manipulate or transform the individual elements of data that they are responsible for. Then there are the doers. The objects within our system that cause things to happen and whose interfaces comprise methods that we can call in order to trigger those things. These doers might mask the simplest of processes, such as the hashing of a password, or they might be a facade onto much more complex procedures, governing governing the various stages of user registration for example. Clearly then, I’m really quite keen on this idea. Largely because I’ve witnessed the positive effects that it can have. It’s It’s not so much the idea in itself per se, more that the end results speak for themselves. Smaller, tighter, leaner and meaner objects are so much more efficient and maintainable than the alternative: classes treated like silos
into which we’ve dumped great quantities of superficially related methods almost as if we considered the class name to be little more than a namespace for a coagulated library of code.
The Four Central Tenets

If there’s one thing that developers seem to get a bit lax about when they attain the senior developer status, it’s the basics. The simplest bits of object oriented PHP development. I don’t know why this would be. If you take your average Joe Senior and ask him to explain things like encapsulation or inheritance, the responses you’ll get tend to sound like blog post titles. And why not? They are after all, the basics of object oriented development anyway.

Encapsulation? Yeah, you can do data hiding with encapsulation. Abstraction? Great for reusing common bits of code by moving them into a parent class.

A core part of the problem here is that nobody really thinks about the four central tenets of object oriented development once they’ve gotten past the beginner stuff. They read it. They got it. They tried the code examples. Bang. Done. What’s next? This, of course, is entirely reasonable. Why should they? With a constantly evolving landscape and some really exciting new tech stuff just around the corner it’s not exactly easy to persuade your average developer to give up that article on PSR-7 and the implications that it brings for HTTP middlewares in favour of reading about encapsulation all over again. What are you proposing? Kinda sounds like it’s gonna be a drag…

Indeed, I know that there’s quite a hurdle ahead of me by starting a book with programming topics that are decades old. Nevertheless, I’m motivated to turn my readers into winners because it’s the developers that do strengthen their understanding of the basics - those guys, you are the ones that will win in the end. You might feel tempted to go thundering ahead without doing a pre-race check but I can assure you, those wheels are going to come off before you know it.

Instead, the developer that takes his time over the fundamentals and the basics is the developer that will find the middle and end stages of the race, the crucial stages, much smoother and easier. He or she will be the one to go cruising past and on to the finishing line, whilst the other guys? They’ll be the ones bogged down in the pit lanes of bug fixing and maintenance.
The four central tenets are of course encapsulation, abstraction, inheritance and polymorphism; topics that have been around for as long as there’s been object oriented programming. Each one gets its own chapter and I’ll deal with them in that very same order. We should turn the page then, and get cracking with Encapsulation, for I am sure that you are as excited at the prospect of reading about it as I have been in the writing of it. Oh yes.
Encapsulating our ideas.

When was the last time that you thought about Encapsulation? The chances are high that it’s been quite a while, especially if you’re already living among the ranks of the Senior Developer. Encapsulation is the very first thing that you learnt about object oriented development and it’s hardly a topic that you’ll have had cause to revisit very often.

Despite this, I’m keen for us to go over the topic again. After all, I’ve harped on plenty of times already about making sure that we build upon solid foundations and to do that, we need to get the absolute basics down pat. If, in the previous chapter, you silently nodded when I wrote “Encapsulation? Yeah that’s about data hiding” then we do need to address this topic!

Undoubtedly it is with the very best of intentions that many of those tutorials that we’ve all been exposed to tend to present the notions of encapsulation and data hiding/information hiding as being one and the same. Regrettably, not all tutorials do this and it’s because of this, and my desire to get everyone on the same page, that we need to go over it again.

So what is Encapsulation then? Going by a generally accepted definition, it is “the bundling of data with the methods that operate on that data”. See? Nothing in there at all about the hiding of anything. This should naturally gel though with what I was proposing earlier about how the knowers in our application should only be carrying those methods that are responsible for maintaining and manipulating the data that the knowers carry. To give a super-brief example in code, something like this will illustrate the idea of encapsulation.
    class User
    {
        public $name;

        public function setName($name)
        {
            $this->name = $name;
        }

        public function getName()
        {
            return $this->name;
        }
    }
This is nothing but a painfully simplistic class that shows the bundling of data ($name property) with the methods that operate on that data (setName() and getName() respectively). Not the slightest hint about hiding anything going on there, particularly when you consider that the $name property and the two methods are all declared as being public.

So, the guy that gives you the stock response about data hiding? He’s a bit off the mark but who can possibly blame him? Even the Wikipedia page on Encapsulation seems to promote the very same idea. Make no mistake though, when we come to building out robust, scalable and maintainable applications we do actually want to keep the ideas of encapsulation and data hiding together.

So what then is data hiding? For starters, let’s get rid of that term “data hiding” entirely. Correctly, we need to refer to it as information hiding instead, which more appropriately covers both the data aspects and the implemented logic that operates on those data. Information hiding is the process by which we can prevent certain data and logical aspects of our classes and objects from being accessible to their clients and collaborators through the use of access modifying keywords such as private, protected and final. Or to put it in a slightly different way, information hiding is all about the ability to protect the inner workings of our objects from outside interference.

But what benefit is that? The answer lies in the ability to protect the integrity of our data. Nothing is more important to the security of your application than this.
When we consider the topic of security from the perspective of an online application, we often concern ourselves with the issue of protecting that application from outside attack. We want to keep the bad guys out and our databases safe from evil wrongdoings. This is of course how it should be but I do also want to make sure that when we talk about application security, we reposition our view point to that of “data integrity”. Data integrity means not only that it stays where it’s supposed to, that is, inside the storage systems that we design and implement, but also that it remains correct and true.
Before we move on, let us just make sure that we’re clear on what this information hiding malarkey is all about. If we look again at that super simple class above, the $name property is clearly out in the open. Any of our other code, when it sees an instance of this User class could happily change the value of the $name property directly and indiscriminately.

    $user->name = 'Joe';
Pretty basic stuff! We can of course prevent that sort of unrestrained data mangling by making sure that we declare the $name property as being private. When we do, we’re locking it away from public view and subsequently forcing any client code to utilise whatever public interface we provide in order to request access to that data. Of course, such an interface as we decide to provide will be laced with traps such as validation, authorisation and the like.

There are ways of achieving direct access to an object’s private properties but I’ll save those for later and only as cautionary tales. For the benefit of keeping the text flowing, I shall pretend for now that the value of a private property can in no way be modified directly from the outside. Hopefully, you’ll go along with me on that one.
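As a rough sketch of what a public interface "laced with traps" might look like, consider the following. The validation rule (a non-empty string, trimmed) is purely my assumption for the sake of the example, and the final try block assumes PHP 7 or later, where touching a private property from outside raises a catchable Error.

```php
<?php
// Sketch only: the validation rule is an illustrative assumption.
class User
{
    private $name;

    public function setName($name)
    {
        // The "trap": client code cannot store anything that fails this check.
        if (!is_string($name) || trim($name) === '') {
            throw new InvalidArgumentException('Name must be a non-empty string');
        }
        $this->name = trim($name);
    }

    public function getName()
    {
        return $this->name;
    }
}

$user = new User();
$user->setName('  Joe ');
echo $user->getName(), PHP_EOL; // Joe

try {
    $user->setName('');
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e->getMessage(), PHP_EOL;
}

try {
    // Direct access is no longer possible from outside the class (PHP 7+).
    $user->name = 'Mallory';
} catch (Error $e) {
    echo get_class($e), PHP_EOL; // Error
}
```

The only route to the $name property now runs through setName(), so the object decides what counts as an acceptable value.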
So now that I’ve expended a little more than a couple of pages highlighting the differences between encapsulation and information hiding as two independent things, what do I want you to do next? Why, nothing less than put them back together again!
Going forward, I want us to treat encapsulation as if it was both about bundling data and related methods together and making sure that we continue to hide the really sensitive bits. I didn’t want to proceed from this point without having already made the distinction between raw encapsulation as it truly is and information hiding as an independent thing. Now we can all move on safe in the knowledge that from here on in, we’re going to treat encapsulation as being both encapsulation proper and information hiding at the same time. The idea being that those of you that started off in the “encapsulation is data hiding” camp and those of you that didn’t are hopefully now more aligned.
Protecting our data

So our goal then is to protect our data and preserve its integrity. Here’s a novel idea - make everything private. It’s not quite as loopy as it sounds! An object that only has private properties and no publicly available means by which those properties can be manipulated is an object that provides the maximum security for its data. Remember the saying that the safest, most unhackable web server in the world is the one that’s switched off? Well, there is a direct parallel that you can draw here - the safest object that there can be is the one with all private properties and no public methods. If all of the properties and methods of an object are private, then nothing untoward can happen to the data that it encapsulates.

Of course, it’s not easy to imagine any kind of use case for such an object. But let this be the basis for your approach to designing your object. Start with the idea that there cannot be a valid reason for providing any public properties.

One of the greatest strengths of PHP as a language is the speed at which it allows you to develop a working application. Indeed, much of the documentation that comes with PHP’s myriad MVC frameworks describes how to create a working application in under an hour. Paradoxically, this strength, this power of PHP to allow such rapid application development is also the language’s greatest weakness. It actively promotes laziness. The kind of laziness that is anathema to, say, a time-served Java developer. Creating high quality PHP code quite often means trying to avoid following the “happy path” and investing the time in going the long way around. Let’s look at this in greater depth.
Say we have a user object, the properties of which are username, email address and some sort of credit balance. We can reasonably expect that we’d want to render these property values to the user’s browser. By far, the easiest and quickest way to do this would be to make those properties public and then just echo them out at the relevant point in our code. Let’s just take a look at this in code so that we can see what I’m on about.

    class User
    {
        public $username;
        public $email;
        public $balance;

        public function __construct($username, $email, $balance)
        {
            $this->username = $username;
            $this->email = $email;
            $this->balance = $balance;
        }
    }

    $u = new User('joe', '[email protected]', 12.95);
    // ...
    echo $u->username;
This is our super quick and easy way of setting up our object and then rendering one of our object’s public properties. As far as PHP is concerned, it’s perfectly valid code of course. As far as reflecting PHP Brilliance by being part of a robust, secure and maintainable enterprise grade application, it’s nothing like.

So what are we to do? The first step to fixing the user object above is to declare the properties private. This is always the right thing to do. If ours was one of those ideal worlds, it wouldn’t even be possible to declare object properties as being public in the first place. Unfortunately, our world is a little way off from being ideal, so we need to be on our guard against such fundamental mistakes. I say on our guard, but the sooner that every PHP developer starts crafting their class definitions with private
properties, the better. It’s one of those things that ought to be less second nature and more basic instinct. Let’s take a look at that change.

    class User
    {
        private $username;
        private $email;
        private $balance;

        public function __construct($username, $email, $balance)
        {
            $this->username = $username;
            $this->email = $email;
            $this->balance = $balance;
        }
    }

    $u = new User('joe', '[email protected]', 12.95);
    // ...

    // No longer can we do this...
    echo $u->username;

    // More importantly, none of our code can do this
    $u->username = 'Rumplestiltskin';
Ok, now we’re starting to look a little better. With the code above, it should at least be apparent that the username, email and balance properties are protected against any external influences. The class as it stands isn’t exactly usable in this state since we now no longer have any way to access those values but no matter what our team mates do to the application, as long as they’re not editing this class, there’s no way to wilfully or accidentally change these properties at run time once the instance has been created. Brilliant. We’ve created a super secure User class that isn’t particularly useful but hey, at least those tricksy little properties are now safe and secure. Our work is far from done
though - it is at least reasonable to expect that, even if we have no need for displaying those properties, the balance value at least is likely to change through the regular usage of our site. More work? Yes, it’s only right given that we are right at the start of the journey. ‘Tis a winding journey though, and with no guarantees of finding a magical ring to bind them all along the way. You’ll get plenty of cliches along the way, mind. Here’s one. Something about sprints and marathons. Yes, it’s not a sprint.

And in the spirit of the aforementioned cliche, let’s take a small diversion. I’ve seen developers time and time again implement the magic __get() and __set() methods in order to manipulate the values of private properties in objects. Some popular PHP frameworks even support the very same approach, which to my mind is a crime most heinous. Not least because the very popularity of said frameworks means that a lot of PHP developers are being exposed to this approach and many of those may even come to think that it’s the right way to code because Hey, even these guys are doing it. Let’s look at that in code so that we can better see what’s going on.

    // error handling and validation omitted for readability
    class User
    {
        private $data = array();

        public function __set($key, $value)
        {
            $this->data[$key] = $value;
        }

        public function __get($key)
        {
            return $this->data[$key];
        }
    }
There. The developer’s guilt is assuaged. He’s holding the object’s attributes in a private $data array, which is of course the right thing to do, yes? Let me ask you something: Why is this any different to using public properties? The answer of course is that it is not. Sure, we can start adding logic to the __get() and __set() methods
to tighten things up though we’d certainly be fooling ourselves if we thought we were actually improving our code by doing this. This should become more apparent as we get further along the PHP Brilliance trail.

Have you seen the TV Series Lost? Jack, Kate, Sawyer, Locke et al stuck on a mysterious island after a plane crash. The first, second and third seasons of the show make a great deal more sense the second time around. Once you know the ending, the beginning has so many more of those A-ha! moments that make the second sitting enjoyable for reasons different to your first run through. Unavoidably, this book is going to be a bit like that. Admittedly, with fewer polar bears.

To be fair, there’s no real harm in using the magic __get() method like this purely from the perspective of its own operation as long as the method itself does implement some kind of error handling for non-existent keys in the data array. On its own, the __get() method does nothing but yield up values. However, its presence unavoidably lowers the resulting quality of our software, if only by making the merest suggestion that it should also be permissible to implement the corresponding __set(). The very first moment that you introduce __set() into your application’s codebase is the very first moment that you open a huge can of worms and you had better be prepared to cook ‘em and eat ‘em and swallow ‘em down as it will not be very long before the bugs come marching after should you happen to leave any dangling.

When you use the __set() method simply to modify an object’s property values, you’re effectively wasting your time by coding the thing up. Just set the properties to public and be done with it! Of course, I’m not actually advocating that particular course of action. Quite often you’ll see the __set() method get fleshed out with code and logic of varying complexities.
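For the sake of illustration, a fleshed-out __set() of that kind might end up looking something like the sketch below. The keys and rules are invented for this example and are not taken from any real framework.

```php
<?php
// Hypothetical sketch of a "fleshed out" __set(); keys and rules are invented.
class User
{
    private $data = array();

    public function __set($key, $value)
    {
        // One method now carries the rules for every property...
        if ($key === 'email' && filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Invalid email');
        }
        // ...and each change request bolts on another special case.
        if ($key === 'balance' && !is_numeric($value)) {
            throw new InvalidArgumentException('Invalid balance');
        }
        $this->data[$key] = $value;
    }

    public function __get($key)
    {
        if (!array_key_exists($key, $this->data)) {
            throw new OutOfBoundsException("No such property: $key");
        }
        return $this->data[$key];
    }
}

$u = new User();
$u->email = 'joe@example.com';
$u->balanse = 'lots';      // a misspelt key sails straight past the balance check
echo $u->balanse, PHP_EOL; // lots
```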
This might seem like a wise and clever thing to do but let’s face it, every time a change request comes in that calls for this logic to be modified, the opportunity to introduce bugs into the application comes right along with it. Personally, I’d like to strike __set() from the language itself as I don’t believe any good can come of its most common usage. When you have at your fingertips the power to control the setting of each element of data by creating your own specific
and dedicated method for the purpose, the use of the __set() method is naught but a shortcut. In taking that shortcut, you’re effectively declaring that this one method will suffice for all possible current and future needs for setting data within this object. That’s when a change request will pop up that doesn’t quite fit the bill and you find yourself saying “Heck, I’ll just pop a little if() statement in there to handle this one case…”

Those two magic methods are here to stay though, at least for the foreseeable future. I’d advise that you treat them with extreme caution, almost as if they were a pair of snakes. Hold them at arm’s length. By the neck. Whilst wearing some very sturdy snakeproof gloves. Or in other words, I’d settle for it being a rule of thumb that this pair of magic methods only ever be used in very clearly defined sets of circumstances that are appropriate to the application in question. And by circumstances, I don’t mean it’s just quicker to do it this way.

Ok, diversion taken. Let’s press on. If you recall, we’d gotten ourselves to the stage where the username, email and balance properties of our User class were declared as private and that these values were passed in via the constructor. As far as the rest of our application code is concerned, those properties don’t even exist. They’re hidden. Invisible. But what good is that? And how on earth do we use those properties in our application? There are clearly situations that are going to arise where we need to display the username - perhaps on a dashboard or profile page. We can also be certain that we’ll want to send the user an email every now and then, elsewise why even collect that detail in the first place. If these properties aren’t accessible, are not even visible, to the code outside of the User class, how do we get hold of their values? What if we need to update the $balance property following a user’s transaction? Let’s look again at the class that we’ve defined.
    class User
    {
        private $username;
        private $email;
        private $balance;

        public function __construct($username, $email, $balance)
        {
            $this->username = $username;
            $this->email = $email;
            $this->balance = $balance;
        }
    }
We’ve used a technique called constructor injection here in order to get the property values into the object when it’s first instantiated. As a rule of thumb, you use constructor injection only for those properties that the new instance will need to function as a finished object. If we’re saying that our new User instance won’t work properly without the $username and $email properties, then those are the things that we need to make sure are included in the constructor. If however, we are able to say that the $balance property is a thing that we may or may not need to refer to later then we’d take it out. What goes in on the constructor should only be the essentials for the object’s operation in every case. All of the others, such as the $balance property here, might be better served by being lazily loaded on demand. We’ll get to that. Going forward, we’ll drop the optional property and just stick with the $username and $email.

    public function __construct($username, $email)
    {
        // ...
    }
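Just to give a flavour of what "lazily loaded on demand" can mean, here is one possible sketch. It is emphatically an assumption on my part: the setBalanceLoader() method and the callable it accepts are hypothetical stand-ins for whatever would really fetch the balance, and the approach developed later in the book may well differ.

```php
<?php
// Sketch only: setBalanceLoader() and the loader callable are hypothetical.
class User
{
    private $username;
    private $email;
    private $balance;       // stays null until somebody asks for it
    private $balanceLoader;

    // Only the essentials arrive via the constructor.
    public function __construct($username, $email)
    {
        $this->username = $username;
        $this->email = $email;
    }

    public function setBalanceLoader(callable $loader)
    {
        $this->balanceLoader = $loader;
    }

    public function getBalance()
    {
        // Fetch the value on first use, then reuse it on later calls.
        if ($this->balance === null && $this->balanceLoader !== null) {
            $this->balance = call_user_func($this->balanceLoader, $this->username);
        }
        return $this->balance;
    }
}

$lookups = 0;
$u = new User('joe', 'joe@example.com');
$u->setBalanceLoader(function ($username) use (&$lookups) {
    $lookups++;   // stands in for a database or service call
    return 12.95;
});

echo $u->getBalance(), PHP_EOL; // 12.95
echo $u->getBalance(), PHP_EOL; // 12.95
echo $lookups, PHP_EOL;         // 1
```

The object is fully usable with just a username and email, and the cost of finding the balance is only paid if and when it is actually needed.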
In the meantime, alarm bells should be ringing! I have already gone on a bit about how we should be ensuring the integrity of our data but there’s nothing in here yet to guarantee data quality. Sure, the constructor requires two parameters to be passed to it when the object is instantiated and at least, the PHP interpreter will exit with a
fatal error if we don’t do that (strictly speaking, that is the behaviour from PHP 7.1 onwards, where it surfaces as an uncaught ArgumentCountError; earlier versions merely raise a warning and carry on). As humans though, we can look at the name given to those two parameters and intrinsically understand what they are supposed to be. The PHP interpreter doesn’t though. For now, the PHP interpreter will proceed happily just so long as two values are passed when the object is created. If we somehow had code that provided an integer for the first parameter and boolean false for the second, we would end up with an object that would have been happily created but the values of its properties would be inherently wrong.

This is another one of those perceived flaws in the PHP language. Its loosely typed nature allows us as developers to do all sorts of hideously wrong stuff with our coding. Our natural response to fixing up this code would be to start adding validation logic to the constructor in the spirit of declaring that we’ll only accept these parameters if they look like they’re valid. A swift peek at this sort of change might look like this:

    public function __construct($username, $email)
    {
        if (!$this->isValidUsername($username)) {
            throw new Exception('Username parameter is invalid');
        }

        if (!$this->isValidEmail($email)) {
            throw new Exception('Email parameter is invalid');
        }

        $this->username = $username;
        $this->email = $email;
    }
It’s better, I’ll grant you that. As long as those isValidXYZ() methods are implemented correctly, then we’ve tightened up the process of creating an instance of this User object. In theory, any User object instance will have a valid stringlike username property and a valid email address property set within it. However, we’ve just introduced a new pain point into our coding. Throwing exceptions in a constructor, or worse, throwing exceptions in a method called by a constructor is like sticking your head down the waste disposal and hoping for a happy outcome. I’ve only used the most generic exception class here for illustrative purposes since exception handling is a topic worthy of its own chapter, at the very least. Why am
I getting so uppity about throwing exceptions in a constructor? This in itself is a debate that’s gone on for a long time, one that polarises the audience firmly into the yes and no camps. It’s also something that I will deal with properly at length in the chapter entitled Instantiaphobia but for now, I shall just bring up the point that you can’t unit test a constructor properly. There’s no return value. It can’t be mocked. There are better ways of achieving our desired outcome, and I’ll get to those in due course.

Back to our original problem of getting valid values into our object at the point of instantiation. The trick here, the way to tighten things up and prevent the kind of data insecurity that comes with a loosely typed language is to use the type hinting features that the language provides us with. Us PHP developers have not been given type hinting on scalar variables though, not yet at least⁵, which gives us something of a problem to solve. Not that much of a problem though - let’s create validator objects.

    public function __construct(StringValidator $username, EmailValidator $email)
    {
        $this->username = $username->getValue();
        $this->email = $email->getValue();
    }
That’s a little better. Our User class is now delegating the validation process to dedicated objects, ones that we can explicitly type-hint for in the constructor’s method signature. Through delegation, we’re offloading the responsibility for ensuring that the values we need are indeed valid. Additionally, through encapsulation we’re saying “Hey, I don’t need to see your internals to determine whether you’re doing your job right, just let me invoke your public interface so that I can get the values that I need”. In the meantime, let’s hope that those validators are properly unit-tested.

⁵ https://wiki.php.net/rfc/scalar_type_hints_v5#vote

Moving on though, the thing that’s going to let your PHP Brilliance shine through is the way that you seemingly consult your mysteriously mysterious developer’s crystal ball and see how, in two, three or six months’ time, someone from the business team is going to wander up to your desk and say something along the lines of: “I think we need to start recording the date of birth of our users. Oh, and the gender. And what they do for a living. Which country they’re from. Oh, and have they
validated their email address too. Ah, we’d best also have hair colour, eye colour and preferred brand of toothpaste.”

You know it’s true. It doesn’t matter whether you follow agile or waterfall development methodologies - the business team will be giving you change requests mid-cycle regardless. The problem here is that that constructor is going to get very large and very messy in a surprisingly short space of time. Jokingly, I listed an extra eight properties that the business team wanted to add to our user profile. If we just kept adding them we would end up with a constructor comprising thirty lines of code for validation and an additional ten lines of code for the actual property assignments. In other words, a maintenance nightmare.

So what do we do? One approach that we might consider taking is to replace the list of property specific parameters with a single array parameter. After all, we can type-hint safely on an array still.

    public function __construct(array $userData)
    {
        if (!$this->isValid($userData)) {
            throw new Exception('Provided user data array is invalid');
        }

        $this->username = $userData['username'];
        $this->email = $userData['email'];
        // ...
    }
Now at least our constructor is much more succinct, even though we’ve clearly just taken the retrograde step of reintroducing exceptions to our constructor. Nevertheless, we could even strip out the individual lines of property assignment by holding the $userData array itself as a private property instead. This is certainly a step in the right direction, which would leave us with a User class that looks very much like this:
    class User
    {
        /**
         * @var array $userData
         */
        private $userData;

        public function __construct(array $userData)
        {
            if (!$this->isValid($userData)) {
                throw new Exception('Provided user data array is invalid');
            }

            $this->userData = $userData;
        }

        /**
         * @param array $userData
         * @return boolean
         */
        private function isValid(array $userData)
        {
            // array validation occurs here
        }
    }
Our class file just got so much shorter. It also got so much easier to read as well. Shouldn’t we be celebrating around about now? Especially when it’s also perfectly clear that the next time someone from the business team comes up to our desk, adding a new data element for our user object to handle ought to be a simple case of adding another piece of validation to the isValid() method.

But therein lies our problem too. We’ve moved the validation process back inside the User class again. This gives us a number of problems to think through. It should be pretty easy to imagine that the next thing we will need from our users will be a mobile phone number for sending SMS messages to. With our code as it stands, we’d most likely add the phone number as an additional element in the $userData array and then proceed to add lines to the isValid() method to validate the phone number that we’ve been given. Bang! Just like with the email property, we’ve got another value being validated inside the User class. A type of value that, in all likelihood, we’ll also want to validate
elsewhere. So, what do we do? Naturally, we create a PhoneNumberValidator class so at least the validation logic is reusable in any number of different situations. The problem that we’re then faced with is how do we get an instance of a PhoneNumberValidator into the User object in the first place, now that we’re no longer using those handy validators as constructor params? One thing that we do not do is this:

    private function isValid(array $userData)
    {
        // ...
        $pnValidator = new PhoneNumberValidator($userData['mobileNumber']);
        // ...
    }
That single line of code above leads to something known as tight coupling; something that we want to avoid. What we’re saying here is that, with this line in place, the class User now depends on the PhoneNumberValidator class. This validator class is, in all likelihood, located in a separate file. (If it’s not, then you’ve definitely got other problems too!). This one line ties the two class files together in such a way that the User class can’t exist in an application without the validator.

Of course, you could argue that if we have a User class that needs access to a validated phone number, then both classes are likely to exist within the same application anyway. The situation that we’ve created here is actually a little bit worse than that though. In the User class, we’ve not only declared a hard dependency on the PhoneNumberValidator class. We have also hard coded in the way that the validator is instantiated by passing one of the $userData array values to the validator’s constructor. This is one of the most common pitfalls that both junior and mid-level developers will succumb to. If the signature of that constructor method ever changes, then we have at least one location where our application is broken. Not only that but it’s broken in that sweet old classic way that leads to the expression “Well, I fixed this bug here, but now that bit’s broken over there.”

“Aha!” I hear you exclaim, “But in such a situation, one where I was modifying the constructor of a particular class, I would naturally grep the codebase and fix up every location in the codebase where objects of that class are instantiated.”
Which is fine of course. At least until we reach that point where a colleague starts instantiating objects from class names held in variables. You know the thing that I mean: Dynamic class resolution.

    // Scary code
    $className = $propertyName . 'Validator';
    $validator = new $className($propertyValue);

    return $validator;
The only places where things of this sort should be happening are inside factories, abstract or otherwise, and where the object returned conforms to a clear and well understood interface. If it happens out in the wilds, in the code that forms the application’s normal process flow, then you’re, well, buggered to put it bluntly. The game over bell has knelled, it’s time to start proposing the next rewrite.

Instead of trying to wiggle out of it, let’s just say that we’re not going to do it. The concepts of loose coupling and dependency injection will be discussed at length later on in this book. So too will the idea of developing an allergy to using the new keyword. For now though, I’d just like you to take it as read that instantiating validator instances inside our User class is a strictly no-no scenario. I say ‘take it as read’, but if this is already your second or more reading of this book, you’ll already know why this should be so. See? I told you. Polar bears!

So how do we get the validator into the User class then? One answer, certainly, is to pass it in on the constructor as well. Our next refactor does exactly this.

    public function __construct(array $userData, UserDataValidator $udv)
    {
        if (!$udv->isValid($userData)) {
            throw new Exception(
                $udv->getErrorMsg(),
                $udv->getErrorCode()
            );
        }

        $this->userData = $userData;
    }
Look at what we’ve done there! The suggestion here is that all of the validation logic for the $userData array is tucked away nicely in the UserDataValidator object. As long as we can have faith in this validator and its operation, we’re initialising our User object with validated data in a very succinct fashion. Furthermore, any future changes to the structure of the $userData (Y’know, this week the biz team wants ‘favourite colour’ recording. Next week, it’ll be ‘Shoe size’) will result in exactly zero changes to our User class as it stands now, which can only be a good thing, right?

Still, we can take this one step further. We can combine the $userData array and the UserDataValidator into a single object. Let’s call it a UserDataTransport object and code up the changes like this:

    class User {

        /**
         * @var UserDataTransport
         */
        private $userData;

        public function __construct(UserDataTransport $udt) {
            $this->userData = $udt;
        }
    }
Finally, we’ve arrived at a very short and sweet User class which correctly holds its validated user attributes in a private property but in such a way as to not clutter up the User class itself with validation routines that we might very well need elsewhere too. From the perspective of our User class, it’s now being initialised correctly with valid data since we are also trusting the $udt instance to have done its job correctly prior to being passed in. Some of you might be thinking at this point that all we’ve really achieved here is to move the methods that should be in the User class away into something else, and then proceeded to put a rather useless shell, called User, on top of that. If this was the end of our journey then you’d be quite right. It isn’t the end of the journey though. Whilst we are indeed approaching the final straight, we have gotten to the point where we should take a pause, make a nice cup of tea and cogitate a bit.
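For anyone wondering what might live inside such a transport, here is a minimal sketch. The class has not been shown so far, so everything inside it is an assumption made purely for illustration: the field names, the validation rules and the error codes are all hypothetical. The only things borrowed from the surrounding text are the class name and the idea that the transport, not the User, does the validating.

```php
<?php
// Hypothetical exception type raised when the transport rejects a value.
class DataTransportValidationException extends Exception {}

// Hypothetical sketch: carries the user's data and guards its validity.
class UserDataTransport {

    private $data = array();

    public function __construct(array $userData) {
        // Assumed rule: a username must be present and non-empty.
        if (empty($userData['username'])) {
            throw new DataTransportValidationException('Username is required', 1);
        }
        $this->data['username'] = $userData['username'];

        // Reuse the setter so the email rule lives in exactly one place.
        $email = isset($userData['email']) ? $userData['email'] : '';
        $this->setEmail($email);
    }

    public function getUsername() {
        return $this->data['username'];
    }

    public function getEmail() {
        return $this->data['email'];
    }

    public function setEmail($email) {
        // filter_var() returns false when the value is not a valid email.
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new DataTransportValidationException('Invalid email address', 2);
        }
        $this->data['email'] = $email;

        return $this;
    }
}

// Construction succeeds only when the data passes every rule.
$udt = new UserDataTransport(array(
    'username' => 'joe',
    'email'    => 'joe@example.com',
));
```

Because the constructor refuses bad data outright, any code that is handed a UserDataTransport can assume that it is holding something valid, which is exactly the trust that the User class relies upon.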
If you look at what we’ve done here, you can think of that $udt instance as a seed, a kernel, a unit that simply represents the data that we want to persist to whatever storage systems we need to implement in our application. This is a very interesting point that we’ve arrived at. Our persistable data is now encapsulated in a discrete unit that is both separate from and consumed by our model class. This is a key design goal that we want to achieve when building out a large scale application. We want our model classes to concentrate on behaving like the model classes that they are supposed to be. Taking the approach of injecting something like a data transport object into a model class allows us to divorce the model layer from the implementation details of how we handle data persistence.

If you’re remotely familiar with one or more of the PHP application frameworks that are out there, you will probably be aware of the many and varied approaches that they take to implementing persistence when it comes to the model layer. In some cases you are expected to implement application models by extending a framework-provided base model. In others, you are expected to munge the design of your model classes in order to expose the properties that they manage to decorators and adapters. In either case, you’re required to adapt your model design to their persistence mechanism and in doing so, you’re expected to eschew at least one sound design principle or another. To my mind, this not only sucks but it also blows.

By injecting the requisite kernel of data into your model classes, you free yourself from such principle-busting constraints. Your models can properly encapsulate the business logic that they were intended to and the vagaries of storage solutions are immediately and correctly relegated out of the model layer and into a persistence layer where they truly belong. What remains for our User class though?
We are still at the point where we are only seeding our model class with the data that constitutes the entity itself and are yet to provide the means of accessing and manipulating the data. Our starting point is through the provision of a standard set of mutators – the setters and getters with which we should be all too familiar. To support these, we’ll also need a private getter for the data transport object itself. Here, let me show you in code.
    class User {

        /**
         * @var UserDataTransport
         */
        private $userData;

        public function __construct(UserDataTransport $udt) {
            $this->userData = $udt;
        }

        private function getDataTransport() {
            return $this->userData;
        }

        public function getUsername() {
            return $this->getDataTransport()->getUsername();
        }

        public function getEmail() {
            return $this->getDataTransport()->getEmail();
        }

        public function setEmail($email) {
            try {
                $this->getDataTransport()->setEmail($email);

                return $this;
            } catch (DataTransportValidationException $e) {
                $msg = $e->getMessage();
                $code = $e->getCode();

                throw new ModelException($msg, $code);
            }
        }
    }
There are a few things to note here. Firstly, since we don’t currently have a use case for getting the data transport back out of the model entity, I’ve set the access modifier
to the getDataTransport() method to private. Our only use case for this method thus far is to serve the user’s data carrier to the getters that form our User class’ public interface. This is as it should be – remember that we’re keen to keep our methods private right up to the point where we have to open them up to collaborators. Additionally, I’ve only provided one setter method here, being the one for the email property. Unless your application supports a user being able to change their username, the chances are good that providing a setter for the username property would be superfluous to requirements. Even so, with the email setter I’m still not providing any validation within the setter method itself. There is a very strong case for this. If we are confident enough that the data transport object itself provides everything needed to ensure that the data it carries is valid and correct, confident enough to allow it to represent our entity’s data at the point of instantiation, then we should also have enough confidence in its integrity to be able to delegate the validation responsibility to it when setting individual properties. For this setter method though, I’ve implemented some raw exception handling. This is simply to illustrate the fact that we recognise that the data transport instance could balk at the value that we are passing to its setEmail() method and subsequently emit an exception. Additionally, since we are crossing application layer boundaries, we are recognising the need to recast the exception to a new type. There’ll be more on exception handling later on in the book. Ok, put the champagne on ice. We are almost there! Our User class is filling out nicely now with the provision of the three public methods that represent the object’s interface. We could certainly go on to add any further getters and setters as our application requires but I shall leave that for you to experiment with rather than take up any further space here. 
To do so would only serve to belabour the point. Up until this point, I’ve largely only concentrated on the objects that know things. This is primarily because it’s such a critical area of concern. I’ve been keen to stress that we shouldn’t allow our knowers to get bogged down with large quantities of bloated methods, the reasons for which are, I hope, abundantly clear. This does lead us to building some rather anaemic data classes though. Nevertheless, skinny (rather than anaemic) data classes are exactly what we need if we are to weather the storms that change requests bring.
Let’s switch our focus now then to the doers, the processors of our application. We have already discussed how we might break out the password handling processes away from our User class through the implementation of a PasswordManager class. To start us off down this route, let’s take a look at how this might be done in code.

    class PasswordManager {

        const PASSWORD_MANAGER_COST = 10;

        public function getPasswordHash($rawPassword) {
            $options = array(
                'cost' => self::PASSWORD_MANAGER_COST
            );

            return password_hash($rawPassword, PASSWORD_BCRYPT, $options);
        }

        public function verifyPassword($rawPassword, $hashedPassword) {
            return password_verify($rawPassword, $hashedPassword);
        }
    }
As you can see thus far, our PasswordManager class has a very narrow expression of intent. Even without the attendant docblocks that should be included within this code, you can quite clearly follow the path of its execution. We’ve successfully encapsulated the two principal methods of ensuring that we’re using the best possible password treatment into a single unit with a very definite purpose. From here, it’s a simple step to implement the other password_hash() related functions that PHP provides in order to complete the utility of this class definition.

Incidentally, if you take nothing else away from this chapter, you could do much worse than to copy and paste that PasswordManager class into your own applications. If you’re not already using the password_hash() functions provided by the language, you should. I’ll explain why later on as this very topic gets its own chapter in the Practices section of the book.
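As a rough sketch of those other functions in action, the snippet below exercises password_needs_rehash() and password_get_info() alongside the two we have already wrapped. It is written procedurally to keep it short; the sample passwords and the bumped cost of 11 are made up for the example, and each call could just as easily become a method on the PasswordManager class.

```php
<?php
// Assumed cost, matching the constant used in the PasswordManager listing.
$options = array('cost' => 10);

$hash = password_hash('correct horse', PASSWORD_BCRYPT, $options);

// password_verify() compares a raw password against a stored hash.
$matches = password_verify('correct horse', $hash);
$rejects = password_verify('wrong guess', $hash);

// password_needs_rehash() reports whether a stored hash was produced with
// a different algorithm or options than the ones we want to use today.
$stale = password_needs_rehash($hash, PASSWORD_BCRYPT, array('cost' => 11));
$fresh = password_needs_rehash($hash, PASSWORD_BCRYPT, $options);

// password_get_info() exposes the algorithm and options embedded in a hash.
$info = password_get_info($hash);
```

The practical use of password_needs_rehash() is at login time: once a raw password has been verified, a stale hash can be quietly replaced with one generated under the current settings.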
Following the same principles, there’s no reason why we can’t return to that $balance property that we eliminated early on and proceed to code up a BalanceManager class, one that encapsulates everything we need to know about handling a user’s balance. Running an ecommerce site? There’s every chance that you will want to allow your customers to specify more than one billing and delivery address. Doesn’t that then lend itself admirably to the provision of an AddressManager class?
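To give a flavour of what a BalanceManager could look like, here is one possible sketch. None of it comes from a real application: the method names, the InsufficientFundsException type and the decision to hold amounts as integers in minor units (pence, cents) are all assumptions made for the sake of the example.

```php
<?php
// Hypothetical exception for an attempted overdraw.
class InsufficientFundsException extends Exception {}

// Hypothetical sketch of a class dedicated to balance handling. Amounts
// are integers in minor units to sidestep floating point rounding.
class BalanceManager {

    private $balance;

    public function __construct($openingBalance = 0) {
        $this->balance = (int) $openingBalance;
    }

    public function getBalance() {
        return $this->balance;
    }

    public function credit($amount) {
        $this->guardAmount($amount);
        $this->balance += $amount;

        return $this;
    }

    public function debit($amount) {
        $this->guardAmount($amount);
        if ($amount > $this->balance) {
            throw new InsufficientFundsException('Balance too low');
        }
        $this->balance -= $amount;

        return $this;
    }

    // Assumed rule: only positive whole amounts are accepted.
    private function guardAmount($amount) {
        if (!is_int($amount) || $amount <= 0) {
            throw new InvalidArgumentException('Amount must be a positive integer');
        }
    }
}

$manager = new BalanceManager(1000);
$manager->credit(250)->debit(400);
$remaining = $manager->getBalance();
```

The point is not the arithmetic but the boundary: every rule about what may happen to a balance sits in one small class, and the User class never needs to hear about any of it.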
Summary We’ve taken a rather long and winding route to get here. At the beginning, I pointed out that encapsulation in the purest sense of the term is not the same as information hiding . And then I asked you to treat the two things as being one and the same. I rather hope that, instead of thinking of me as some sort of nutter, you can understand my intent – that everyone who finishes this chapter, irrespective of their starting knowledge, shares a common understanding at the end: that when we talk about encapsulation in our daily work, we’re actually referring to the two concepts bundled as one. A shared vernacular, a common tongue, is an important factor in collaborative software development. What I hope to have achieved here is just a little piece of that ideal. On our way through these pages, I’ve explored the evolution of a rather typical, though extremely skinny, model class and in doing so, I seem to have been overly keen on splitting off individual responsibilities into smaller, discrete bundles of code. There are some ancillary motivations for having done so and I’ll own up to them now. On the one hand, treating classes like they are silos into which we can merrily dump utility methods is a habit that I heartily believe we should all leave behind. All too often, I’ve encountered some truly monstrous class definitions, both in quantity of methods and in the size of the individual methods themselves. This is indicative of either poor initial design at the outset or of poor technical leadership during the build. It’s not uncommon for the more junior members of a development team to follow prior example and simply add new methods to the classes that are already there rather than take the bolder step of creating dedicated classes for dedicated streams
of work. This isn’t the fault of the junior guys of course, but of the guy(s) leading the development, those that are meant to be steering the architecture, those that are meant to be responsible for mentoring the team. On the other hand, I’ve also been laying down the groundwork for the topics that we’ll cover when we get to the Single Responsibility Principle. By the time that we get to that one, you should find that you are already prepared for it. Nevertheless we should all be at a point of enjoying an expanded consideration of encapsulation, expanded in relation to the seething masses out there in PHP land. Welcome to the club. I hope you enjoy your stay.
I made mention earlier that there were ways to access the private properties of one object from the outside. One of those ways is pertinent to the topic of encapsulation, so I shall mention it here. A good many of you will already know this but I’m going to cover it anyway as I think it’s appropriate that everyone be aware of it. It goes like this:

When you have two object instances of the same class, one of those instances is able to both get and set the private properties of the other instance as if they were public properties.

The basic premise being that if an object has intimate knowledge of its own internal workings, it follows that it has the same intimate knowledge of the internal workings of instances of the same class. It makes sense when you think about it but as they say, a code snippet is worth a thousand words so here’s a practical example of this.

    class Point {

        // Properties declared as private
        private $latitude;
        private $longitude;

        public function getDistanceTo(Point $p) {

            // Access the private properties
            // of the incoming Point instance
            // as if they were declared public
            $remoteLat = $p->latitude;
            $remoteLong = $p->longitude;

            // Perform distance calculation ...

            return $result;
        }
    }
There you have it. The original Point instance can treat the private properties of the incoming Point instance as if they were public since both are instances of the same class. This is not uncommon amongst object oriented languages, but it’s also rarely highlighted as a feature.
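Since that listing leaves the calculation itself to the imagination, here is a filled-in sketch that can actually be run. The constructor, the copyPositionTo() method and the choice of the haversine formula with a mean Earth radius of 6371 km are my own additions for illustration; they are assumptions, not part of the original example.

```php
<?php
class Point {

    private $latitude;
    private $longitude;

    public function __construct($latitude, $longitude) {
        $this->latitude  = $latitude;
        $this->longitude = $longitude;
    }

    // Great-circle distance in kilometres, using the haversine formula.
    public function getDistanceTo(Point $p) {
        $earthRadiusKm = 6371.0; // assumed mean Earth radius

        // Reading $p's private properties is allowed because $p is an
        // instance of this very same class.
        $dLat = deg2rad($p->latitude - $this->latitude);
        $dLon = deg2rad($p->longitude - $this->longitude);

        $h = pow(sin($dLat / 2), 2)
            + cos(deg2rad($this->latitude)) * cos(deg2rad($p->latitude))
            * pow(sin($dLon / 2), 2);

        return 2 * $earthRadiusKm * asin(sqrt($h));
    }

    // Writing works the same way: this overwrites $p's private state.
    public function copyPositionTo(Point $p) {
        $p->latitude  = $this->latitude;
        $p->longitude = $this->longitude;
    }
}

$origin   = new Point(0.0, 0.0);
$antipode = new Point(0.0, 180.0);

// Half of the Earth's circumference under the assumed radius.
$halfway = $origin->getDistanceTo($antipode);

// After the copy, both points sit at the same coordinates.
$origin->copyPositionTo($antipode);
$zero = $origin->getDistanceTo($antipode);
```

Note that nothing outside the class can do the same. A line such as $origin->latitude written in the calling code would still be refused, since the privilege belongs to the class and not to whoever happens to be holding an instance.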
Code, but in the abstract sense. If you’ve been fortunate enough to have a formal education in Computer Science then there’s a very high probability that you’ve encountered the term Abstraction a surprisingly large number of times. Academic approaches to Comp Sci will often talk about abstraction in relation to both data handling and flow/process control. If, however, you’ve arrived at this point as a naive developer, then your understanding of Abstraction is likely one gleaned from the PHP manual, books on PHP programming and those pesky online tutorials that I seem to keep grouching about. That’s ok. It’s important to bear in mind from an Object Oriented PHP point of view that when we talk about abstraction, we’re talking both about data abstraction and abstraction of control. We’re going to deal with those in the coming pages. It used to be the case that someone who creates art but who hasn’t undergone formal training or art education was referred to as being a Naive Artist. The works that they produced were referred to as Naive Art. It’s not intended to be a derogatory term though, just a form of classifying art that is produced by someone who hasn’t had the formal training. I’m borrowing the term naive here to refer to someone who equally hasn’t been through the formal training in their field of expertise. In our case, I’m using this specifically to represent those of us who don’t have a degree in Computer Science. And for reference, I too am a naive developer in this regard.
Remember Joe from our previous discourses? The likeable chap who’s new to our team. Well, he’s back again and seemingly more than happy to help us in our explorations of Abstraction. Whereas encapsulation is a topic that we rarely think about, abstraction is one that is hard to get away from. Something that is especially true for anyone who has had exposure to any of the open source MVC frameworks that litter the PHP landscape. As a rough guess, I’d say that this is pretty much all of us then.
Joe is certainly a member of that gang. When we ask him to illuminate us as to what abstraction is, his initial response is to tell us that it’s all about code reuse and how it lets us avoid duplication. We can hardly blame him for what he says here. It’s certainly one of the most frequent responses to the question and reasonably so, since it’s so often portrayed as one of the primary benefits of abstraction itself.

But where does this idea come from? If you spend any length of time reading any number of articles on class abstraction in PHP, there’s a roughly even split between those that advocate code reuse via abstraction and those that do not. It may even be a little more in favour of the “not” camp. I’m basing this purely on my own feelings you understand, rather than on any qualified or empirical research.

Before we start picking over the different ideas about what abstraction is, I’ll make use of a horribly familiar code cliche just to throw down some lines before we begin.

    abstract class Animal {

        private $strength = 1;

        abstract public function say();

        public function isHungry() {
            return $this->strength < 2;
        }

        public function eat(Food $food) {
            $this->strength += $food->getStrengthValue();
        }
    }

    class Cat extends Animal {

        public function say() {
            print "Meow";
        }
    }
I’ll not bother with the dog one though – we’re all far too familiar with this example already. Up above we’ve defined a class called Animal, which we’ve also marked as
being abstract. In PHP, this prevents us from creating new instances of the Animal class directly. To use this code, we have to create child classes that extend from this abstract parent. Within this abstract parent class, we’ve defined one private property ($strength), two public methods (isHungry() and eat()) and finally, one abstract public method (say()). By marking the say() method as abstract, we’re declaring that any child classes that inherit from this abstract parent must also implement their own versions of the say() method. This is of course something that we have done in creating the Cat class as a subclass that extends the parent Animal. You knew all this already though right? Just checking.

To get back to our original question then, where does this idea that abstraction promotes code reuse and allows you to avoid code duplication actually come from? To try and pinpoint a single source would probably be an act of folly, if not wild speculation. For sure we can all use our favourite search engine to locate articles online that pitch this idea but in truth, I think the major culprit is right under our noses and it’s one that I hinted at earlier on.

If it’s true to say that we’ve all been exposed to at least one of PHP’s very many MVC-like framework implementations, then we’ve all surreptitiously been exposed to the idea of abstraction as code reuse. The moment that you extend a base model class in order to implement your own User model, or Blog model, or Product model then you are, in all likelihood, doing so in order to take advantage of some common, theoretically model related code in the parent. It may not be branded as code reuse directly, but there’s a very strong implication in promoting the practice. The very earliest players on the scene did this, implementations such as CakePHP and CodeIgniter, and it’s a practice that’s still promoted in some of the more modern additions to the scene, Phalcon and Laravel included.
When the documentation tells you to extend a base model, it’s telling you to do this because the base model’s class is likely already populated with lots of utility methods. These methods will do everything from interfacing with your application’s database to providing easy ways to access model properties and everything in between. The trouble is, these frameworks can be so darned useful! That though, doesn’t excuse them from the gross violations of sound programming practice that are going on under the hood. If you haven’t already taken a look through the code, have a peek now at your most current or most recent MVC framework based application. If you implement models or controllers by extending the framework’s
base model, have a look at the base class and consider how these relate to the very distinct entities that you’ve subsequently created in the child classes. “A-ha!” I hear you exclaim, “but are we not merely providing child class specialisations of the framework’s definition of a model?” On the face of it perhaps. But if we are to subscribe to the idea that our objects will either be knowers or doers but not both, then we’ve clearly got the case here where the framework is asking us to create objects to, say, represent our User model that also have to lug around methods to trigger database queries and other spurious operations. This is not an idea that we, as Brilliant developers, should be promoting.

The frameworks are here to stay though, at least for a little while longer. Whilst there is a promising trend developing towards the use of microframeworks such as Josh Lockhart’s excellent Slim and SensioLabs’ equally excellent Silex and an equally promising trend towards widespread adoption of Composer managed component libraries, many of us will still have to live with the monolithics for the time being at least. At least now we’re in a stronger position of knowing where the violations lie. That can only be a good thing, right?

I don’t wish to dwell on the frameworks too much in this chapter. It is, after all, supposed to be about abstraction in the general, more theoretical sense. Nevertheless, I will bring them up again shortly as I turn our attention to something I made brief mention of above: utility methods. Or as I prefer to call them, convenience methods. It’s precisely at this point that I should bring in some additional players. The “Don’t Repeat Yourself” (DRY) principle and Kent Beck’s “Once and only once” (OAOO); two ideas that quite rightly point out that the duplication of code is a no-no that leads to maintenance nightmares and a whole heap of bugs.
The basic premise is that, if you have the same code in multiple locations and you ascertain that that code needs to be changed, either through the discovery of a bug or as a result of a change request, then there’s every possibility that the necessary changes won’t correctly or effectively be applied to every location where that piece of code occurs. If the occurrences of a piece of duplicated code get out of sync with one another, then it naturally follows that the way in which they operate will diverge. The copy or copies that got missed will clearly now be in a state of wrongness. Naturally then, it makes perfect sense for us to ensure that the relevant piece of code only occurs once and in a single location. When we do that, we know exactly where
to look and can be entirely confident that we only need to make the change in just this one place.

As far as our topic of abstraction is concerned though, the problem arises when we see one particular method in one class also being useful were it also to reside in another class. As I mentioned earlier, we’re typically a lazy bunch and sometimes the best way to satisfy our lazy inclinations is simply to move that rather useful method upwards into a parent class from which both classes inherit. If the method’s in the parent, then it’s available to both children.

This is, of course, the point where I return to the consideration of the base model classes in frameworks. If you are expected to extend a base model class in order to create your own models, it’s likely then that the base model comes preloaded with convenience methods. I call them convenience methods (and I’m certainly not alone in this) as these are the methods that do things for us, quite unrelated to the intended nature of the resulting class, and are placed conveniently within easy reach. Notice how I said “quite unrelated to the intended nature of the resulting class” in that previous sentence. I’ll get on to explaining that part shortly since it forms the crux of how we should be treating abstraction in PHP.

Before getting to that stage though, we need to tie up the references to DRY and OAOO. We already know that duplicating code is bad, we can feel that one in our bones. Even if we don’t feel it, we can enjoy the benefit of DRY stating it for us quite explicitly. The DRY principle is stated as “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”
Ergo duplicate code bad, code reuse good. What is important to note here though is that there are many different ways of avoiding code duplication and promoting code reuse. Achieving code reuse through a parent class, abstract or not, is simply the worst of the available options. There are alternatives available that are far more suitable and that will still let us adhere to the DRY principle. You’ll find these alternatives will be discussed at length when we tackle the issues of delegate polymorphism, composition, traits and closures. In the interim, should you encounter an article, a tutorial, a blog post or even another developer that tells you that abstraction promotes code reuse, don’t fall for it.
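As a small taste of one of those alternatives, the sketch below uses composition. The SlugGenerator, BlogPost and Product classes are all invented for this illustration, as are the URL prefixes. What matters is the shape: one piece of knowledge lives in exactly one place, and two otherwise unrelated classes borrow it without sharing a parent.

```php
<?php
// Hypothetical helper: the single, authoritative home of one piece of logic.
class SlugGenerator {

    public function slugify($text) {
        $slug = strtolower(trim($text));
        // Collapse every run of non-alphanumeric characters into one hyphen.
        $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

        return trim($slug, '-');
    }
}

// Two unrelated classes that need the same behaviour. Neither inherits
// it; each is simply handed a collaborator that provides it.
class BlogPost {

    private $title;
    private $slugs;

    public function __construct($title, SlugGenerator $slugs) {
        $this->title = $title;
        $this->slugs = $slugs;
    }

    public function getUrlPath() {
        return '/blog/' . $this->slugs->slugify($this->title);
    }
}

class Product {

    private $name;
    private $slugs;

    public function __construct($name, SlugGenerator $slugs) {
        $this->name  = $name;
        $this->slugs = $slugs;
    }

    public function getUrlPath() {
        return '/shop/' . $this->slugs->slugify($this->name);
    }
}

$slugs   = new SlugGenerator();
$post    = new BlogPost('Hello, World!', $slugs);
$product = new Product('  Red Shoes (Size 9)  ', $slugs);

$postPath    = $post->getUrlPath();
$productPath = $product->getUrlPath();
```

DRY is satisfied, since a change to the slug rules happens in one file only, and yet no parent class has had to take delivery of a method that has nothing to do with being a blog post or a product.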
Abstraction promotes code reuse like going to work every day promotes going on a crime spree.

I feel like this statement may benefit from a little clarification so let’s turn our attention back to Joe for a moment, for he’s a likeable chap. Joe’s also as dedicated to the job as all new hires should be. Every day, he turns up at the office, bashes out a few lines of code and pushes them to the master repo before heading home again for his supper. One of the reasons that he does this is that he hopes to get paid at the end of the month, which is as to be expected. When he does get paid though, does he withdraw all of the money, go buy a gun with the compensation of his labour and proceed to rob as many grocery stores as he can get away with? One would certainly hope not, no matter how likeable he is.

Turning up to work and doing the job that we’re paid to do doesn’t necessarily mean that we’re going to spend our wages on a gun and go robbing. Going to work and doing our job might allow us to do just that assuming that we have such an inclination, but it’s not generally an idea that’s actively encouraged in polite society. In a similar vein, abstraction might very well allow us the possibility of achieving code reuse but it does not automatically follow that we should do it. And it certainly doesn’t promote the notion or encourage it in any way.

In actual fact, employing abstraction and inheritance as a mechanism for code reuse is a commonly recognised code smell. This is what those frameworks do to your project when they ask you to extend from what is effectively a library of useful functions. They make your code smell. Do bear in mind then that we’re no longer thinking of abstraction as a means of reusing existing code and avoiding duplication. We’re going to see in the upcoming chapters how we can create chunks of code that embody reusability, yet still allow us to stop polluting our parent classes.
When I use the term “polluting”, does it give you a hint as to where this is going? Let’s embark now on our journey to abstraction enlightenment. To do this, we will start by taking a look at what a class definition actually is and then proceed to consider how this simple notion may be applied to our own class abstractions. In virtually all languages in use today, whether strongly typed or not, we are supplied with the notion that we can hold our application’s data in variables of certain types. As you are already painfully aware, PHP is not one of those strongly typed languages.
Not only are we relieved of the duty of declaring the type when we initialise a new variable, both we and indeed the PHP interpreter can switch between the types applied to that variable seemingly on a whim. I’m sure you know what I’m talking about, but to set this up properly, let’s consider a sample list of some of PHP’s scalar types expressed in code.

    $integer = 5;
    $float = 7.25;
    $boolean = true;
    $string = "Hello Joe!";

Clearly illustrated on these four lines are the notions of four data types in PHP. No surprises there then. We can add to this list by including some of the complex data types starting, as we should, with arrays and associative arrays.

    // Numerically indexed array
    $sequence = [1, 2, 4, 8, 16, 32];

    // Associative array
    $person = [
        'name' => 'James',
        'age' => 39,
        'gender' => 'male'
    ];
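The remark about types being switched on a whim is easy to see for yourself. The following sketch follows a single variable through four different types using gettype(); the variable names are arbitrary and the values were picked only to make each conversion obvious.

```php
<?php
$value  = "5";                    // starts life as a string
$before = gettype($value);        // "string"

$value    = $value + 3;           // the interpreter converts "5" for us
$afterSum = gettype($value);      // "integer"

$value         = $value / 16;     // 8 / 16 does not divide evenly
$afterDivision = gettype($value); // "double", PHP's name for a float

$value     = (bool) $value;       // this time we do the switching ourselves
$afterCast = gettype($value);     // "boolean"
```

Two of those changes were made by the interpreter without being asked and one was made by us with an explicit cast, and at no point did PHP raise so much as an eyebrow.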
As with encapsulation, I wouldn’t think you’ve devoted much brain space to these considerations in your daily work recently. Nevertheless, I would like you to do so now. Please consider this simplest of ideas: that you are so used to working with these sorts of data types that you don’t even have to think about which type to use. You need to perform floating point maths, you’d go for a float and in doing so, you know full well that it’s more suited to your requirements than say, using a string. You have a case where you need to represent a simple true/false condition? Why, that’ll be a boolean then. The point that I’m getting at here is that irrespective of whether we are dealing with integers, booleans or simple arrays, as long as we know the data type of the variable that we’re handling, we will also know the particular attributes and characteristics
of said type and can use that knowledge to manipulate the variable appropriately. We are already comfortable with the way that they work. Here’s another example to illustrate the point.

    // Declare a simple array as a queue of people
    $queue = ['James', 'Mary', 'Paul'];

    // Get the next person in the queue
    $nextInLine = array_shift($queue);
As developers, we know that this code works precisely because we understand the behaviour of PHP arrays. In exactly the same way, we know that we can’t add elements to or take elements away from an integer value because integer values don’t work that way. List or queue like characteristics exist in simple arrays and not in integers. Integers are simple whole numbers, end of story. Their behaviour is well known to us. Our knowledge of PHP’s data types and the way that they work underpins our decisions of which data type we use in order to code up the solution to a particular problem. Why not then take this idea further and extend it (no pun intended) to cover the creation of class definitions within our codebase? When we, as developers, go on to coding up a class of some description, what I want you to hold in your head is the idea that we’re creating a new data type. We do this precisely because our existing knowledge of PHP tells us that the language doesn’t already have the right data type that we need. As a result, we need to craft our own. To support this notion, I’m just going to return to that first list of data types and make a couple of additions.
    $integer = (int) 5;
    $float = (float) 3.14;
    $boolean = true;
    $string = "Hello World";
    $vehicle = new Car();
    $person = new Person();
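For that listing to run, the Car and Person classes need to exist somewhere. The sketch below supplies deliberately empty, hypothetical definitions of both and then asks PHP about the types involved, using the same kind of checks that we would apply to a built-in type.

```php
<?php
// Hypothetical, deliberately empty definitions: just enough for `new` to work.
class Car {}
class Person {}

$integer = (int) 5;
$vehicle = new Car();
$person  = new Person();

// The check we would use for a built-in type...
$integerIsInt = is_int($integer);

// ...and the equivalent checks for the types that we defined ourselves.
$personIsPerson  = $person instanceof Person;
$vehicleIsPerson = $vehicle instanceof Person;

// gettype() lumps every instance together as "object", whereas
// get_class() reports the specific type that we created.
$generalType  = gettype($person);
$specificType = get_class($person);
```

A Car is no more a Person than an integer is a string, and the language enforces that distinction for us just as it does for its own types.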
Hopefully now, you can see the line of thinking that I’m taking when talking about defining new classes and objects within an application. In pretty much every case where we use an object as a store of data held as object properties, we could in theory use an associative array. Just like this:

    $person = array(
        'name' => 'Joe',
        'age' => 26,
        'gender' => 'male'
    );
Poor Joe! I’m sure he’s feeling rather badly used just right now. Nevertheless, the array structure above is perfectly valid of course. Valid yes, but ideal? Highly unlikely, and not when we have the power of objects at our fingertips. Just right now we know we can code up a class that will better represent a person in our application. Just right now we can create a Person data type.

So we proceed to create that Person class and in doing so, we are clearly defining the attributes, as properties, and characteristics, as methods, that that data type will exhibit. This in turn informs us of all of the ways in which we can access and manipulate the data held by a variable of that particular type. Does this make sense?

We know from our experience of working both with integers and arrays that attempting to use the square brackets accessors on an integer variable doesn’t work. It’s ludicrous to even consider it.

    // This makes no sense
    $integer = 5;

    print $integer[0];
It makes no sense because we know how integers work. We have been using the language long enough to understand the entire scope of their operation. However, using the square bracket accessors on an array variable makes perfect sense.
    // This makes perfect sense
    $array = array('James', 'Mary', 'John');
    print $array[0]; // Output: 'James'
This makes sense precisely because we know that’s how arrays work. We know that getting access to an element of the array requires us to put a number inside the square brackets. Since we also know that PHP arrays are zero-indexed, to access the first element of that array, we need to use an index value of 0. How do we know? There’s nothing in that last code block that explicitly tells you how to get a handle on the ‘James’ element but I’d wager that if I gave you that array and asked you to print ‘Mary’ to the screen, there’s every likelihood that the line of code that you would use would read very much like, if not exactly the same as print $array[1]; even though Mary is clearly the second name listed in the array. The reason that you would do so is the very same reason as before. It is exactly because both you and I already know how numerically indexed arrays work in PHP. Given the fact that it’s one of the first things we both learnt about the language, we both understand the attributes and characteristics, the essential behaviours of simple arrays like this.

Given an infinite number of parallel universes, at least one of those planes of existence will include a version of PHP that has, built into the core language itself, the perfect definition of a Person data type. In that universe, no developer ever needs to write their own Person class because the language already provides the ideal solution to use in every application that they create. In another of those parallel universes, Reese’s Pieces are always free. I’d like to think that there was a third such universe in which the perfect Person data type existed and everyone was issued with a free Reese’s Pieces dispenser. In our universe, no such data type currently exists and we generally still have to pay for chocolatey goodness.

Now you could, in theory, dust off your C skills (or use Zephir) and create a Person class as a PHP extension, presumably for a particular and very specific project.
That would generally be considered madness though – think of the maintenance overhead! The thing that normal, sane developers do is to create a Person class in regular PHP code and subsequently instantiate Person objects from that class blueprint. By setting
this new data type down in code, every developer on our team can read through that code and subsequently understand the attributes and characteristics that go to make up our Person data type.

Assuming that you make it to the end of this book, you’ll notice that I’ll refer to this idea of creating new data types several times in the coming chapters. This is me attempting to achieve positive reinforcement of this idea which, as we stand now, might possibly fly in the face of much of what you already knew about abstraction.

There. Now that I’ve gone and reminded you that you’re reading through the chapter on abstraction, you’ve just asked yourself ‘What on earth has all this data type nonsense got to do with abstraction?’ Yes? No? Maybe you did. Maybe you didn’t. What is certain at this point is that I’ve made no solid case for abstraction at all. You’re quite right in noticing that, but rest assured, I am getting to it. Cast your mind back to the beginning of this chapter and that much over-used code sample:

    abstract class Animal {
        private $strength = 1;

        abstract public function say();

        public function isHungry() {
            return $this->strength < 2;
        }

        public function eat(Food $food) {
            $this->strength += $food->getStrengthValue();
        }
    }

    class Cat extends Animal {
        public function say() {
            print "Meow";
        }
    }
What if, instead of using that Animal class as the root of the inheritance hierarchy that we think we need, instead of using that Animal class as a repository of the methods we think are a common requirement of the rest of the hierarchy, we were to treat that Animal class as the immutable definition of a new data type? If our application needs an Animal data type, we’ve now got it, explicitly set out in perfectly readable code. If, like the well behaved programmers that we are, we were to properly document the code through the liberal use of docblocks, we could conceivably arrive at something that could be considered an extension of the PHP manual itself, at least from the perspective of this application. The rest of our team could then refer to this code and its documentation and not just learn, but also understand all of the attributes (properties) and characteristics (methods) that go in to describing the behaviour of the Animal data type. I made mention of the word immutable in the previous paragraph, but since we are creating these new data types in PHP code the term immutable is conceptual at best. We haven’t even gotten to the topics of inheritance and polymorphism yet but even so, I know you guys are good. You guys know your beans. When it comes to creating child classes from this abstract parent, you know that a developer can alter the behaviour that a child instance can exhibit just by overriding one or more of the parent methods. Would this be a good thing? The answer must surely be no. Especially not if we are indeed committing to the notion that the abstract parent defines the type. Let’s consider a very silly example to help illustrate the point. In another one of those infinite parallel universes, the creators of PHP decided that it would be a good idea to allow developers to create their own custom versions of the array construct. 
Wise as they are, they figured that we should be able to set the first index value as a config setting in php.ini, thereby allowing the application builders themselves to pick the number that array indexing starts with. As a result, in one application, the first element of a numerically indexed array can be referenced using the number 7, leading to code that looks something like this:
$myArray = array('first', 'second', 'third');
print $myArray[7]; // output is 'first'
In another application, the Chief Architect has opted for a starting index of ‘42’ since it goes so well with their “I heart Douglas Adams” T-shirt. You get the picture. The image that I’m trying to spraypaint onto the sides of your brains here is a relatively simple one, perhaps mostly because I’m a bit rubbish at drawing. It’s this: If the attributes and characteristics of our most basic data types differ from one place to the next, how can we possibly and reasonably be expected to build robust, secure and reliable applications? The question is rhetorical of course, since we wouldn’t be able to. We might even be tempted to migrate to an alternative language, one that doesn’t insidiously encourage such extensive maintenance nightmares.

Consequently, why can’t we apply the same sort of rigour to the custom data types that we create in our code through the use of the class keyword? I say we can. I say we should. Indeed, I say we must. For the safety and security of our application’s data, it’s imperative that our data types, including our objects, behave consistently and in accordance with the specifications that are laid down. That means both for the types that are natively built into the language itself, and so too for the types that we add to the language whenever we declare an abstract base class. Just as arrays behave consistently from one PHP file to the next, so too should the instances of the Animal type behave consistently from one file to the next.

PHP does help us in some way to achieve this. Whenever you use the abstract keyword for a class and optionally for method(s) within that class, PHP will place some restrictions on how that abstract class may be used. I’ll cover those briefly here, just to make sure that you know what I’m talking about. The first of these restrictions is obviously that you may not instantiate objects directly from that class but must use a child class instead.
This is spot on and in laying down this restriction, I think it helps promote the idea that we’re laying down the blueprint for a new data type.
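To make that first restriction concrete, here is a minimal sketch. The stripped-down Animal and Cat classes are redeclared purely for illustration, and Cat::say() returns its sound rather than printing it so that the result is easy to inspect. Reflection is used to ask PHP whether each class may be instantiated, which avoids triggering the error that a direct new Animal() would cause.

```php
<?php
// Hypothetical, cut-down versions of the chapter's classes, for illustration only.
abstract class Animal
{
    abstract public function say();
}

class Cat extends Animal
{
    public function say()
    {
        return 'Meow';
    }
}

// ReflectionClass reports whether "new" is permitted for a given class.
$animalType = new ReflectionClass('Animal');
$catType    = new ReflectionClass('Cat');

var_dump($animalType->isAbstract());     // bool(true)
var_dump($animalType->isInstantiable()); // bool(false)
var_dump($catType->isInstantiable());    // bool(true)

// Only the concrete child can be instantiated, and it is still an Animal.
$cat = new Cat();
var_dump($cat instanceof Animal);        // bool(true)
```

The blueprint itself can never be turned into an object, yet every object built from a child class still counts as an instance of the blueprint’s type.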
The second restriction is that, for any of the methods in the abstract class that are also marked abstract, the developer is forced to implement that abstract method in any child classes that he or she creates. This concept alone helps to further bolster the idea that the abstract is imposing rules on its usage. Furthermore, the implemented method must have a level of visibility that is either the same or less restricted than that stated in the abstract method itself. For example, if the abstract method is declared as being protected, then the implemented method in the child class must be declared as being either protected (the same), or public (less restricted). However, the implemented method in the child class may not be declared private as that is more restricted than that of the method in the parent abstract.

Additionally, the signature of the implemented method must also match the signature of the abstract method in the parent. To match means that both should have the same number and type of input parameters. I say should here, because it is actually possible to define a longer list of input parameters in the implemented method, provided that the extra parameters are optional, but I’m only highlighting this in order to be factually correct. To my mind, being able to have a longer list of input parameters in such an implemented method breaks the concept of an immutable data type in ever such a subtle way.

When it’s laid out like this, it should be possible to divine the sense of it. Yet at times, I’ve been given the impression that somewhere along the line, we, as the collective PHP developer community, have allowed the very idea of what abstraction is to become clouded. This may or may not be down to the frameworks or the well-intentioned but factually wrong tutorials but in any event, if we’ve achieved a little bit of common ground, a little bit of clarity over what abstraction can provide for us, then we can proceed as one.
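The visibility and signature rules just described can be sketched in a few lines. Everything here is hypothetical and exists only for this illustration: the describe() and label() methods, the Dog class and the $suffix parameter do not appear elsewhere in the chapter.

```php
<?php
// Hypothetical type, declared only to illustrate the rules for abstract methods.
abstract class Animal
{
    // Declared protected: children may keep it protected or widen it to public.
    abstract protected function describe($prefix);
}

class Cat extends Animal
{
    // Same visibility as the parent declaration.
    protected function describe($prefix)
    {
        return $prefix . 'cat';
    }

    // Public wrapper so that calling code can reach the protected method.
    public function label()
    {
        return $this->describe('a ');
    }
}

class Dog extends Animal
{
    // Less restricted visibility, plus a longer parameter list. PHP accepts
    // the extra parameter only because it is optional.
    public function describe($prefix, $suffix = '!')
    {
        return $prefix . 'dog' . $suffix;
    }
}

// Declaring describe() as private in a child, or adding a required extra
// parameter, would halt the script with a fatal error.

$cat = new Cat();
$dog = new Dog();
echo $cat->label(), "\n";        // a cat
echo $dog->describe('a '), "\n"; // a dog!
```

Note how Dog is perfectly legal and yet no longer looks quite like the type that Animal describes, which is exactly the subtle breakage grumbled about above.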
Of course, I’ve left out roughly half of the arguments – I’m saving those for the chapter on Inheritance, which is coming up after this, and for the chapter on Polymorphism which follows it. No discussion concerning data abstraction in PHP would be complete without raising the subject of interfaces, which is exactly what I shall do now. It is important to note though that when we talk about interfaces from an abstraction point of view, we are considering the generic concept rather than a concrete implementation through the use of the interface keyword.
When you create an abstract class, which of course you will now do in order to lay down the blueprint for a new data type, you are in effect specifying the interface for that type. You are declaring to the world, or at least as far as the boundaries of your application’s codebase go, how consumers, collaborators and clients of this new data type may interact with it. When you provide the public methods for your new data type, you’re saying “Look, when you’ve got a variable of this particular data type, you can do this with it. You can expect it to behave in this particular way.” As it is with arrays, strings and integers, so it should be with the new objects that you create.

With inheritance, you can make a real mess of this, especially when it comes to the values that you return from a method call. PHP versions before 7.0 don’t allow us to specify the return type in a method’s declaration in quite the same way as, say, Java does. Let me show you a quick snippet of Java code to help illustrate this bit.

    abstract class Foo {
        public Bar getBarInstance() {
            return new Bar();
        }
    }
The declaration of the getBarInstance() method indicates that the values returned from this method will be instances of the type Bar by specifying the return type before the actual method name. Since PHP before version 7.0 doesn’t provide us with the power to enforce return types from method calls, we have to rely on ourselves and our teammates to follow a convention, a standard approach of honouring the interface that we specify in the abstract parent. Of course, this is readily communicated when the abstract parent class provides an implementation for that method. The developer can see in the code the expected return type from calls to this method. What’s missing though is the declaration of the return type when the abstract parent class also defines an abstract method. One such as this:
abstract public function say();
Looking at that line of code, there’s clearly nothing there to tell us what we’re expected to return when we craft the implementations of this method in the child classes that we derive from this abstract. The original author of this code, the designer of this class, does have the power to communicate his intent to the developers that will use this blueprint. The author can, and indeed should, use docblock commenting to indicate the return type, resulting in code that might look like this.

    /**
     * method description
     *
     * ... other docblock tags
     * @return AnimalSound
     */
    abstract public function say();
Of course, you should provide the @return tag in the docblock for every method that you create, but I feel that it’s especially important for circumstances such as this where the method declaration on its own fails to specify the return type. Supporting this idea is the convention that missing off the @return tag implies that the return type is void. That being the case, if you’re authoring abstract methods you should also provide the @return tag in a docblock whenever your expected return value is intended to be anything other than void.

If you miss off the docblock and the child derivations end up returning a whole gamut of different types, then it’s your fault. You bad person, you. But if you provide the appropriate docblock and the implementing developer ignores it and chooses to return whatever the heck they like, then that’s their fault. In this latter case, it’ll be you that gets to go home early whilst they’re the ones staying late to fix the bugs.

The notion of providing an interface in this way only goes part of the way to conveying a more thorough understanding of the attributes and characteristics of our new data type, but it is a terribly important part of that process. This idea allows me to segue quite nicely into the final part of our discussion, that of type hinting. I’m reasonably confident that you all know what it is, the clue after all is in the name, but our exploration of abstraction would not be complete if we didn’t at least touch on it.
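Before moving on, here is a minimal sketch of the docblock convention in full. AnimalSound is named only in the @return tag above, so the class body given for it here (a simple wrapper around a string) is purely an assumption made for the sake of a runnable example.

```php
<?php
// Assumed shape for AnimalSound; the chapter names the type but never defines it.
class AnimalSound
{
    private $text;

    public function __construct($text)
    {
        $this->text = $text;
    }

    public function __toString()
    {
        return $this->text;
    }
}

abstract class Animal
{
    /**
     * Produce the sound that this animal makes.
     *
     * @return AnimalSound
     */
    abstract public function say();
}

class Cat extends Animal
{
    /**
     * Honours the parent's documented contract.
     *
     * @return AnimalSound
     */
    public function say()
    {
        return new AnimalSound('Meow');
    }
}

$sound = (new Cat())->say();
var_dump($sound instanceof AnimalSound); // bool(true)
echo $sound, "\n";                       // Meow
```

Nothing in the interpreter enforces the @return tag here. The child class honours it only because its author read the docblock and chose to follow it.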
Type hinting is the practice of declaring what particular data types are acceptable to our code. In most object oriented cases, we’ll use type hinting in a method signature to inform client code what they must pass in as parameters if they want to invoke this method. Here’s an example.

    class Zoo {
        private $animals = array();

        public function addAnimal(Animal $animal) {
            print "Welcome to the zoo, " . $animal->getName();
            $this->animals[] = $animal;
        }
    }
Nothing shocking here. Our addAnimal() method declares that the parameter to be passed in must be an instance of the Animal class. If we were to try to pass in a parameter that wasn’t an instance of the Animal class or of a child class derived from it, PHP would complain most horrendously and our application would be broken. The reason for using type hinting is a simple one: we want to ensure that the parameters passed in all honour the interface that is defined in the Animal class. For this reason, I’ve illustrated the point in the body of addAnimal(), where we invoke the getName() method. If we take it as read that the abstract parent Animal class defines a getName() method then we really ought to be confident that any concrete instances derived from the Animal class also not only implement the same method, either directly or by overriding it, but also provide return values of the same or a compatible type. Consider the following piece of code:
    abstract class Animal {
        /**
         * @return string
         */
        abstract public function getName();
    }

    class Cat extends Animal {
        public function getName() {
            return new AnimalName('Cat');
        }
    }
The abstract parent class declares, through the power of commenting, that the implementation of the getName() method should return a string value. The intended behaviour of our new data type is quite clear to see. However, in the derived Cat class, the return value is clearly an object, which is going to suggest that our client code is going to need to do some return type detection, probably with an if statement, before it can use this value. We have broken the definition of what it is to be an instance of an animal and placed an unnecessary burden on our client code. A burden, incidentally, that is unlikely to be shouldered. Nevertheless, as far as the PHP interpreter is concerned, this code is perfectly valid. What a mess!

When we allow varying return types from identically named method calls, we are denying ourselves the advantage of this one very powerful feature. Not only that, we are also writing into our future long stretches of bug fixing and maintenance. The developers who exhibit PHP Brilliance are the ones who honour the specification of the new data type as defined in the abstracted parent class, right down to ensuring return types remain compatible. They are also the ones who aren’t keen on staying behind to fix bugs whilst everyone else has gone to the pub.
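As a hedged aside that steps outside the PHP versions this discussion assumes: from PHP 7.0 onwards, a return type can be declared in the method signature itself, at which point the interpreter polices the contract for us. The sketch below assumes PHP 7.0 or later. BadCat and the $rejected flag are hypothetical names, and the AnimalName class body is a guess at the shape of the object returned in the listing above.

```php
<?php
// Assumes PHP 7.0 or later, where return type declarations are available.
abstract class Animal
{
    abstract public function getName(): string;
}

class Cat extends Animal
{
    public function getName(): string
    {
        return 'Cat';
    }
}

// Guessed shape for the AnimalName object used in the listing above.
class AnimalName
{
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Hypothetical child that ignores the contract, as Cat did in the listing above.
class BadCat extends Animal
{
    public function getName(): string
    {
        // An object that cannot be converted to a string is rejected at runtime.
        return new AnimalName('Cat');
    }
}

echo (new Cat())->getName(), "\n"; // Cat

$rejected = false;
try {
    (new BadCat())->getName();
} catch (TypeError $e) {
    $rejected = true;
    echo "Rejected: getName() must return a string\n";
}
```

With the declaration in place, the misbehaving child fails loudly at the point of the bad return rather than pushing type detection onto client code. The docblock is still worth writing for the benefit of human readers.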
Abstraction of process flow and control

Even so, our discussions of abstraction wouldn’t be complete without paying due consideration to abstraction of control. You may be relieved to know however, that this matter will not require quite so many words as has gone before it. In essence, we have already covered a lot of the groundwork and it won’t take so much to extend those ideas just a little further.

Thus far we’ve been considering abstraction from the perspective of data and in doing so, we’ve explored how we might create potentially very complex structures which describe not only how the individual attributes might appropriately be grouped together in a more logical whole but also how we expect that collected unit of data to behave and interact with the rest of our system. If you consider the final example of the User class from the previous chapter, we have distilled the essential attributes and characteristics into a blueprint from which user objects may be created. You’ll recall that such user objects were seeded with a UserDataTransport instance that was injected via the constructor and yet, the public interface of the User class itself provides no hint of such nefarious activities. What we have done is to hide the complexities of our User class’ internal workings, abstracting them away behind our designated public interface. Thus we have been discussing data abstraction.

With the abstraction of control, we turn our attention away from the objects that know things and instead, consider the lot of the objects that do things. The core principle is the same. We seek to mask the complexities of a given process by tucking them away behind an interface, which, whilst potentially quite complex in its own right, nevertheless relieves the operator of needing to know the finer implementation details of the entire process.

Let’s take a moment to consider how the case of a motorcar might help us better understand this flavour of abstraction. Now I don’t mean a car class. I mean a car. A real one. One that you drive to the grocery store and back again. When you climb into the driver’s seat, you are confronted with a potentially dazzling array of controls.
A wheel, pedals, handbrake, some mechanism to change gears, maybe a satnav, turn indicators, controls for the headlights. And so the list goes on. The car’s purpose isn’t, despite what some of you might think, to provide a cosy little space in which you might make out with the hottie from the coffee shop. No. Its purpose is to go somewhere. You can achieve this by interacting with the interface that the car provides you with. Therein lies the point. Whilst driving to the grocery store, have you ever said that
you’re going to modulate the flow of liquid hydrocarbons into the combustion chamber, apply an electrical spark to cause a controlled explosion and in turn subject the piston heads to an increasing number of Newtons, which when translated through the gear train will result in us achieving our destination in a shorter period of time? No. No you haven’t. You stomp on the pedal and yell “Oh yeah! Burning rubber, baby!” Or maybe that’s just me. In any case, the finer details of converting chemical potential energy into kinetic energy are hidden from you, assuming you’re not a chemist, that is. The complexities of operation that are the work of the internal combustion engine are abstracted away from the operator behind an interface built into the “cockpit”.

We achieve exactly the same thing in object oriented programming simply by writing our terribly complicated processes down in code but hiding them behind some sort of interface. We take an idea of the actions that we want to happen, we express them in code and then we provide a means by which that process may be invoked. For the car, the process of actually starting the engine is hidden behind the turn of the ignition key. For our code, the process of running a complex action is hidden behind the definition of a public method. This is abstraction from the perspective of our doers. Let’s double check our understanding of this through the magic of code.

    class PasswordManager {
        const PASSWORD_MANAGER_COST = 10;

        public function getPasswordHash($rawPassword) {
            $options = array(
                'cost' => self::PASSWORD_MANAGER_COST
            );

            return password_hash($rawPassword, PASSWORD_BCRYPT, $options);
        }

        public function verifyPassword($rawPassword, $hashedPassword) {
            return password_verify($rawPassword, $hashedPassword);
        }
    }
Looks familiar, no? Welcome back PasswordManager class, our doer from the previous chapter. Here we’ve taken a process, the concept of turning a user supplied string into a hashed representation of that same string, and written out the code that we need to accomplish this task. I’ll grant you, our code isn’t especially complex but it’s there as the body of the getPasswordHash() method. Note however that this code isn’t floating freely within our codebase, it’s locked away inside a method. When our calling code requires a hashed password value, it doesn’t get to use that code directly precisely because we’ve abstracted it away inside a method call. It’s this method call that gets invoked when we need to trigger the complex process that’s behind it.

The beauty of this is of course that it gives us the ability to change what’s happening behind the scenes without interrupting the operation of our application. If we need to increase the cost of the hashing operation, we can do that. If we needed to supplant the PASSWORD_BCRYPT algorithm with a newer, more secure algorithm, we can do that too. As long as the method continues to accept a raw password and return a hashed one, we don’t necessarily need to know what processes have been abstracted away behind that very simple interface. This is abstraction of control: hiding the details of a set of actions (computing the hash of a string) behind a different set of actions (invoking the method and collecting the return value).
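To see that claim in action, here is a small sketch. PasswordManager is repeated so that the example runs on its own, while RevisedPasswordManager and canLogIn() are hypothetical names invented for this illustration. The revised class stands in for what PasswordManager might look like after someone raises the cost. It assumes PHP 5.5 or later, where password_hash() is available.

```php
<?php
// The class as listed above, repeated so that this sketch is self-contained.
class PasswordManager
{
    const PASSWORD_MANAGER_COST = 10;

    public function getPasswordHash($rawPassword)
    {
        $options = array('cost' => self::PASSWORD_MANAGER_COST);

        return password_hash($rawPassword, PASSWORD_BCRYPT, $options);
    }

    public function verifyPassword($rawPassword, $hashedPassword)
    {
        return password_verify($rawPassword, $hashedPassword);
    }
}

// Hypothetical later revision: a higher cost behind the same public methods.
class RevisedPasswordManager
{
    const PASSWORD_MANAGER_COST = 12;

    public function getPasswordHash($rawPassword)
    {
        $options = array('cost' => self::PASSWORD_MANAGER_COST);

        return password_hash($rawPassword, PASSWORD_BCRYPT, $options);
    }

    public function verifyPassword($rawPassword, $hashedPassword)
    {
        return password_verify($rawPassword, $hashedPassword);
    }
}

// Calling code knows only the two public methods, never the hashing details.
function canLogIn($manager, $rawPassword)
{
    $hash = $manager->getPasswordHash($rawPassword);

    return $manager->verifyPassword($rawPassword, $hash);
}

var_dump(canLogIn(new PasswordManager(), 'correct horse'));        // bool(true)
var_dump(canLogIn(new RevisedPasswordManager(), 'correct horse')); // bool(true)
```

The calling function is untouched by the change of cost. As a bonus, hashes produced at the old cost remain verifiable afterwards, because a bcrypt hash records the cost that was used to create it.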
Summary

On our journey through abstraction, we started out in the humid, fetid jungles of abstraction abuse by considering how the notion of code reuse gets mistakenly tangled up in the process of abstracting to and inheriting from classes. By now I would hope that you’re in agreement that taking advantage of abstraction as a means to avoid duplication is not a sound approach to respecting the DRY principle. With any kind of luck, I strengthened that argument by taking in the notion of convenience methods along the way. As a quick refresher though, convenience methods are those methods that you might be tempted to add to a parent class just to provide easy access to a useful routine or
process: one that has little or no relation to the object that we’re actually trying to define through the coding up of a particular class.

This chapter, like life itself, makes more sense when experienced in reverse (or perhaps on a second reading). The idea that I’m really trying to push here is that the creation of an abstract class is, in effect, the process of defining a new data type for the language that we use on a daily basis. That’s largely because that’s exactly what it is. Data abstraction through class definition provides us with the power to say “Hey PHP, your array types just aren’t a good fit for the data I’m handling, I’m gonna add more types.”

To achieve PHP Brilliance, we need to exercise caution though when we are defining new types. We want to achieve a tight but comprehensive definition of what that type actually is exactly because we want the guys and gals on our collective teams to be able to achieve the same level of familiarity with the new type’s attributes and characteristics as they have with the ones that are already in the language. If we can do that, if we can achieve this sort of understanding, we are well on our way to building robustness into our applications.

Conversely, whenever we add convenience methods to our parent classes, abstract or not, we are progressively clouding the new data type’s statement of intent and thereby incrementally increasing the fragility of our applications. Fragility leads to bugs and things falling over, of this you can be certain. Fight your way out of that jungle and bask in the dazzling brilliance of a sunny world built from rigorously defined new data types.
Inheriting vast wealth is not always good.

Now that we’ve gone through Abstraction at some length, it’s time to turn our attention to the other side of the coin. If our understanding of abstraction has changed, or at least has been realigned somewhat, how then should we view the matter of Inheritance? Once we have defined our new data type through the use of an abstract class, we can then proceed to create child classes using the extends keyword. This should all be old hat to seasoned PHP developers like your good self. As you might have guessed by now, that doesn’t necessarily stop me from wanting to upset the apple cart though.

One idea that I’d like to introduce at this point is the notion that through the processes of abstraction and inheritance, we’re creating families of closely related data types. The idea of family should be a strong one; unless you’re the kind of PHP developer that was grown in a laboratory from stem cells, there’s a distinct possibility that you have parents. The chances are equally high that, at some given point in the past, a male version of a human somehow managed to mingle his DNA with a female version, with the end result being the creation of a brand new human being. You, in other words. Accompanying this miraculous event, a newly jumbled mashup of their existing DNA was sent forth to conquer the foibles of PHP application development. This book isn’t the place to discuss the mechanics of how this biological wonder actually happens but since you’re here and you’re reading this, we can at least conclude that that genetic mashup process was at least moderately successful.

The key point? You as the end result have inherited certain characteristics and attributes from your parents. You could, for instance, have the eyes of your father, your mother’s nose, your paternal grandfather’s fiery temper.
In any case, whilst you may indeed quite clearly be one of these Smiths or one of those Robinsons, you yourself are your own specialised version of them. There may be recognisable familial traits in there but you’re certainly not a direct clone of either parent.
Nevertheless, it’s likely that you’ll retain certain identifiable characteristics of one, the other or both. It is a particularly firm belief of mine that the notion of family is as important to object oriented programming as it is in real life. Through the process of inheritance, we have the power to create closely related families of data types, all springing forth from the fruitful loins of a tightly defined parent class. In programming though, we need to be a great deal more careful than Mother Nature’s apparently chaotic process of propagating the genetic characteristics of our ancestors. After all, now that we have started to invest so much care and attention into defining our new data types, we ought to feel especially motivated towards retaining those characteristics inside each of the progeny that we let loose upon the world.

We already know that the use of the extends keyword grants us the ability to create child classes of another class, and in doing so instances of those child classes will inherit all of the non-private properties and non-private methods of the parent. At this stage though, we also know that we can inherit from any class that isn’t marked final. You might rightly ascertain that on our voyage through this particular chapter, I may just grumble about this last part. Maybe. Just a little. After all, we’ve already spent a fair bit of time focussing on an abstract parent, not just any old parent.

In the spirit of embarking on this voyage together, all buddied up and with our sturdiest programming boots on, let’s begin at the beginning. By referring to everyone’s favourite source of information on all things PHP, we can see that:

    Inheritance is a well-established programming principle, and PHP makes use of this principle in its object model. This principle will affect the way many classes and objects relate to one another. For example, when you extend a class, the subclass inherits all of the public and protected methods from the parent class. Unless a class overrides those methods, they will retain their original functionality. This is useful for defining and abstracting functionality, and permits the implementation of additional functionality in similar objects without the need to reimplement all of the shared functionality.

    – The PHP Manual (http://php.net)
‘Tis a thing of beauty in all of its concise brevity. Can I leave this chapter to end here then? No. The reason why we must forge on in our quest is precisely because the process of inheritance also grants us the ability to really mess things up on a truly monumental scale. And I mean really mess things up. It isn’t too much of an exaggeration to state that for every single time that I’ve seen inheritance handled correctly, intelligently, properly, I’ve witnessed a hundred truly awful implementations. Imagine if you will one company’s homemade ‘MVC-like’ framework in which you create controllers by extending the View class. I’ll spare you the finer details of this particular horror story. For now, at least.
Why bother with inheritance at all?

It’s a good question and one that becomes increasingly difficult to answer once we’ve banished ‘code reuse’ and ‘convenience’ as excuses. The answer does indeed lie in the process of creating specialisations for the new data types that we’ve previously fashioned into our abstract classes.

One phrase that we are going to have to become familiar with as we progress through this chapter is “inheritance hierarchy” so I shall cover it now. An inheritance hierarchy occurs precisely at the point when you have one class definition extend another. As soon as you have a parent class and a child class, you have a hierarchy. You also have additional responsibilities. This is the way the world works, of course. You create a child, you get additional responsibilities. From a programming perspective, it’s important to hold this notion of family in mind as you go on to create child classes. It is also important to hold your responsibilities in mind too, whilst you’re busy exerting your god-like powers and dabbling with the denizens that populate your application space. These responsibilities should become all too clear as we putter along.

By this point, everyone reading this should certainly already know that when you create an empty child class that extends another class, instances of the empty child class will inherit all of the public and protected methods and properties of the parent. I appreciate that this is very much Inheritance 101 stuff, but let’s take a look at some code, just to confirm that we’re on the same page.
    // Defining the new data type
    abstract class Vehicle {
        protected $modelName;

        public function setModelName($modelName) {
            if (!$this->validateModelName($modelName)) {
                throw new Exception('Invalid model name parameter');
            }
            $this->modelName = $modelName;
        }

        public function getModelName() {
            return $this->modelName;
        }

        private function validateModelName($modelName) {
            // internal validation routines
            // for the $modelName property
        }
    }

    // Adding empty child class implementations
    class Car extends Vehicle {
        // Empty class definition
    }

    class Motorcycle extends Vehicle {
        // Empty class definition
    }
Thus far, everything is nice and clear. At the top of this listing, we’re clearly defining our new Vehicle data type, the attributes of which consist only of a $modelName property. The characteristics of our Vehicle data type are described in the two public methods, setModelName() and getModelName(). Remember that, as far as our new
data type’s characteristics are concerned, we only consider the public methods as being the ones that define them. That is to say, from the perspective of this object’s collaborators, the overall behaviour that objects of this class will exhibit is governed by the public methods that these objects present to their collaborators, the interface that is exposed.
Beneath the data type definition given by the abstract, I’ve provided two empty child class implementations, one for each of Car and Motorcycle. I’ve provided both of these empty classes purely to reinforce that initial idea; that both of these child classes will behave exactly the same as the parent Vehicle. Without modification, our child classes will exhibit behaviour identical to one another.
Problems only ever start to arise once you begin the process of adding code to the child classes. When you do that, you need to exercise extreme caution that the changes that you make don’t alter the characteristics of the parent’s interface as expressed through the child. That in itself is quite a mouthful so let’s explore what we mean.
Many developers see the ability to override parent class methods as a boon; something that empowers the child class specialisation process. After all, what’s the point in providing specialised child classes if you can’t override the behaviours written into the parent? This argument forms the crux of our discussion on inheritance. The immediate answer is that, yes, of course you can override parent methods in order to provide the specialised behaviour in child classes that your application needs. The way in which you do this though is one that requires careful handling. Perhaps more careful handling than you’ve been used to up until now. For the moment, let’s take the case that was introduced in the previous chapter.
If we’re providing a blueprint for a new data type by creating an abstract parent class, then we want all of our child classes to conform to the specification of that new data type as declared by the parent. Put succinctly, the child classes should honour the promises that the parent class makes. In the event that we write code into a child class that overrides a method from the abstract parent, we should take care to ensure that the child’s method remains compatible with the parent’s specification. We can achieve this by following three rules. These rules help us to maintain the type safety of the new data type that we created in the abstract. Keeping our families
“type safe” is one of the best ways we can employ to prevent those pesky little bugs from creeping into our code and making it flaky. This is a good thing, right? We should be striving to build applications that are as robust and as bug free as possible, if for no other reason than to maximise our pub time.
Later on in this book, we will be exploring the Liskov Substitution Principle, something that you may or may not yet be familiar with. It’s a big ol’ beast, the LSP, and much too large a topic to cover in just a lowly little aside. Especially considering that we’re still in the Foundations section of the book. Rest assured though, we’ll get to analyse Liskov and her principle in due course. What we’re actually doing here is putting down a little of the groundwork first.
So what are these rules then? The first rule concerns the input parameters of the overriding method. The number of input parameters in our child method should be the same as or more than the number of input parameters expressed in the corresponding method from the parent. Ideally, we’d keep the quantities of input parameters the same since when we do that, we reinforce the parent’s definition as being the rule of law. However, we do also need to recognise the fact that since we are coding up child classes to cater for more specialised situations, we may on occasion need to chuck in a little bit extra alongside the data type’s defined requirements. In PHP, any such extras need to be optional (that is, given default values) so that calls written against the parent’s signature still work. It probably doesn’t need to be said but rather than miss anything out, let’s just note that irrespective of whether the quantities of input parameters remain the same or not, the order of the parameters as specified in the parent’s method should be honoured in the child’s method.
There’s a second facet to the first rule too – one that is perhaps of greater importance. The types of the input parameters in the overriding method should be the same as or more generic than the types specified in the parent. What on earth is this nonsense? More generic? At first glance, this is either ridiculous, indecipherable or both. If we’re to employ a mechanism for providing more specialised child classes, surely the input parameters of our child class’s methods need to be able to change in accordance with the needs of the child class, not those of the parent. Not so.
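Before getting to the why of that, here is a quick sketch of the parameter-count half of the rule. It is only an illustration: `Garage`, `NightGarage`, `makeBooking()` and the `$urgent` flag are hypothetical names invented for the example. The point it tries to show is that an extra parameter in the child is harmless as long as it is optional and sits after the parent's parameters.

```php
<?php
// Hypothetical names, used only to illustrate the parameter-count rule.
abstract class Garage
{
    public function book($customerName)
    {
        return 'Booked: ' . $customerName;
    }
}

class NightGarage extends Garage
{
    // Same first parameter in the same position; the extra one is optional,
    // so every call that works on Garage still works here.
    public function book($customerName, $urgent = false)
    {
        $booking = parent::book($customerName);
        return $urgent ? $booking . ' (urgent)' : $booking;
    }
}

// Client code written against the parent's definition only.
function makeBooking(Garage $garage)
{
    return $garage->book('Ada');
}

echo makeBooking(new NightGarage()), PHP_EOL;
```

Client code that knows nothing about `$urgent` carries on working, while code that knows it is holding a `NightGarage` can opt in to the extra behaviour.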
In order to preserve our desired goal of maintaining type safety, we must keep our attention fixed firmly on the definition of the data type as expressed by the abstract class. The abstract class should remain our single source of truth concerning the definition of this data type. Consider the following code.

    abstract class Mechanic {
        public function fixVehicle(Car $car) {
            // ... do car repairs
            return $car;
        }
    }
For the purposes of this illustration, I want you to imagine that we have a three tier inheritance hierarchy going on – but just for this illustration.

    class Vehicle { }

    class Car extends Vehicle { }

    class SportsCar extends Car { }
I’ll admit, it’s not the most exciting bit of code but I didn’t want to distract your attention away from the fact that we have progressively specialised classes running from the most generic Vehicle to the most specialised SportsCar. Now, according to our definition of a Mechanic data type, we are expecting an instance of the Car type to be passed in when we call the fixVehicle() method. If we are to assume that our client code will do just that, then all will be well as far as the data type definition is concerned. Let’s proceed to code up a child class of the Mechanic variety.
    class SportsCarMechanic extends Mechanic {
        public function fixVehicle(Car $car) {
            // ... do car based repairs
            return $car;
        }
    }
So far, so good. Our input parameters are identical so we don’t even have to puzzle over whether we are still achieving type safety quite simply because we must be. This is your safest bet. However, the class is called SportsCarMechanic. Isn’t he a specialist? Wouldn’t you expect that a SportsCarMechanic would specialise in fixing sports cars? If we assume that this is true, our code would look like this.

    class SportsCarMechanic extends Mechanic {
        public function fixVehicle(SportsCar $car) {
            // ... do car based repairs
            return $car;
        }
    }
At first glance, this appears to be quite logical. We’ve declared a SportsCarMechanic as being a specialised version of a regular Mechanic. It seems entirely reasonable that we should only pass SportsCar instances into the fixVehicle() method. It’s wrong though. Even if our logical selves tell us that it’s right, it’s wrong. What our brains are telling us is that if we are creating a more specialised mechanic, we should reasonably expect it to operate in more specialised circumstances. What we are actually doing is mapping real world experience to application logic, possibly as a consequence of those tutorials that we read when we first started out. Let’s face it, if we’re fortunate enough to be able to afford a Ferrari, we would probably be more inclined to take it to a guy that only fixes Ferraris, rather than your more generic fix-everything mechanic. However, we have violated our data type definition by type hinting for a parameter that is more specific, when the rule tells us that our parameter should be either the same or more generic.
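A hedged aside on what this looks like when the code actually runs. Recent PHP versions (PHP 8 at least) refuse the narrowed declaration outright, reporting that it must be compatible with the parent's. The sketch below therefore keeps the signature intact and simulates the narrowing with an `instanceof` check inside the method, purely to show what calling code would experience. The `service()` function is a hypothetical stand-in for client code.

```php
<?php
class Vehicle {}
class Car extends Vehicle {}
class SportsCar extends Car {}

abstract class Mechanic
{
    public function fixVehicle(Car $car)
    {
        return $car;
    }
}

class SportsCarMechanic extends Mechanic
{
    // The signature stays compatible, but the body behaves as if the
    // parameter had been narrowed to SportsCar.
    public function fixVehicle(Car $car)
    {
        if (!$car instanceof SportsCar) {
            throw new InvalidArgumentException('SportsCar instances only');
        }
        return $car;
    }
}

// Hypothetical client code: it only knows the Mechanic definition,
// so it sends a plain Car, exactly as that definition allows.
function service(Mechanic $mechanic)
{
    try {
        $mechanic->fixVehicle(new Car());
        return 'fixed';
    } catch (InvalidArgumentException $e) {
        return 'refused: ' . $e->getMessage();
    }
}

echo service(new SportsCarMechanic()), PHP_EOL;
```

The client did nothing that the Mechanic definition forbids, yet the specialised child turns the request away.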
The reason that it is wrong should become clearer when we shift our focus a little. Our data type definition as set out in the abstract parent specifies that variables of type Mechanic may have the fixVehicle() method invoked by supplying an instance of a Car as the input parameter. Recognising the data type definition, our client code supplies an instance of the Car type when calling the fixVehicle() method. The client code always sends an instance of the Car type. This is the only safe assumption that we can make because it is what is written into the definition itself.
In light of this, the SportsCarMechanic class has now broken the contract that the abstract parent is promising. The SportsCarMechanic only accepts SportsCar (or even more specific) instances. When our client code sends in a regular Car instance, which it does because the data type definition says it can, the SportsCarMechanic will choke and our application dies. So let’s fix that bit of code up.

    class SportsCarMechanic extends Mechanic {
        public function fixVehicle(Vehicle $car) {
            // ... do car based repairs
            return $car;
        }
    }
Here, I’ve replaced the type hint for the $car parameter with the more generic form of Vehicle. (A practical note: PHP only accepts a widened parameter type like this without complaint from version 7.4 onwards; earlier versions flag the declaration as incompatible with the parent’s.) It. Just. Looks. So. Wrong. It’s a SportsCarMechanic dammit! And yet our rule states that if we’re not going to use the same input type we must go more generic in order to preserve type safety. Despite the apparent wrongness, let’s just test the logic of that. Remember that our client code is still sending in a Car instance because the definition says it can? Well, now it can again. We know from our code that a Car is a type of Vehicle. We can see from the above that our SportsCarMechanic is expecting to receive a Vehicle. Consequently, everything is still tickety-boo. It’s not easy to get your head around that, I know, and I’m half convinced it’s because our head wants to be able to map the expectations that we have of the real world
onto the logic that we write when crafting our code. The real world is trying to lead you astray! If you can remember to apply this rule to the methods that child classes override, you’re half way to eliminating a whole swathe of bugs from the applications that you create. More pub time!
Let’s move on to our second rule now then. If the first rule concerns the input parameters that the methods of both our abstract parent and our concrete child classes accept, the second rule governs the values that our method invocations can return to the calling client code. Fortunately, this one fits into a single sentence:
Values returned from overriding methods should be of the same type as, or a more specific version of, the type returned by the same method in the parent class.
This should be reasonably straightforward to understand now that we’ve worked through the logic of our first rule. Let’s look first at a new abstract data type definition.

    abstract class CarFactory {
        /**
         * @return Car
         */
        public function getCar() {
            // ... make a car
            return new Car();
        }
    }
Since PHP 5 doesn’t allow us to actually declare the return type (return type declarations only arrived with PHP 7.0), I’ve docblocked the getCar() method to show that a call to this method should yield a Car instance. For this example, our Vehicle->Car->SportsCar hierarchy remains unchanged. If our second rule states that our return types in overriding methods should be of the same or a more specific type as the one specified in the parent, let’s examine the code to support this.
    class SportsCarFactory extends CarFactory {
        public function getCar() {
            // ... make a sports car
            return new SportsCar();
        }
    }
Here I’ve jumped straight to a satisfactory conclusion. It’s entirely appropriate to expect that a SportsCarFactory should make SportsCar instances. Even though we’re still mapping real world expectations onto our logic design, it works for return types. If we have learnt anything from the first rule, it should be that we ought to ignore our real world expectations though. Consequently, let’s trace our way through the appropriate logic in order to confirm that this is so.
Firstly, we know from the abstract parent class that the getCar() method should return a Car instance. This is the definition of how our data type is intended to behave. As a result of the data type definition, the client code that we build to interact with instances of this data type will expect an instance of type Car to be yielded from a getCar() method invocation. If that client code comes into contact with a SportsCarFactory instead and the SportsCarFactory honours the data type definition, then a call to the getCar() method on this factory should still yield a type of Car. Since we know that a SportsCar is indeed a type of Car, our client code continues to work just fine. I hardly need to do this, but let’s just look at the other case to confirm that our thinking is sound.
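Before turning to that other case, the trace above can be written down as a short script. This is a sketch rather than one of the chapter's own listings: it assumes PHP 7.4 or later, where return types can be declared and an overriding method is allowed to narrow them, and `testDrive()` is a hypothetical piece of client code.

```php
<?php
class Vehicle {}
class Car extends Vehicle {}
class SportsCar extends Car {}

abstract class CarFactory
{
    public function getCar(): Car
    {
        return new Car();
    }
}

class SportsCarFactory extends CarFactory
{
    // Narrowing the return type is permitted: a SportsCar is still a Car.
    public function getCar(): SportsCar
    {
        return new SportsCar();
    }
}

// Hypothetical client code written purely against the CarFactory definition.
function testDrive(CarFactory $factory): string
{
    $car = $factory->getCar();
    return $car instanceof Car ? 'got a Car' : 'got something else';
}

echo testDrive(new SportsCarFactory()), PHP_EOL;
```

The client never mentions SportsCar, and it never needs to.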
    class SportsCarFactory extends CarFactory {
        public function getCar() {
            // ... make a sports car
            return new Vehicle();
        }
    }
Here I’ve substituted the return type so that the getCar() method returns a more generic Vehicle instance. According to our rule, this is wrong. If we follow the logic, we can confirm that it’s wrong. Remember, our definition in the abstract parent states that a Car instance is being returned from this method call. As a result, our client code is geared up to work with the Car instances that it expects to get back from invoking this method. Yet here, when our client code encounters a SportsCarFactory instance, it’s yielding up the more generic Vehicle type. We know from our hierarchy that a Vehicle is not a type of Car. When the client expects to work with a Car instance but gets a Vehicle instance instead, problems are sure to follow. End result? Less pub time.
We’d be better off keeping the return types identical but happily, real world expectations will generally map onto the second rule. A more specialised child class can emit more specialised return types.
So what of our last rule then? Our last rule is super easy since it maps the logic of our second rule onto the use of exceptions. Let’s spell it out:
The overriding methods of our child classes may only throw the same or more specialised instances of the exceptions that can be thrown in the overridden method of our parent class.
Let’s start with the very same data type again.
    abstract class CarFactory {
        /**
         * @throws CarException
         */
        public function getCar() {
            throw new CarException('Cannot make car');
        }
    }
This time, I’m only focussing on the exceptions that are being thrown. We have a new inheritance hierarchy to deal with but since it’s a direct parallel to our Vehicle hierarchy, I’ll skip the code and just express it like this: VehicleException->CarException->SportsCarException. As before, our data type is selecting the middle type as being the one that it emits, but since it’s an exception it will be thrown instead of returned. Let’s look now at our modified SportsCarFactory, which would be coded up like this.

    class SportsCarFactory extends CarFactory {
        public function getCar() {
            throw new SportsCarException('Cannot make sportscar');
        }
    }
Our specialised child instance is now throwing a more specific type of exception. This is still OK. Our data type definition states that a CarException can be thrown by this method so correspondingly, our client code is prepared for it:
    try {
        $sportsCar = $carFactoryInstance->getCar();
    } catch (CarException $e) {
        Logger::crit('Whoopsies');
    }
Since our SportsCarException is a type of CarException, the try/catch block in our client code can accommodate it quite happily. What happens then if we go the other way?

    class SportsCarFactory extends CarFactory {
        public function getCar() {
            throw new VehicleException('Cannot make sportscar');
        }
    }
This time we have violated the definition that our abstract parent sets out by throwing a more generic type of exception and as a result, we’re emitting an exception that our client code isn’t prepared to handle. Since our definition tells us to expect a certain type of exception, we, as brilliant PHP developers, should be coding up the client code in a way that can handle it. Or rather, we should be coding up our child classes in a way that remains compatible with what the client code is expecting. As long as we follow this rule, that exceptions thrown by overriding methods of the child class should be of the same or a more specific type, we should be able to rest easy. After all, none of us wants uncaught exceptions to be thrown in production. When an application dies from an uncaught exception, we end up losing valuable pub time.
By this point, you may well have noticed that I’ve been pushing my data type definition concept quite a bit. There’s a very solid reason for this; I’m keen to promote that idea. When it comes to inheritance hierarchies, I strongly believe that you should keep them as flat as possible. The flattest inheritance hierarchy possible consists of only two levels; a parent and a child. The more that you can adhere to this, the more resilient your code becomes. As you might very well expect, I’m about to explain why doing the opposite is a bad thing.
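Before moving on, the third rule can be exercised end to end with a small script. This is a minimal sketch: the three exception classes simply mirror the hierarchy named earlier, while `WellBehavedFactory`, `RogueFactory` and `buildCar()` are hypothetical names standing in for the two SportsCarFactory variants and the client's try/catch block.

```php
<?php
class VehicleException extends Exception {}
class CarException extends VehicleException {}
class SportsCarException extends CarException {}

abstract class CarFactory
{
    /**
     * @throws CarException
     */
    abstract public function getCar();
}

class WellBehavedFactory extends CarFactory
{
    public function getCar()
    {
        // More specific than CarException: still caught by the client.
        throw new SportsCarException('Cannot make sportscar');
    }
}

class RogueFactory extends CarFactory
{
    public function getCar()
    {
        // More generic than CarException: slips past the client's catch.
        throw new VehicleException('Cannot make sportscar');
    }
}

// Hypothetical client code, prepared only for what the parent promises.
function buildCar(CarFactory $factory)
{
    try {
        return $factory->getCar();
    } catch (CarException $e) {
        return 'handled: ' . get_class($e);
    }
}

echo buildCar(new WellBehavedFactory()), PHP_EOL;

try {
    buildCar(new RogueFactory());
} catch (VehicleException $e) {
    echo 'escaped the client: ', get_class($e), PHP_EOL;
}
```

Without that outer try/catch, which exists here only so that the script finishes, the second call would end in an uncaught exception.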
Respect your elders
Good parenting should always include encouraging your offspring to respect the guidelines laid down by the parents. In an object oriented sense, this means making sure that your child classes continue to honour the promises made by their parents through the interfaces that the parents have exposed. If the child class can achieve this, it doesn’t actually matter whether the parent class has been declared abstract or not just so long as you remember what the client code is expecting to be able to do with the instances that arise from this hierarchy.
Once this has been achieved, there clearly isn’t any genuine limit to the number of tiers in the hierarchies that you create. If each child recognises and respects the requirements that are laid down by its parent, with specific regard to the three rules that I have described above, your application should continue along its merry way without stumbling over a type safety related glitch.
Nevertheless, there is one particular factor that seems to get in the way of adhering to these rules. Humans. Pesky humans. Pesky little humans of the programmer variety who aren’t quite as logical in their information processing as the chips and circuits that they seek to manipulate. When you create larger tiers of inheritance hierarchies, it’s just so darned troublesome trying to keep track of what is in each particular layer. If you or your colleagues have to trace the execution of a method call up and down through multiple layers of inheritance, you may just find yourself missing the one subtle but critically important nuance in the flow. If you handed me a troublesome stack of inheriting class definitions, I can guarantee you that it’d take the sloshy wet puddle of grey mess between my ears a not insignificant amount of time just to absorb all of the details, let alone find the problem.
But when you flatten those stacks, when you keep your hierarchies down to the fewest layers possible (ideally two), you make the code far more palatable, far more digestible for the very same humans. For when our child classes cease to be compatible with the parents that they extend, you are virtually guaranteed to find that bugs will appear. Bugs that are directly attributable to this very specific set of circumstances. This is obviously a situation that you, as a buddingly Brilliant PHP developer, should be keen to avoid, regardless of whether you value your pub time or not.
Inheritance Abuse
I couldn’t very easily get to the end of a chapter on Inheritance without going into something that I’ve come to call Inheritance Abuse. You’ll recall that earlier in this chapter I made mention of a particular horror story, one in which a company’s homemade framework had the controllers extending the view? Well it wasn’t quite as simple as all that. It didn’t just have the developer create a controller by extending the View class directly. Instead, there were two, three or four intermediary levels of derived classes to contend with. If I recall correctly, the worst violation had seven tiers in its inheritance hierarchy with the View class at one end and the implemented Controller class at the other.
The View class itself provided an init() method that allowed for the execution of some page setup routines. In some, but not all, of the derived child classes the init() method was overridden. For example, a page controller within the admin section of the site would need to extend a particular admin-flavoured child of the View class because its overriding init() method included a test to see whether the visitor accessing an admin page was a) logged in and b) had admin rights.
Of course this is an extreme example, one that was somehow oddly reminiscent of those word ladder puzzles we used to get given as children. You know the ones I mean: We have a starting word at the top, and a target word at the bottom. On each level in between you had to make a new word by changing just one letter of the previous word until you finally made the connection between the starting word and the target. That’s how this company’s framework felt at times; a word ladder.
Let’s be thankful for small mercies though. Unlike C++, PHP does not support multiple inheritance; each child class may only extend a single parent. As we have already seen though, it places no restriction on the creation of multi-tier inheritance hierarchies.
Any developer so given to such wanton thoughts can create their own “word ladders” in code, changing one little bit in each tier until finally the two extremities are connected. This is, in essence, the very thing that leads to what is known as the Yo-yo problem, which I have personally experienced a few more times than I would have preferred. The Wikipedia page on the Yo-Yo problem describes it very aptly, so I shall simply quote the first sentence here:
“In computer science, the yo-yo problem is an anti-pattern that occurs when a programmer has to read and understand a program whose inheritance graph is so long and complicated that the programmer has to keep flipping between many different class definitions in order to follow the control flow of the program.” - Wikipedia
Why then do we do it? Why is the problem so prevalent that it gets its own name and a Wikipedia page? The unrestrained ability to provide for three or more levels of inheritance is, to my mind, borne out of the misguided notion that abstraction and inheritance promote code reuse and help the developer to avoid duplication. I’m sorry that I reminded you about this idea but I still say vehemently that they don’t.
Code reuse is good though. We know this. Avoiding duplication of code is also good. We know this too. Adherents to the “Don’t Repeat Yourself” (DRY) principle are keen to both keep and spread this knowledge around, and wisely so. Don’t get me wrong by thinking that I’m turning my nose up at code reuse because I’m not. If I were to express inheritance abuse as a singular concept, it would go like this.
Inheritance abuse occurs as a result of the unrestrained extension of one class by another for no other reason than that the extending class would have access to and therefore be able to utilise the logic, routines and/or properties of the class being extended.
As you might deduce, I’m referring again to the matter of convenience, but this time, rather than describing the process of writing a convenience method deliberately into a parent class so that the child may have easy access to it, I’m specifically referring to the practice of having one class extend another purely because the intended parent class contains “useful stuff”. When this happens, the developer is clearly disregarding any notion of families, hierarchies or data types.
When this happens, it happens because the developer wants to “import” some of the functionality of that parental target into the child class that they are creating. This is another of those terribly common and easily recognisable code smells. One thing that I’m particularly keen to encourage is for developers to avoid falling into the Inheritance Abuse trap. Keep your hierarchies as flat as they possibly can be. Just two levels is the ideal to strive for: a data type definition provided by an abstract
parent and the concrete data type specialisations provided for by the children. Not all application requirements are the same of course, but should you need to go beyond these two levels, please be sure to have a really good, really very strong reason for doing so. And in the course of doing so, be certain to remember that every level you add to an inheritance hierarchy will also add a disproportional amount of fragility to your application, largely because “to err is human”.
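One way to picture the alternative to extending a class for its “useful stuff” is to hand the useful object in as a collaborator instead. The following is only a sketch with invented names (`PageRenderer`, `ReportController`); it is not taken from the framework in the horror story, and it makes no claim to be the only way of avoiding the trap.

```php
<?php
// Hypothetical class holding the "useful stuff" that tempts us to extend it.
class PageRenderer
{
    public function render($title, $body)
    {
        return '<h1>' . $title . '</h1><p>' . $body . '</p>';
    }
}

// Instead of "class ReportController extends PageRenderer", the controller
// receives a renderer and delegates to it. No hierarchy is created.
class ReportController
{
    private $renderer;

    public function __construct(PageRenderer $renderer)
    {
        $this->renderer = $renderer;
    }

    public function show()
    {
        return $this->renderer->render('Report', 'All quiet.');
    }
}

$controller = new ReportController(new PageRenderer());
echo $controller->show(), PHP_EOL;
```

The controller still gets to reuse the rendering code, but it makes no claim to be a kind of renderer, so there are no parental promises for it to break.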
Borrowing third party code by extension
Which brings us nicely along to our next consideration. Sometimes we are forced into the Inheritance Abuse trap, pitched headfirst down the rabbit hole of hierarchies, tumbled into terrors unseen with our hands tied behind our backs. Before I get too melodramatic, the situation that I’m referring to is the one where we have unplanned-for hierarchies thrust upon us. Unplanned in the sense that the authors of the original code are not directly connected to the project that we are working on. I’m talking about the use of third party libraries, components and frameworks.
The concept is to be lauded of course. One of the most fantastic aspects of the open source software movement is that awesome code can be created and curated by large numbers of very clever people, and code reviewed, audited and sanity checked by very many more. This in turn leads to some very robust packages of reusable code to cater for almost every conceivable use case. Remember, reusable code is good. If nothing else, it saves us reinventing the wheel over and over again.
Situations do arise though where we are actively encouraged to adopt Inheritance Abuse as the right way to do things. Whether it’s company policy or merely the CTO’s personal preference, we might find ourselves tied to a particular library or framework. In a commercial setting, it’s often just not possible to take the time out to play with and explore the various offerings that are out there. The business has needs, needs to get the product out there, to shorten the development time, to assemble a team of developers that are familiar with the application’s ecosystem. If the boss tells us that we’re building with Symfony, reaching the point of shipping code comes sooner because there are a lot more people out there already familiar with the Symfony way of doing things.
Conversely, the company that insists on its own homemade framework also needs to allow new hires a certain amount of non-product-productive time in order to become familiar with the code environment that they’re intended to work in.
Some, and I stress the word some, of these third party sources encourage the idea that the way to consume the functionality that they offer is to create classes that extend the classes that are included within the package. This harks back to the core principle that a child class derived from a parent will inherit all of the public and protected methods of the parent and in making use of this principle, the third party vendor is promoting the idea of code reuse through inheritance.
A particularly common example of this is the framework that advises you to create your specific model classes by extending their base model class, primarily because their base model class already comes preloaded with what is effectively a library of convenience methods. If you’ve been following along with me thus far, you will already know that this isn’t the best way to achieve code reuse. It is theoretically possible to create, to my mind at least, a very thin argument for suggesting that the base model class defines the model data type, that the underlying methods are indeed an expression of that data type, but once we’ve gotten past examining the Single Responsibility Principle, you’ll see that such an argument is exceedingly difficult to maintain.
How then might we deal with this situation? First off the bat, we don’t necessarily have to take the vendor’s advice at face value. As programmers we are awesome finders of solutions. At this point, we can summon up Larry Wall’s sage advice that “there’s more than one way to do it”. To help illustrate where I’m going with this, let me bring back a piece of code that we’ve already seen.

    class User {
        /**
         * @var UserDataTransport
         */
        private $userData;

        public function __construct(UserDataTransport $udt)
        {
            $this->userData = $udt;
        }

        // ...
    }
I’ve removed most of the methods that were previously defined just so that we can focus on the relevant parts. Our User model here isn’t extending anything at present. Instead, it’s consuming that UserDataTransport instance so that it has access to the different pieces of user data that it concerns itself with. Here’s the rub: what if that UserDataTransport class was how we extended the base model class provided by the framework?

    class UserDataTransport extends FrameworkXYZ_Base_Model {
        // ...
    }
It may not be an ideal solution because we are still having to accommodate the reuse-by-extension idea that the framework imposes upon us, but by turning our derived so-called model class into this notional UserDataTransport class, what we’ve actually achieved is a very nice layer of separation between our own models and the framework’s arguably kludged model proposal. This in turn allows us to maintain the purity of our own models and keep them focussed on their primary purpose.
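To see that separation at work, here is a small self-contained sketch. Treat every detail beyond the class names as an assumption: `FrameworkXYZ_Base_Model` is stubbed out with a single made-up `get()` method, since the real framework class isn't shown here, and `getEmail()` and `getContactAddress()` are likewise hypothetical.

```php
<?php
// Stand-in for the framework's base model. The real class and its methods
// are unknown; this stub exists only so that the sketch can run.
class FrameworkXYZ_Base_Model
{
    protected $data = array();

    public function __construct(array $data = array())
    {
        $this->data = $data;
    }

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }
}

// The only class that knows about the framework.
class UserDataTransport extends FrameworkXYZ_Base_Model
{
    public function getEmail()
    {
        return $this->get('email');
    }
}

// Our own model extends nothing; it simply consumes the transport.
class User
{
    /**
     * @var UserDataTransport
     */
    private $userData;

    public function __construct(UserDataTransport $udt)
    {
        $this->userData = $udt;
    }

    public function getContactAddress()
    {
        return $this->userData->getEmail();
    }
}

$user = new User(new UserDataTransport(array('email' => 'ada@example.com')));
echo $user->getContactAddress(), PHP_EOL;
```

If the framework were ever swapped out, UserDataTransport is the one class that would need rewriting; User would carry on none the wiser.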
Summary
Whilst this chapter on Inheritance isn’t quite as wordy as the Abstraction chapter that precedes it, I hope that I’ve managed to convey the key concepts in a way that is, at the very least, comprehensible. If you accept that through abstraction we as developers are able to extend the range of data types that is natively available in our language of choice, then hopefully you will also be able to accept that the child classes that we derive from those abstract parents must remain compatible with the specification of the data type that the abstract parent provides.
Perhaps an easier way of thinking in these terms is to consider how other objects, the so-called client code, within our application need to interact with particular variable instances of these new data types. If a renderer is expecting a string, or an array, or a particular class of object as the return value from a method call, then it really shouldn’t matter which particular child class instance the renderer is dealing with.
If all of the child classes conform to and are compatible with the interface that the abstract parent defines, we will never have an issue arising from unexpected variable types being passed to or returned from a method call. The instances of our child classes will all display the attributes and characteristics of the data type that the parent defines.

This all boils down to something that is known as Substitutability and thus forms the central premise of the Liskov Substitution Principle, which I have alluded to before and will cover in much greater depth later in this book. In brief though, substitutability asks us to consider whether our application will execute without a hiccup or a glitch were we to randomly and arbitrarily switch one child instance with any other child instance from the same inheritance family. If we can switch and swap at will without failure, the indications are strong that we have achieved substitutability. If, however, we encounter problems because one instance doesn’t quite behave exactly as expected, we have a clear-cut case of failing to meet this very desirable goal.

Do you recall those three rules that I set down earlier on in this chapter? Adhering to those as closely as possible will quite significantly reduce the potential for type safety related bugs appearing in your application. The knock-on effect of this? Your reputation for having almost supernatural powers of foresight grows just that little bit more.
Prodding the polymorph.

Polymorphism, as it applies to PHP, is one whole mess of a topic. Should you spend a little time online researching this particular beast, you’ll find a wealth of conflicting information available, the sum of which will leave you questioning whether you have ended up knowing or learning anything of the topic at all. This is not the first time that I shall curse the Internet for all of the immensely powerful ways that it allows poorly qualified information to propagate.

You may not be at all surprised by now to discover that I’m about to ask you to unlearn a few things. This is largely because the vast majority of those cursed tutorials tend to use something called subtype polymorphism as a means of explaining away how PHP does polymorphism. They don’t always explain it terribly well either. At this stage of course, I hope and pray that I don’t fall into the same category! As is my wont, I’ll try to cover the don’ts as well as the dos and, along the way, hopefully dispense with all of that bad advice that some of my readers may very well have absorbed and assimilated. In virtually every case, those beginners’ tutorials quite simply don’t go far enough into the details in order to leave the learner with the correct notion formed inside his or her head.

First though, let’s just take a moment to remind ourselves what Polymorphism is. After all, it’s not a particular topic that many of us will think about very often. The term “Polymorphism” comes from the conjunction of two Greek words: polys, meaning “many”, and morph, meaning form or shape. This gives us the literal translation as being “many forms”. So far, so good. Nothing new there. But what does it really mean?
The generally accepted interpretation in the programming community at large is that polymorphism is “the provision of a single interface to entities of different types.” This in itself is a direct quotation from Bjarne Stroustrup’s C++ Glossary and since Bjarne Stroustrup is the creator of C++ we would be well advised to place our faith in the accuracy of this statement.
From a PHP developer’s perspective, I would like to offer a rather simplistic translation: “Same name, different logic”. That however doesn’t quite pass muster, for whilst it may be correct in its most essential and literal interpretation, it doesn’t even get halfway far enough into telling you what polymorphism is or how it is achieved. Nevertheless, as we work through this topic I’ll be referring back to this “Same name, different logic” idea to illustrate how it links in with each implementation. Before we get onto the specifics though we should take a more generalist view of the topic so let’s get right on with that then.

Ironically, polymorphism itself comes in many forms and it’s important to distinguish between the different types, and important for us to learn which types we can achieve in PHP and which ones we can’t. Please do note, however, that I’ll be covering Polymorphism solely from within the scope of Object Oriented Programming. A second thing to note is that I will only be discussing the most commonly considered types. There are plenty more out there, but I shall leave that to you as an optional exercise of further investigation. I’ll begin with a type that we can’t do in PHP.
Ad Hoc Polymorphism

Ad hoc polymorphism relies on the ability of a language to support method overloading, thereby allowing the developer to code up two or more versions of the same method within the same class. This fits nicely with the “Same name, different logic” idea. The requirement for this to work is that the developer provides multiple methods that all share exactly the same method name but have varying method signatures, requiring different numbers and types of input parameters. In this way the developer is leaving the method selection to the compiler. Whether you’re familiar with the Java language or not, the following example should be quite easy to digest.
    class MethodOverloading {
        public int add(int x, int y) {
            return x + y;
        }

        public int add(double x, int y) {
            return (int) x + y;
        }

        public int add(int x, int y, int z) {
            return x + y + z;
        }
    }
As you can see from the code sample above, our MethodOverloading class provides three public methods all called add(). In each case, the three add() methods all sport a different method signature; which of the three add() methods gets executed will depend entirely on the number and types of parameters to be passed in. Quite clearly, we can’t achieve this type of Polymorphism in PHP. The interpreter will simply choke and die miserably as soon as it encounters a class that contains two or more identically named methods, irrespective of the fact that the method signatures themselves differ. Consequently, let’s not dwell for too long on this particular type.
Subtype polymorphism and the fatal case of method overriding

Our next consideration in the Polymorphism canon is that known as subtype polymorphism. This one is the one that appears to be the favourite amongst the writers of those cursed tutorials that I made mention of at the beginning of this chapter. In a somewhat curious twist of fate, whilst I was in the process of reviewing this chapter I decided that I would Google the term “Polymorphism” and see what came
up. Lo and behold, right at the very top of my first page of results, Google has very kindly boxed out Webopedia.com’s definition of Polymorphism, which I shall make a direct quotation of here.

“In object-oriented programming, polymorphism refers to a programming language’s ability to process objects differently depending on their data type or class. More specifically, it is the ability to redefine methods for derived classes.”
What is Polymorphism? Webopedia, www.webopedia.com/TERM/P/polymorphism.html

Do you see what they did there? In one fell swoop both Google and Webopedia.com appear to be threatening the very future of humankind by presenting an ill-informed, poorly qualified statement as an authoritative piece of information. Of course I’m wildly overstating the possible effects of this. Or am I? Does it take such a fantastical leap of imagination to conjure up an image of the guy working on NASA’s collision-course-asteroid-blasting laser unwittingly overriding the fire() method in a subclass and accidentally turning the laser into a tractor beam instead? Yeah, ok, maybe it does. And just for the record, I’d like to state that NASA’s programmers are clearly exceptionally talented, first class code jockeys.

Nevertheless, we must pull our heads out of these fantastical imaginings and get back to the topic at hand. Subtype polymorphism is exactly as it sounds: the ability to change the behaviour of a particularly named method by providing an identically named method in a child class. In this way, we are effectively masking the original method by providing an alternative invocation point for the interpreter. Let’s look at some code to see how this is achieved.
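    class RegularBullet {
        // Protected rather than private: a private property would not be
        // visible to the child class that reads it below.
        protected $baseDamage = 10;

        public function getDamagePoints() {
            return $this->baseDamage;
        }
    }

    class HollowPoint extends RegularBullet {
        // Overrides the parent's method: same name, different logic.
        public function getDamagePoints() {
            $damagePoints = 2 * $this->baseDamage;

            return $damagePoints;
        }
    }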
Hopefully this rather simple bit of code quite clearly illustrates the idea that is subtype polymorphism. Our child class, HollowPoint, overrides the parent’s getDamagePoints() method and in doing so changes the logic for how damage points are calculated when a round of a particular type hits the target. This still satisfies our earlier definition of polymorphism as being “Same name, different logic”. With subtype polymorphism, we are achieving the same name qualification by providing methods in the child class that override the identically named methods of the parent class.

If you hadn’t already guessed, I’m about to go off on one of my gripes here. The main thrust of this particular gripe is that this is generally the extent of those pesky tutorials that I have so much beef with. Let’s return to the second sentence of that Webopedia.com definition.

“More specifically, it is the ability to redefine methods for derived classes.”

How much do I wish that statement was better qualified? How much do I wish that the containing article went into detail about proper usage, pitfalls and gotchas? How
much do I wish our juniors were presented with better quality information when they first perform an online search for articles on polymorphism? Let’s mess up our example above, but in a way that conforms perfectly with this supposed definition of polymorphism.

    class ExplosiveRound extends RegularBullet {
        // Deliberately changes the parent's method signature by adding a
        // required parameter. PHP 8 and later refuse to load this class,
        // raising a fatal error for the incompatible declaration.
        public function getDamagePoints(Player $enemyPlayer) {
            $armorFactor = $enemyPlayer->getArmorPoints();
            $damagePoints = parent::getDamagePoints();
            $damagePoints = $damagePoints * 2;
            $damagePoints -= $armorFactor;

            return $damagePoints;
        }
    }
At first glance this seems to satisfy the requirements of subtype polymorphism just fine. After all, we’ve successfully redefined the getDamagePoints() method in our derived class. Granted, we’re probably now going to have to modify that whole enemy player taking damage routine just to make sure we inject the player instance in the event that he takes a hit from an ExplosiveRound but hey, we can do that. Polymorphism achieved? Check! Broken our application? Check!

Yet this is the typical extent to which the basic tutorials take the topic of polymorphism. What are our poor junior developers supposed to do when the code above ought to be the sort of code that elicits such a torrent of tears as to cause a flood of near biblical proportions? The developer who truly wants to achieve PHP Brilliance will know that in order to use subtype polymorphism correctly, he or she will have to tread very carefully indeed.

Fortunately for us, you’ve already put yourself through the pain and torture of the preceding two chapters on Abstraction and Inheritance and as a result, you already know that the code above is a monstrously huge no-no. You already know this because, in the preceding chapter concerning Inheritance, you may or may not have realised that I’ve already dealt with subtype polymorphism
even though I didn’t use the term specifically. In our child class implementations of the data type declared by our abstract parent, we can provide methods that override those of the same name as declared in the parent. We just need to keep those three rules in mind when we do so.

Any changes to the method signature in the overriding method must remain compatible with the method signature as declared in the parent class. Any changes to the value that is returned from the overriding method in the child class must remain compatible with the type of return value declared in the parent class. And finally, if the overriding method in the child class can throw exceptions (or indeed, trigger an exception to be thrown) then those exceptions must be of the same type as, or derived from, the exceptions that can be thrown by the same method in the parent. As long as we follow these three rules we’re keeping any future bugs and headaches down to a minimum.

One telltale sign to look out for, one that will quietly inform us that we’re failing to keep to these rules, is when you find yourself needing to implement special logic in the client code to accommodate the quirks of one or more child classes from a particular inheritance hierarchy. That is to say, if you find yourself writing conditional statements to determine whether you’re dealing with a RegularBullet or an ExplosiveRound then you can be sure that your child classes aren’t conforming to the data type definition that the parent class provides.

There, I did warn you that I would repeat myself. Nevertheless, this is how subtype polymorphism should be handled. To the well-informed, this is going to look an awful lot like I’m trying to bend subtype polymorphism to the will of parametric polymorphism. Which I am, of course.
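By way of contrast with the ExplosiveRound shown earlier, here is one way the same requirement might be met without breaking those rules. It is only a sketch: the applyHit() function, and the idea of passing the armour points in as a plain number, are assumptions made for illustration. The armour calculation moves into the client code, where it applies equally to every kind of round, and the override keeps the parent’s signature and return type.

```php
<?php
class RegularBullet {
    protected $baseDamage = 10;

    public function getDamagePoints() {
        return $this->baseDamage;
    }
}

class ExplosiveRound extends RegularBullet {
    // Same signature and same kind of return value as the parent.
    public function getDamagePoints() {
        return parent::getDamagePoints() * 2;
    }
}

// Hypothetical client code: no conditionals on the concrete class.
function applyHit(RegularBullet $round, $armorPoints) {
    return max(0, $round->getDamagePoints() - $armorPoints);
}

echo applyHit(new RegularBullet(), 3), "\n";  // 7
echo applyHit(new ExplosiveRound(), 3), "\n"; // 17
```

Because applyHit() never asks which kind of round it has been given, any further child class that respects the parent’s method signature can be dropped in without touching it.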
Parametric Polymorphism

Before I even start to approach the definition of parametric polymorphism, I feel that I have to state that PHP isn’t generally considered to be a language that natively supports the concept. Even so, the extraordinarily loosely typed nature of the language means that parametric polymorphism can be achieved straight out of the box, so to speak. Consider the following code snippet.
    function multiplier($a, $b) {
        return $a * $b;
    }

    echo multiplier((int)2, (int)4);          // Output '8'
    echo multiplier((float)2.5, (float)6.0);  // Output '15'
    echo multiplier('abc', 5);                // Output '0' on PHP 5 and 7; PHP 8 throws a TypeError
The multiplier() function illustrated above can be considered to support parametric polymorphism since it will execute successfully irrespective of the data types of the input parameters. Quite whether you consider the attempt to multiply the string abc by 5 to be a successful execution is up to you. Nevertheless, on the face of it, the function is called three times with three different sets of input parameters and subsequently emits a return value without complaint. (Later interpreters are less forgiving: PHP 7.1 and above emit a warning for the third call, and PHP 8 throws a TypeError.)

From an object oriented programming perspective though, we should consider parametric polymorphism to be thus:

“Parametric polymorphism is a way to make a language more expressive, while still maintaining full static type-safety. Using parametric polymorphism, a function or a data type can be written generically so that it can handle values identically without depending on their type. Such functions and data types are called generic functions and generic datatypes respectively and form the basis of generic programming.”
Wikipedia

For the benefit of our discussion here in the glorious world of PHP we can explore the wondrous world of parametric polymorphism only by flipping our point of view and switching our attention from provider to consumer. What do I mean by that? Well, in the preceding sections on Ad Hoc and Subtype Polymorphism, I’ve been focussing our attention on the provider side of the equation. The Java example given in the Ad Hoc section provided multiple definitions of the add() method and allowed the compiler to select the most appropriate definition based on the
number and types of parameters it had to pass in. The bullet based examples that I gave in the subtype polymorphism section provided multiple implementations of the getDamagePoints() method. In both cases, our attention was firmly directed to the provision of a particular interface with which collaborators may choose to interact. Now, with parametric polymorphism we need to switch our attention to the consumer rather than the provider. We’ll start by focussing our beady little eyes on the input parameters of the consuming method since this is where it all kicks off.

    class VehicleRenderer {
        public function renderVehicle(Vehicle $vehicle) {
            print "Make : " . $vehicle->getMake() . "\n";
            print "Model: " . $vehicle->getModel() . "\n";
            print "Year : " . $vehicle->getRegYear() . "\n";
            print "Price: " . $vehicle->getPrice() . "\n";
        }
    }
This is, of course, a shockingly simple piece of code but there’s a great deal of beauty in its simplicity and the key to it lies in those input parameters of the given renderVehicle() method. There’s only one of them and it’s declared as an instance of the Vehicle class. What you should infer though, based on your readings of the previous chapters, is that this particular renderVehicle() method is unlikely to ever receive an actual Vehicle instance. If you would rather not infer anything, then let me tell you straight up that the Vehicle class is declared as the abstract parent class purely so that it can provide the definition of a new Vehicle data type in our application. Since the Vehicle class is abstract, it can’t be instantiated directly. If we need a variable that contains an instance of the Vehicle data type then we’re going to have to look to one of the concrete child class implementations in order to provide one. Nevertheless, and before we get too carried away with the background details, the important point to make is that as long as the renderVehicle() method receives a parameter that is identifiably an instance of the Vehicle data type then that method should execute successfully and without complaint, just as the multiplier()
function did in the first example. There’s still another key point to make on top of this though: As the consumer, the renderVehicle() method has no need to know the actual class name that the $vehicle parameter is an instance of. Only that it conforms to the interface that is specified by the more generic Vehicle definition as given by the abstract parent.

At this point, I’d like to draw your attention back to that simplistic interpretation of polymorphism that I gave close to the beginning of this chapter. Even though we’ve switched our focus away from the provider and towards the consumer, the idea of “Same name, different logic” remains consistent with this type of polymorphism too. The same name component refers to the fact that we’re using the abstract parent’s classname when we type hint the input parameter for the renderVehicle() method. At the same time, we know that the different logic component is satisfied by the fact that the method can receive any one of the derived child classes within this particular family of types.

As developers it should now be quite plain to see that we use parametric polymorphism regularly in our daily coding (or at least, I do) and therefore it should also be quite plain to see why I’ve put so much effort into trying to get across this data type definition idea. I’ll say it again though. Of course I will. As long as we continue to correctly support the idea that the abstract parent provides us with the definition of a new data type and as long as we ensure that any child classes derived from the abstract parent continue to support the data type’s definition then we can reasonably expect the collaborators that work with instances of this data type to operate without error irrespective of which concrete child class instance they end up working with. This is the very basis of generic programming as it applies to object oriented PHP.
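To put some flesh on those bones, the following is a minimal sketch of such a family of types. The body of the Vehicle class isn’t shown in this chapter, so the constructor, the getWheelCount() method and the two child classes here are all assumptions made for illustration. The hypothetical describe() method also returns a string rather than printing, simply so that the result is easy to check.

```php
<?php
// The abstract parent defines the Vehicle data type.
abstract class Vehicle {
    private $make;
    private $model;

    public function __construct($make, $model) {
        $this->make = $make;
        $this->model = $model;
    }

    public function getMake() {
        return $this->make;
    }

    public function getModel() {
        return $this->model;
    }

    // Each concrete child supplies its own logic here.
    abstract public function getWheelCount();
}

class Car extends Vehicle {
    public function getWheelCount() {
        return 4;
    }
}

class Motorcycle extends Vehicle {
    public function getWheelCount() {
        return 2;
    }
}

class VehicleRenderer {
    // Type hinted against the abstract parent only.
    public function describe(Vehicle $vehicle) {
        return sprintf(
            "%s %s (%d wheels)",
            $vehicle->getMake(),
            $vehicle->getModel(),
            $vehicle->getWheelCount()
        );
    }
}

$renderer = new VehicleRenderer();
echo $renderer->describe(new Car('Ford', 'Fiesta')), "\n";
echo $renderer->describe(new Motorcycle('Triumph', 'Bonneville')), "\n";
```

The renderer never names Car or Motorcycle. Adding a third child class would require no change to it, provided the newcomer honours the Vehicle definition.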
This discussion wouldn’t be complete though without an illustration of how we can break parametric polymorphism. It’s exceptionally easy to do. I know this is the case because I’ve seen it done far too many times in far too many projects. It goes like this:
    class VehicleRenderer {
        public function renderVehicle(Vehicle $vehicle) {
            switch (get_class($vehicle)) {
                case "Car":
                    $this->renderCar($vehicle);
                    break;
                case "Motorcycle":
                    $this->renderMotorcycle($vehicle);
                    break;
                case "JetSki":
                    $this->renderJetSki($vehicle);
                    break;
            }
        }

        // Individual render methods for different vehicle classes ...
    }
How we break it is very simple. Instead of relying on the interface that is provided by the abstract parent Vehicle class as we were previously, we’re now building intimate knowledge of our concrete child class implementations into our consumer and as such have broken the generic nature of it in the process. Our renderVehicle() method now knows it needs to check the actual class name of its input parameter before it can proceed to render the details contained within it. Whenever you see this kind of coding approach, you can be certain that the underlying data type of the consumed object has been broken through the inappropriate use of inheritance, which is a shame because a modification to a particular child class requires modifications to the consumers of the child class. In a largeish application, you can be confident that this is going to lead to bugs; grepping code bases in order to find every line of code that needs changing usually does. As you can see, subtype polymorphism and parametric polymorphism are really quite closely related, just so long as they’re done correctly. The improper use of abstraction and inheritance will break our beautifully polymorphic process, turning elegant programming into a mess of spaghetti code in virtually no time at all. But what if I told you that you could enjoy all of the benefits of polymorphic behaviour
without any of the drawbacks of using inheritance? Well you can and I’m going to talk about it next.
Delegate Polymorphism

What if, instead of relying on the process of hard-wiring all of the different logical implementations of our data types into the various specialised child classes, we could dispense with the notions of abstraction and inheritance completely and still support switchable behaviours that better represent what the things within our application actually do? Our sometimes rather ropey and somewhat disparate interpretations of what abstraction and inheritance actually are would suddenly cease to be the source of so many bugs. The good news is that we can and whilst I’ve no doubt that such a notion might start sending shivers up and down the collective spines of many a Senior Developer, once we’ve explored the possibilities I’m certain that at least some of you might start concocting ways in which Delegate Polymorphism might be implemented in your projects.

As the name might suggest, the process relies on the implementation of a design pattern called Delegation, which in turn embodies the notion that the changeable aspects of what would otherwise become specialised child classes are instead encapsulated into separate, independent objects, to which method calls are then delegated in order to execute the particular flow of logic that the delegate encapsulates. What I’m really talking about here are injectable, encapsulated behaviours that basically come in the form of helpers. Let’s look at some code.

    abstract class MortgagePlan {
        private $loanValue;
        private $loanTerm;

        abstract public function getPaymentPlan();
    }
So far, so good. Clearly we’ve created a new data type here, one that’s called MortgagePlan. We have private properties representing the value of the loan and the duration of the loan expressed as the loan term.
    class FixedMortgage extends MortgagePlan {
        public function getPaymentPlan() {
            // Logic to calculate a mortgage repayment plan
            // based on a fixed interest rate
            ...

            return $paymentPlan;
        }
    }

    class CappedMortgage extends MortgagePlan {
        public function getPaymentPlan() {
            // Logic to calculate a mortgage repayment plan
            // based on variable interest rates but subject to
            // a nominal cap to prevent repayments exceeding
            // a certain value
            ...

            return $paymentPlan;
        }
    }
And here I’ve provided the usual suspects in terms of specialised child classes. In the first instance, we have a FixedMortgage specialisation, which is capable of calculating the mortgage repayments that would fall due when a fixed rate mortgage has been taken out by a customer. The second instance features the necessary logic to calculate a variable rate mortgage which is nevertheless subject to capping to prevent the repayments exceeding a certain level.

I would certainly feel pretty confident that each and every one of my readers has encountered this kind of code before: child classes that implement specialised logic based on a set of circumstances that the specialisation is intended to represent. This is standard practice but it doesn’t actually have to be this way. If we were to apply delegate polymorphism to the scheme above, we could actually do away with the abstraction and inheritance process entirely. Even though this is clearly another simplistic example and real life projects are rarely this easy, it does rather make for a good illustration of the point.
What are we actually looking at?

The two specialisations that are embodied in the FixedMortgage and the CappedMortgage classes are really only two different behaviours of the same process. Each of the two child classes is encapsulating a different way of performing the necessary calculations to yield a mortgage payment plan appropriate for the represented specialisation. It doesn’t take a great deal of refactoring to turn this unnecessary inheritance hierarchy into a simple mortgage class and a pair of injectable behaviours. Let’s see that in action.

    class MortgagePlan {
        private $loanValue;
        private $loanTerm;
        private $repaymentCalculator;

        public function setCalc(RepaymentCalc $calc) {
            $this->repaymentCalculator = $calc;
        }

        public function getCalc() {
            return $this->repaymentCalculator;
        }

        public function getPaymentPlan() {
            $calc = $this->getCalc();

            return $calc->getPaymentPlan($this->loanValue, $this->loanTerm);
        }
    }
Here I’ve modified the original MortgagePlan class so that it is no longer abstract. This is step one in the process of removing abstraction and inheritance from our code. You will also note that I’ve provided both a setter and a getter for our injectable behaviour, the various repayment calculators that we want to build into our system. Let’s look at those next.
    interface RepaymentCalc {
        public function getPaymentPlan($loanValue, $loanTerm);
    }

    class FixedMortgageCalc implements RepaymentCalc {
        public function getPaymentPlan($loanValue, $loanTerm) {
            // Logic to calculate a mortgage repayment plan
            // based on a fixed interest rate
            ...

            return $paymentPlan;
        }
    }

    class CappedMortgageCalc implements RepaymentCalc {
        public function getPaymentPlan($loanValue, $loanTerm) {
            // Logic to calculate a mortgage repayment plan
            // based on variable interest rates but subject to
            // a nominal cap to prevent repayments exceeding
            // a certain value
            ...

            return $paymentPlan;
        }
    }
As you can see here, I’ve provided an interface from the get-go. This is because, when we inject a repayment calculator into a MortgagePlan instance, our mortgage plan needs to know that the calculator will provide the getPaymentPlan() method. Implementing the interface means that this is guaranteed since the PHP interpreter will emit a fatal error if this isn’t the case.

And lastly, those two specialised child classes that were in the first example have now been modified so that, instead of extending the original MortgagePlan abstract, they now implement the RepaymentCalc interface. Each calculator is now an injectable behaviour, allowing us to change the way that the consuming object behaves at run time. This is an important distinction and it’s one that means that our MortgagePlan
instances can live entirely independently of the calculators that encapsulate the different repayment plan calculator logic. More to the point, it also means that we can change the way that MortgagePlan instances yield up a repayment plan on the fly. Should our bank manager be feeling particularly generous one day and allow us to switch our capped mortgage to a fixed rate one then no problem! We simply inject an instance of the FixedMortgageCalc into our mortgage plan and the repayment plan behaviour is automatically changed for us.

To take this example back in line with the original theory, we are still supporting the “Same name, different logic” idea that I first proposed; collaborators of the MortgagePlan instance will still be invoking the getPaymentPlan() method on this object. This is the “Same name” part. Now however, the MortgagePlan instance delegates the responsibility for providing a repayment plan to whichever variety of repayment calculator has been injected into it. This is the “different logic” part.

There’s an additional benefit to using this approach too. If, in the future, we find ourselves in need of a new type of repayment plan calculator it will be significantly easier to implement. After all, we have the required interface already defined for us; we’ll just need to code up a new repayment calculator to encapsulate the new calculation logic and we’re good to go.

However, our analysis of delegate polymorphism wouldn’t be complete without turning our attention to the potential downsides of this approach. Yes, despite what I’ve just illustrated above, it’s not all sunshine and roses in the delegate polymorphism garden. The principal criticism of employing this technique is that it leads to the significant duplication of method declarations. Now don’t get all uppity just yet! I’m not implying that we’re about to duplicate the actual logic, merely the method names themselves.
The practice goes by the name of proxying: what we must do is create proxy methods in our principal class to act as intermediaries between the client code and the real target of our inquiry. We can see this in the code above. If our principal class is the MortgagePlan, we can see how it provides on its public interface the getPaymentPlan() method. This getPaymentPlan() method performs no real logic of its own. Instead, it proxies the request to the instance of the RepaymentCalc that it has consumed.
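For anyone who would like to see the whole arrangement run end to end, here is a self-contained sketch. Please treat the arithmetic as placeholder only: the flat simple-interest sums, the 5% and 8% rates, the cap of 1600 and the constructor on MortgagePlan are all assumptions made so that there is something to execute, and getPaymentPlan() returns a single monthly figure rather than a full plan.

```php
<?php
interface RepaymentCalc {
    public function getPaymentPlan($loanValue, $loanTerm);
}

class FixedMortgageCalc implements RepaymentCalc {
    // Placeholder arithmetic: 5% simple interest per year of the term.
    public function getPaymentPlan($loanValue, $loanTerm) {
        $total = $loanValue + ($loanValue * 5 * $loanTerm / 100);

        return $total / ($loanTerm * 12);
    }
}

class CappedMortgageCalc implements RepaymentCalc {
    // Placeholder arithmetic: 8% simple interest, capped at 1600 a month.
    public function getPaymentPlan($loanValue, $loanTerm) {
        $total = $loanValue + ($loanValue * 8 * $loanTerm / 100);

        return min($total / ($loanTerm * 12), 1600);
    }
}

class MortgagePlan {
    private $loanValue;
    private $loanTerm;
    private $repaymentCalculator;

    // Constructor added for this sketch only.
    public function __construct($loanValue, $loanTerm, RepaymentCalc $calc) {
        $this->loanValue = $loanValue;
        $this->loanTerm = $loanTerm;
        $this->repaymentCalculator = $calc;
    }

    public function setCalc(RepaymentCalc $calc) {
        $this->repaymentCalculator = $calc;
    }

    // Proxy method: no logic of its own, it delegates to the calculator.
    public function getPaymentPlan() {
        return $this->repaymentCalculator->getPaymentPlan(
            $this->loanValue,
            $this->loanTerm
        );
    }
}

$plan = new MortgagePlan(120000, 10, new CappedMortgageCalc());
$capped = $plan->getPaymentPlan();

// The generous bank manager: swap the behaviour on the same object.
$plan->setCalc(new FixedMortgageCalc());
$fixed = $plan->getPaymentPlan();

printf("capped: %.2f, fixed: %.2f\n", $capped, $fixed);
```

The same $plan object answers the same getPaymentPlan() call in both cases; only the injected calculator differs.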
For a very simple example such as this, it doesn’t really pose too much of a problem that we are declaring one getPaymentPlan() in the principal class and then declaring another getPaymentPlan() in each of the different calculator classes. Since when have things that start simply ever stayed that way though? Consider what will happen if, during the evolution of our application, we find ourselves having to extend the interface of the consumed delegate. What effect will that have on the number of proxy methods that we need to incorporate into the consuming object? The potential for our consuming object’s interface to become cluttered with proxy methods is quite large. It’s something that we need to be prepared for and it may involve us having to ask searching questions as to whether these proxy methods contribute to our vision of what the principal class’s interface is supposed to be.

Consider for a moment our favourite model class, the User that we’ve already looked at a time or two. It could be said that we’ve delegated the password handling responsibilities to a dedicated instance of the PasswordManager. By consuming the PasswordManager, what methods of the delegate need to be proxied by the User? By scaling up our thinking, what happens to our User interface when we consume an AddressManager delegate, and then for the ecommerce side we add an OrdersManager delegate? It shouldn’t be too difficult to imagine then that our User interface can become quite extensive, even before we start adding dedicated, user specific methods of our own.

As in all things then, it becomes a matter of consideration and experience, although I hope this book will allow you to take some effective shortcuts in the latter regard. If you can foresee a time where you would end up with a significant number of proxy methods, you might very well be served by switching the procedures on their respective heads.
Rather than proxying method calls to a consumed AddressManager, why not build out an address management module and have the User instance become a parameter to be passed in instead?
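As a rough sketch of that inversion, the snippet below has an AddressManager accept the User as a parameter, so the User needs no address-related proxy methods at all. The method names (getId(), addAddress(), getAddresses()) and the in-memory array are assumptions made purely for illustration; they are not taken from the book’s code.

```php
<?php
// Hypothetical sketch: the User knows nothing about addresses.
class User {
    private $id;

    public function __construct($id) {
        $this->id = $id;
    }

    public function getId() {
        return $this->id;
    }
}

// The address module receives the User, rather than the User
// consuming the module and proxying calls to it.
class AddressManager {
    // Addresses keyed by user id; an in-memory stand-in for real storage.
    private $addresses = array();

    public function addAddress(User $user, $address) {
        $this->addresses[$user->getId()][] = $address;
    }

    public function getAddresses(User $user) {
        $id = $user->getId();
        return isset($this->addresses[$id]) ? $this->addresses[$id] : array();
    }
}

$user = new User(42);
$manager = new AddressManager();
$manager->addAddress($user, '1 Example Street');
```

The User interface stays exactly as small as it was, no matter how many methods the address module grows.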
Summary
I appreciate that it may very well seem like I’ve tried to talk you out of using abstraction as a means of defining a new data type despite having put so much effort into convincing you of the benefits of doing just that. This is not meant to be the case at all. As with many things in life, you need to base your decisions on what
you best judge the situation to be. How you come to make those judgements will naturally be influenced by the extent of the things that you know about the topic in hand. In this case, you now know that using abstract classes as a means of defining new data types in your language of choice is a very powerful technique for building up your application in a semantically sound fashion. You also now know that you can modify the behaviour of a particular object on-the-fly by wrapping up the logic that represents those particular behaviours into their own classes, which can then be injected as behaviour instances into our principal object.
It isn’t a case of arguing for data types versus switchable behaviours, since the two techniques are not mutually exclusive. Generally in fact you’ll find that you’ll be using a blend of the two to best achieve your desired outcomes. The building of an application isn’t a single, monolithic process but the combination and recombination of lots of little pieces, which in turn are composed of even smaller pieces. In some situations, subtype polymorphism will make perfect sense, especially when it comes to creating new data types and the concrete implementations and specialisations of those data types. In others, it’s much more appealing to delegate responsibilities into discrete, properly encapsulated and switchable behaviours. The choice, as they say, is yours. But they do also say that to be forewarned is to be forearmed. I hope that this first section of the book has at least achieved a little of that.
Talking points. So far then, we’ve considered encapsulation, abstraction, inheritance and polymorphism. Admittedly inside separate chapters and admittedly within some seemingly quite restricted contexts. Obviously, it isn’t sufficient to just leave our “Foundational” considerations there, not least because we simply are not able to take those four tenets and treat them in isolation. The moment we have one class extend another class, we are automatically involving all four. How so? Simply by extending one class with another, we’re automatically invoking abstraction and inheritance. This is perfectly apparent. We are also engaging in polymorphism of the subtype variety, though how much of the morphology of the parent is changed in the child class is entirely down to how we proceed to add code to the child. By now though, I would hope we can appreciate that reworking the interface of the parent beyond all recognition in the child is probably not a very wise move. The act of extending a class also has ramifications in the context of encapsulation. If we know that encapsulation is the process of bundling data and the methods that operate on that data into a single, coherent unit (a class) and that we’re also treating encapsulation as a means of achieving information hiding as well, then we must be breaking encapsulation in the parent since we’re reasonably expecting the parent class to expose at least some of its innards to the prospective child classes. So too are we breaking the notion of encapsulation in the child class. If we’re supposed to be bundling the data and the methods that operate on that data into a single, coherent unit, can we really use inheritance to suck in additional logic from another class and expect to get away with it? Encapsulation, as far as we’ve considered it thus far, begins and ends with those enveloping curly braces. What it boils down to then is a matter of degrees. The degrees to which you bend one in order to accommodate another. 
We could, for instance, say to hell with it all and build out our entire application in a single, perfectly encapsulated class.
    class Application {
        ... // All of the codes
    }

    $app = new Application();
    $app->run();
One file. One class. All nicely encapsulated, without any traces of abstraction, inheritance and polymorphism to get in the way. Am I actually advocating this approach? I would hope that by now, you know that I’m not. That doesn’t stop it being a conceivable possibility though. In theory, there isn’t any reason why such a single Application instance couldn’t handle each request and emit the correct response and frankly, do the job perfectly well. Did I say there isn’t any reason? What I meant to say was there isn’t any reason except one. You see, we’re returning to the knotty little problem inherent in all application development – the human factor. At the time of writing, we aren’t currently employing artificial intelligence to build web applications for us on any sort of significant scale. Notice that I’m hedging there. I suspect there’s at least one project out there attempting to do that (Skynet, anyone?). This in turn brings us to the question of why we write code the way that we do? Why do we even trouble ourselves with class design? Why should we consider the knowers and the doers ? Why even bother adhering to some standard that requires us to write only one class per file? Is it to make the PHP interpreter’s life easier – feeding it smaller, more digestible chunks of code? Naturally no, the answer is that it is not. We write our code with one audience in mind. Humans. The one person that reads our code the most often is ourselves. If we’re working in a commercial environment, our colleagues are likely to want to be able to read our code as well. If what we’re building is open source software then the potential audience can run into hundreds of thousands, if not millions of PHP developers. We write our code so that humans, not computers, can read it and understand it. Obviously, we still have to follow the syntax rules and lexicon of the language itself but since our PHP brethren are also trained in this art to greater and lesser degrees,
it comes down to how we set the code out, how we structure it, how we tie it all together, the thing that counts is how we make it comprehensible to PHP developers like ourselves and the others around us. Which brings us back to our monolithic “entire app in a single class” concept. I know for certain that I would have no chance of fathoming out an entire application in a single class, assuming the extent of the code ran to hundreds of methods and tens of thousands of lines of code. I’m not sure I know anyone who could. Even if we were able to get it into our preferred editor of choice without crashing the blessed thing, comprehension of the application’s inner workings would be fragmentary at best. So we need to break it down into more manageable chunks, to get it into a more workable format that recognises the foibles of the people that will work on it. For an excellent starting point, remembering to separate our knowers from our doers will help us to achieve that. Do you remember that I made mention of how some developers feel inclined to treat a particular class like a silo? If some method can be seen as vaguely user-related, in it goes. This is the start of the troublesome trail that ultimately ends up with the “app in a class” effect. It starts with the notion that we need to isolate the nouns in our design but at the other end of the spectrum, it results in a recognisable anti-pattern known as the “God Object”.
God objects The God Object phenomenon occurs precisely at the point when an object can be deemed to either know too much or do too much. That’s quite a sweeping statement with virtually no qualification, so let’s go ahead and try to qualify it now. The term “God Object” was coined to represent the manner in which an object appears to become “all knowing”, that is to say that its influence is felt in many parts of the codebase at large, either through referring to large parts of the codebase external to itself or more commonly, through large parts of the codebase having to look to the God Object itself for data, processing or both. From personal preference, I like to attach the idea that a God Object is not just “all knowing”, but also “omnipresent.” How is this a bad thing?
In the first case, when one object can cause things to happen in remote parts of the application, that is to say, at a distance, then we are certainly setting ourselves up for a hard time tracking down any bugs, quite simply because there appears to be a clear separation between the symptoms of the bug and the cause of the problem itself (when we find it). Consider this:
    $user->getBalanceManager()
        ->getBalance()
        ->getNewTransaction()
        ->setTitle('Foo')
        ->setValue('10')
        ->save();
As ugly as all that is, the code that’s invoking this particular chain is separated by at least two degrees from the act of saving a new transaction on the user’s account. Even though we have yet to look at the various principles and paradigms that guide us to achieving brilliance, you probably already know of at least three violations taking place in there. Regardless of what these might be at this stage, we can see quite clearly that the invoker, with a blatant disregard for encapsulation, simply knows far too much about the inner workings of creating a transaction. We can conclude, quite logically, that this is also a very fragile approach to coding. Even if it works today, with so many links in the chain and therefore so many points where failure can occur, it wouldn’t take much for the chain to collapse tomorrow. Wouldn’t this be better expressed as:
    $result = $user->getBalanceManager()->addTransaction(
        array(
            'title' => 'Foo',
            'value' => 10
        )
    );
As far as this particular piece of code is concerned, it’s reasonable to expect that a BalanceManager at least provides the relevant method for adding a transaction. What
goes on under the hood and the finer details of how this is achieved aren’t important to our invoker. Or at least, they shouldn’t be. Knowing too much is a bad thing. Unless you’re taking part in a game show or a pub quiz, that is.
The flip side of the “God Object” coin shows its face when an object finds itself getting involved in too many processes. This comes as a direct result of the “silo” mentality, where vaguely related methods get lumped into the one class. In the process of registering a user? The User class has just the methods for that. Need to send a user an email? Look again, the User class has just the ticket. Where can I find all of the blog posts that a user has made? Yep, you guessed it. The User class has it all and much more besides.
On the face of it, it becomes quite easy to justify all those different methods in there. They are all directly related to the user after all. An email though, that is a discrete entity in its own right. So too is a blog post. By all means let the blog post carry a reference to its author through an $authorId property so that we can acquire the appropriate User instance when the need arises but for sure, there’s no justification for getting a user object involved in every possible blog post related operation.
God objects reveal themselves as violations of sound programming principles. We’ve already touched on encapsulation, and an object that’s reaching out beyond its boundaries of concern is clearly violating that one. Later on in the book, we will also be looking at the Law of Demeter and the Single Responsibility Principle as other, closely related techniques for identifying the offenders within our midst. The good news though is that the God Object lends itself very well to being refactored, another thing that we’ll get to explore.
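The idea of a blog post carrying a reference to its author, rather than the User class owning every blog-related method, might be sketched along these lines. Only the $authorId property comes from the text; BlogPost, BlogPostRepository and their method names are hypothetical, and the repository simply holds posts in memory.

```php
<?php
// Hypothetical sketch: a blog post is a discrete entity that merely
// references its author by id.
class BlogPost {
    private $authorId;
    private $title;

    public function __construct($authorId, $title) {
        $this->authorId = $authorId;
        $this->title = $title;
    }

    public function getAuthorId() {
        return $this->authorId;
    }

    public function getTitle() {
        return $this->title;
    }
}

// Finding posts is the job of a dedicated collaborator, not of User.
class BlogPostRepository {
    private $posts = array();

    public function add(BlogPost $post) {
        $this->posts[] = $post;
    }

    public function findByAuthorId($authorId) {
        $matches = array();
        foreach ($this->posts as $post) {
            if ($post->getAuthorId() === $authorId) {
                $matches[] = $post;
            }
        }
        return $matches;
    }
}

$repository = new BlogPostRepository();
$repository->add(new BlogPost(1, 'Hello world'));
$repository->add(new BlogPost(2, 'Another author'));
$repository->add(new BlogPost(1, 'Second post'));

$postsByUserOne = $repository->findByAuthorId(1);
```

Nothing here requires a User instance at all; one can be looked up from the id only when the need genuinely arises.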
Contra-, Co- and Invariance
Now is a good time to review what we know about the topic of variance. It has already cropped up in our considerations of both abstraction and inheritance, so it makes good sense to take stock of what these things mean to us at this stage. That being said, we will only look at how variance affects us in light of the material that we’ve covered up until now. Specifically then, this applies to both the input parameters and the return values of the methods that we define.
As a starting point, let’s reconsider our three vehicle classes.
    class Vehicle {}
    class Car extends Vehicle {}
    class SportsCar extends Car {}
Since we don’t need to consider the implementation details yet, I’ve left them out to concentrate on the resulting inheritance hierarchy. For the next part, we need to consider the hierarchy for the factories that will produce these “goods”.
    class VehicleFactory {
        public function create() {
            return new Vehicle();
        }
    }

    class CarFactory extends VehicleFactory {}
    class SportsCarFactory extends CarFactory {}
Now we have two inheritance hierarchies where the progression from grandparent to parent to child class is equally clear. Additionally, I’ve also provided the implementation detail for the create() method inside the uppermost VehicleFactory class. This is relevant to our discussion when you consider that the two derived classes provide no overriding implementation of their own. This gives us the first of our variance cases to consider: invariance. It should be reasonably clear that, irrespective of which particular factory class you’re dealing with, when you call the create() method, you will get back an instance of the very specific Vehicle class. No matter how you travel up and down the factory hierarchy, the selected result from the product hierarchy remains unchanged. It’s always a Vehicle instance.
However, we have already examined how we might reasonably expect a SportsCarFactory to provide us with a SportsCar instance. To support this, we would also need to implement an overriding create() method inside the SportsCarFactory class. Like so.
    class SportsCarFactory extends CarFactory {
        public function create() {
            return new SportsCar();
        }
    }
Our more specialised SportsCarFactory now returns the equally more specialised SportsCar instance. In doing so, we have simultaneously traversed the factory hierarchy and the product hierarchy in the same direction. This is covariance. For our applications to remain type safe, the return values that our methods provide should always be either invariant or covariant.
Our last consideration here then is that of contravariance. To preserve type safety in our inheritance hierarchy, this is the one we need to adopt for input parameters. To take this type of variance into consideration, I’m going to introduce a third inheritance hierarchy for us to look at.
    class BasicEngine {}
    class RegularEngine extends BasicEngine {}
    class HighPerformanceEngine extends RegularEngine {}
As you can see, the flow from generic to most specialised is equally as apparent, despite being deliberately engineered (if you’ll pardon the pun) to try and catch us out. If we were to take our ‘middle’ CarFactory class and provide the create() method for it, it will look something like this.
    class CarFactory extends VehicleFactory {
        public function create(RegularEngine $engine) {
            return new Car($engine);
        }
    }
All is well so far then. The middle factory accepts the middle engine type to produce the middle car type. However, I’ve already indicated that we need to use contravariance on input parameters. This is where it gets counter-intuitive. Contravariance means to move in the opposite direction. Consequently, if we move down the factory hierarchy, we need to move up the engine hierarchy, which gives us a SportsCarFactory that looks like this.
    class SportsCarFactory extends CarFactory {
        public function create(BasicEngine $engine) {
            return new SportsCar($engine);
        }
    }
Possibly worse than this though is observing what happens when we move up the factory hierarchy and therefore, down the engine hierarchy.
    class VehicleFactory {
        public function create(HighPerformanceEngine $engine) {
            return new Vehicle($engine);
        }
    }
When you look at it this way, it makes no sense, which is partly why I contrived the engine hierarchy in the first place. The important consideration, at least as far as maintaining that all-important type safety goes, is that the uppermost class in the hierarchy, the ultimate parent, is the one that determines the data type.
In the example that we’ve just explored, the data type (VehicleFactory) informs our client code that the engine parameter should be at least a HighPerformanceEngine instance. As such, our client code should only ever pass in a HighPerformanceEngine instance, or a more specialised (child class) version of it. As our engine hierarchy stands at present, the HighPerformanceEngine is the most specialised version on offer. Consequently, if our client code will always be passing in high performance engines, our car and sports car factories will continue to work as expected, simply because the high performance variety is still identifiable as a type (child class) of RegularEngine on the one hand and BasicEngine on the other.
Of course it’s a contrived example but the rules stand. As far as overriding methods go:
1. The input parameters of child class methods should be either invariant (unchanged) or contravariant (in the opposite direction) in relation to the input parameters of the parent method.
2. The data type of the return values should be either invariant (unchanged) or covariant (in the same direction) in relation to the return values of the parent method.
No matter how counter-intuitive it might be for the input parameters, this is precisely what we need to do to avoid the bugs that crop up from failing to maintain type safety in our inheritance hierarchies.
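The two rules above can be tried out end to end with declared types. Bear in mind that this is only a sketch under some assumptions: PHP accepts contravariant parameter types and covariant return types on overriding methods from version 7.4 onwards, so the declarations below would not be accepted as written on older releases. The Vehicle constructor and the build() helper function are additions of mine for the sake of a runnable example.

```php
<?php
// Sketch only: requires PHP 7.4 or later.
class BasicEngine {}
class RegularEngine extends BasicEngine {}
class HighPerformanceEngine extends RegularEngine {}

class Vehicle {
    private $engine;

    // Assumed constructor: any engine type will do for the products.
    public function __construct(BasicEngine $engine) {
        $this->engine = $engine;
    }
}
class Car extends Vehicle {}
class SportsCar extends Car {}

class VehicleFactory {
    // The ultimate parent determines the data type for client code.
    public function create(HighPerformanceEngine $engine): Vehicle {
        return new Vehicle($engine);
    }
}

class CarFactory extends VehicleFactory {
    // Parameter moves up the engine hierarchy (contravariant),
    // return type moves down the product hierarchy (covariant).
    public function create(RegularEngine $engine): Car {
        return new Car($engine);
    }
}

class SportsCarFactory extends CarFactory {
    public function create(BasicEngine $engine): SportsCar {
        return new SportsCar($engine);
    }
}

// Hypothetical client code: it only knows about VehicleFactory, so it
// always passes the engine type that VehicleFactory asks for.
function build(VehicleFactory $factory): Vehicle {
    return $factory->create(new HighPerformanceEngine());
}

$vehicle   = build(new VehicleFactory());
$car       = build(new CarFactory());
$sportsCar = build(new SportsCarFactory());
```

Whichever factory is handed to build(), the call remains valid, because every child accepts at least what the parent accepts and returns at least what the parent promises.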
All setters are evil That’s quite a statement, and not an entirely true one at that. The subject of the setter method does provoke some really quite intensive debate in online forums though, and for good reason. For starters, there’s a very good reason not to use them. “Huh?”, I hear you say, “But weren’t you practically advocating the use of setters earlier on?” We need to dial back a bit. Set the DeLorean to go in reverse. You’ll recall that I harped on quite extensively about not using public properties? Declaring properties
as being public is quite clearly the opposite of hiding them away and preserving our notion of encapsulation. Consequently, we make them private so that they’re kept safe from external molestation. Suppose that we ignore all that kerfuffle about the magic __set() method for the moment and go on to provide a dedicated setter method instead. Is that any better? Not in the slightest. Or at least, not when it’s used in the naked sense.
    class Car {
        private $engine;

        public function setEngine($engine) {
            $this->engine = $engine;
        }
    }
Here we have a completely pointless “naked” setter. If you study the code, you can see that it provides absolutely no purpose or benefit whatsoever. Why? Simply because it accepts an unexamined parameter and writes that value to an internal property. Functionally, it behaves exactly the same as if the $engine property had been declared public in the first place. Equally as bad though is the side effect that it can trigger in the mind of the uninformed developer. It results in a false sense of security based on the notion that since our property is declared private, it must therefore be better than a public one. If you follow the logic of the naked setter, you can see that its real effect is to make the private property public again. Ergo, it’s evil. In the process of doing nothing good whatsoever it can also create the illusion of achieving something beneficial. That perceived benefit is still only illusory though. There’s also a very important point to make here, which again takes the matter of encapsulation into consideration. When you provide such a method as this, naked or not, you are effectively handing over control of your object’s internal state to code that is outside of the object itself. It goes without saying that this should not be considered a good thing.
The “state” of an object is the term given to the thing that encompasses all of the properties that an object has and the values attached to the properties at a given time. We can determine that an object has changed state when one or more of its properties has changed value.
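To see how a naked setter hands control of that state to outside code, consider the following minimal sketch. The getEngine() method is an assumption, added only so that the example can inspect the property afterwards.

```php
<?php
class Car {
    private $engine;

    // Naked setter: no type declaration, no validation.
    public function setEngine($engine) {
        $this->engine = $engine;
    }

    // Assumed getter, present purely to inspect the state.
    public function getEngine() {
        return $this->engine;
    }
}

$car = new Car();

// Nothing stops calling code from storing something that is not an
// engine at all; the private keyword offers no protection here.
$car->setEngine('a bag of spanners');
```

The Car has changed state, yet the Car itself had no say whatsoever in what that new state should be.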
So how can we improve this? With our private properties all nicely hidden away, we still have to get the values in there somehow. As always, there are a number of different ways to go about this. If an object needs certain data in order to be considered complete, that is to say that the object could not function properly without such data, then constructor injection is the way to go. If that’s the only use case for it and your application has no need to change that data on the fly, then there’s subsequently no need to provide a setter in the object’s interface.
When your application does need to set an explicit new value inside an object during normal process flow, then by all means provide a setter but dress it up a bit, don’t go providing one of those naked ones. What do I mean by this?
    class Car {
        private $engine;

        public function setEngine(EngineAbstract $engine) {
            $this->engine = $engine;
        }
    }
This is probably the programmatic equivalent of a mankini: hardly decent but at least it covers the bare minimum. In this instance, we are still saying that we expect code outside of this object to set the $engine property for us, but at least we’ve taken back a little control by stating that only instances of the EngineAbstract are permissible. Our final method is the preferred route for us to take wherever possible. It involves using a different kind of mutator, one that modifies an object’s properties rather than explicitly setting a new value on them. Let’s take a look at some examples of this technique.
    class User {
        private $hobbies = array();

        public function addHobby(AbstractHobby $hobby) {
            $this->hobbies[] = $hobby;
        }
    }
This one is pretty straightforward. Our $hobbies property is modified via the addHobby() mutator, allowing us to invoke a change of state in an appropriate fashion. We can also provide a corresponding removeHobby() method to cover the flip side of this use case.
    class PlayerScore {
        private $score;

        public function addScore(ScoreEvent $event) {
            $changeValue = $event->getScoreDelta();
            $changeValue = $this->applyBuffs($changeValue);
            $this->score += $changeValue;
        }
    }
In this example, we’ve added a good bit more control logic to the process of modifying a player’s score. On the interface, we’re limiting the input parameter to being an instance of a ScoreEvent, which in turn should guarantee for us that the getScoreDelta() method will be available. Following that, we’re passing the $changeValue yielded into the internal applyBuffs() method to see if the player has any buffs or boosters active before finally applying the change to our private $score property. The end result is that our PlayerScore class has taken back much more control over its internal state than would be the case if we were to allow the application code to supply an explicit value directly. Encapsulation has been restored, the inner workings are tucked away again and any change to our properties and their values happens on our terms. This is exactly as it should be.
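For anyone wanting to run the PlayerScore idea, here is one way it could be fleshed out. Everything beyond addScore(), getScoreDelta() and applyBuffs() is an assumption on my part: the ScoreEvent constructor, the getScore() accessor and, in particular, the notion that a buff is a simple multiplier supplied through constructor injection.

```php
<?php
// Hypothetical event carrying the amount by which a score should change.
class ScoreEvent {
    private $delta;

    public function __construct($delta) {
        $this->delta = $delta;
    }

    public function getScoreDelta() {
        return $this->delta;
    }
}

class PlayerScore {
    private $score = 0;
    private $multiplier;

    // Assumed: the active buff is a plain multiplier, injected once.
    public function __construct($multiplier = 1) {
        $this->multiplier = $multiplier;
    }

    public function addScore(ScoreEvent $event) {
        $changeValue = $event->getScoreDelta();
        $changeValue = $this->applyBuffs($changeValue);
        $this->score += $changeValue;
    }

    public function getScore() {
        return $this->score;
    }

    // The object, not the caller, decides how a delta becomes a score.
    private function applyBuffs($changeValue) {
        return $changeValue * $this->multiplier;
    }
}

$score = new PlayerScore(2);
$score->addScore(new ScoreEvent(10));
$score->addScore(new ScoreEvent(5));
```

Calling code never supplies the final score. It can only report that something score-worthy happened and leave the arithmetic to the object.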
Rules are meant to be broken
Even though we’re only just coming up to the end of the first part of the book, you may have noticed that I’ve posited quite a large number of rules already. This rule goes here, that rule goes there and whilst you’re at it, keep these three rules in mind too. It’s quite a lot that I’m asking of you, I know this. Fortunately, rules were meant to be broken. For every “rule” that we examine in these pages, bear in mind that each will have a certain amount of elasticity. Some will bend easily, others will just snap at the merest push in the wrong direction. Unless a particular rule is of the latter variety, one that should be considered a law, I’m not intending to tell you which are proper rules and which are merely strongly encouraged guidelines. I know that might sound arrogant and I apologise if it does but the goal here is to equip you sufficiently well to arrive at your own conclusions. A case in point comes up fairly soon when we get to the chapter on traits. In there I will posit a rule but only after we’ve worked through the pros and cons of the blessed things, at which point I hope that you’ll find yourself agreeing with the rule presented.
In the meantime, I’d like to tell you a story. It really won’t take that long to do. A few years ago, ok, few-ish, I was tutoring a private student who was pretty keen to get going with the object oriented stuff. Right from the start I gave him a rule: You cannot create an object from a class directly, you always have to create another class called a factory and use that to make objects. Thusly, we created a User class and a corresponding UserFactory class, the latter containing a static create() method which happily returned freshly minted users. Since it worked, it surely confirmed that the rule was true. Well, true only for as long as it took for the student to discover that it was all a bit of a fib.
But by that time, the suggestion was already planted and it was easier to describe why the “rule” was really only a very strong suggestion. Even if the student has since learnt that a static create() method on a factory class isn’t even the best approach, it still helped to evade a lot of future mistakes simply by locking the creation of new objects down to a single location. A win, but by devious means. In any event, we’ve reached the end of Part One. Let me leave you here with a slightly awkward analogy.
If you leave the house in the morning with your belt done up far too tightly, at some point along the way you’ll stop, let it out a notch or two and get to breathe a sigh of relief. Whereas if you leave your house in the morning with your belt far too loose, your trousers might fall down.
Brain Check
We really have reached the end of Part One now, but before we move on let’s take stock of what we have covered thus far. In a way, this section might be considered a glossary; it certainly fits that particular bill. However, being a glossary isn’t its full intent. Rather, it’s a checklist of the topics that a developer exhibiting PHP Brilliance could stand up and deliver a lightning talk on. Take a moment to review the list. Given thirty minutes preparation time, could you deliver a 10 slide, 5 minute presentation on each of these topics? Without resorting to a search engine? If the answer is no, consider beefing up your knowledge on the topic in question.
Encapsulation & information / implementation hiding
Encapsulation is the process of bundling data values (properties) with the methods that act upon those properties. Information / implementation hiding is a kind of “bolt-on” concept that we can apply on top of encapsulation in order to provide objects that act as black boxes, protecting their properties and controlling access via a well-defined public interface.
Exposing state through setters and getters
Following on from the principle of encapsulation, the application of dumb setters and getters, methods that do nothing but provide access to object properties, leaves us prone to bugs caused by an object losing control over its own state. Magic __get() and __set() methods cause a great many problems in exposing an object’s state and extreme caution should be applied before using them.
Abstraction & Inheritance
Abstraction provides us with a means of defining a new data type to live alongside the language-provided types of string, integer, array and object. Inheritance allows us to provide specialisations of the new data types but we should avoid contradicting the data type specification that an abstract parent class provides.
Polymorphism
Polymorphism can be achieved in PHP in a number of different ways, but not yet by Ad Hoc Polymorphism, the process of providing multiple methods with the same name but different signatures. We can do Subtype Polymorphism though, the kind that allows a child class to override a method in a parent class in order to provide a different logical process behind the method call. Of all the different types of Polymorphism, Delegate Polymorphism is our best bet. This last type allows us to encapsulate custom logic into distinct classes designed to handle specific situations.
God objects
Also known as monster objects, these are objects that either know too much or do too much, or occasionally both at the same time. They’re revealed when one or both of these situations can be observed: the object is referred to or invoked in a lot of places within the codebase, or the object in question refers to lots of other locations in the codebase. Either way, the object in question should be broken down into smaller parts, with each part used in the appropriate location.
Contra-, Co- and Invariance
A key feature that comes into play with the topics of abstraction and inheritance. Understanding the role that these three types of variance play is essential to the maintenance of type safety. Variance is a key component in the Liskov Substitution Principle.
Extending our Object Oriented brain
“First goddamn week of winter.” - MacReady
Progressive progression, objectively.
Right. So we’ve taken a rather circuitous stroll through the heady world of our four central tenets and now our understanding of encapsulation, abstraction, inheritance and polymorphism has been reinforced. Solidified. Bolstered. Or maybe just confirmed. The upshot of this is that we have swept away all of the somewhat fragile inconsistencies and we, the members of Team Brilliance, now have some rock-solid foundations upon which we can build. Even if you haven’t fully adopted the line of thinking that I’ve been proposing thus far, even if you still feel like abstraction could somehow support code reuse, you do at least have the advantage of being equipped to make a judgement call on it. This in itself is a great boon.
One of the truly great aspects to a career in software development, especially in online application development, is that the learning process never actually ends. As long as you’re in the job, there will always be something new to learn from. I’m not simply referring to the latest framework, design pattern or development environment either even though these can be the source of some of the greatest excitement in our world. Who doesn’t remember the first time that they installed that shiny new MVC framework, wired together the routing, their first controller, model and view?
However, we don’t, in general, learn good software development practices from the latest hot topic or bleeding-edge release of something sparkly and new. For sound software development, our education must be grounded in sound theory. For that, we need to look to older resources too - the ones that have weathered years of public scrutiny and still receive a thumbs up in the context of application development today. That last part is important. Obviously, it would do no good to look back to something published thirty years ago if it’s since been discredited.
Nevertheless, clever people have been publishing academic papers on computing topics for as long as computers
have been around. Some of the core concepts are not subject to the short shelf lives of the last or the current or the next big thing. Sound principles have been around for a very long time and it’s both a good and a bad thing that we can’t just pump those juicy bits of good, sound, theoretical programming knowledge into the heads of newbie developers as soon as they get going. I say that’s a bad thing since it takes time and effort and a whole lot of mistakes to finally get the good stuff fixed into those developer brains, time we might better employ in building the next big thing instead. And I say that’s a good thing since the endless voyage of discovery keeps the appetite sharp, the enthusiasm up and the flow of warm fuzzies going.
If you had the wow thing happen the first time you tried an MVC framework, did you also get the wow thing going on the first time you spun up a virtual machine using Vagrant? I know I did. I still do. And that’s very much what I love about this business.
Just imagine what your world would be like, only for the briefest moment, if PHP had stayed with version 5.2. If Zend had declared the language finished and complete at that point. If your job was only to re-skin yet another WordPress installation because that’s all you ever had to do to get an application online. Add in the plugins, re-skin the beast and go through that famous five minute install all over again for the next one. Forget clowns and being chased by monsters but still not being able to run away fast enough. This is the very stuff that nightmares are truly made of. WordPress. Over and over and over again.
Let’s not dwell on that particular dystopian vision but move on swiftly to brighter imaginings. I’m sure that you will have guessed that I like to give you a little wander down a dark and fearsome path before dragging you back into the warm and sunny glow of the right one. Well, my version of the right one at least.
To do that notion justice, let me say with thanks and praise that PHP didn’t just stop at version 5.2, but went on happily to add, amongst many things, a much stronger object model. We know this because some juicy new “Good Stuff” has appeared in the versions that came after. Stuff like namespaces. Stuff like traits. Stuff like closures. This then forms the basis for how our explorations into the foundations of PHP-flavoured object oriented programming will continue. This part of the book will explore the concepts and practical implications of the key “juicy” bits of more
recent PHP releases in order to extend our arsenal of knowledge, to improve the foundational toolbox that we’ll use to fashion awesome codebases. Doing just that will also let us add a little extra polish to the burgeoning shine of our oncoming brilliance. Before we do all that though, we still need to take a reality check on our understanding of how PHP deals with interfaces. That’s coming up in the next chapter. Come on. Turn the page already.
More pub time through interfaces.

We have already touched on the concept of interfaces whilst we were discussing abstraction a few chapters back. Within this chapter, we’ll bring that idea forward with a more detailed look at both the conceptual interface and the concrete interface as defined through the use of the eponymous keyword.
An interface is… an interface

Before we get onto the theory of how we should be employing the use of interfaces within our application code, let’s just take a brief refresher on how an interface is defined.

    interface InterfaceName {
        /**
         * Example method definition
         *
         * @param mixed $parameter
         */
        public function methodName($parameter);

        ... // any other method definitions
    }

All very straightforward then. Within this very simple interface definition, I’ve provided for the specification of just a single method, named here as methodName(). All very straightforward indeed.
You might be tempted to think that I’ve forgotten that this isn’t a beginner’s book but worry ye not, I’m only including this here to provide a common base from which we can expand our understanding.
Whenever we create an interface in this manner, we’re laying down a particular specification that states that the methods declared within it will appear in any class that implements the interface. This is usually referred to as a contract, but I prefer to call it a guarantee. It’s a matter of personal preference for me: I feel that the word “guarantee” provides a stronger declaration of intent. Out there in the real world, contracts rarely make the news unless they’ve been broken and that’s not a connotation that we want to leak into our understanding of interfaces. For our purposes here, we need to consider our interfaces as rock solid, cast iron, impermeable, unbreakable guarantees. Nevertheless, if the word contract is what you feel more comfortable with, by all means mentally substitute the one for the other in the text that follows.

Whenever we code up an object that implements an interface, we are effectively sticking a large, shiny, day-glo orange badge on the object that says “Hey buddy, you don’t need to know who or what I am, I’m carrying this guarantee see? You don’t need my family history nor my inside leg measurements. You only need this here guarantee. With it you can rest assured that I can do this, that and the other.”

Whatever “this, that and the other” pertains to is up to you as the developer to declare inside your interfaces. If you need to specify that an object guarantees to accept an array of data and transform it into something else, you might only need a very simple interface declaration such as this one:

    interface ArrayTransformer {
        public function getTransformed(array $data = []);
    }

That’s a pretty simple guarantee right there, a guarantee that states any object wearing the ArrayTransformer badge (maybe this one is neon yellow?) will provide a method called getTransformed(), which in turn accepts an array of data. Whenever the rest of our application code sees an object wearing this badge, it knows that the object has a public method called getTransformed() and that that method
will take an array. We know that this is true since PHP will exit with a fatal error if a class that implements the interface fails to provide the methods that the interface declares. This is why I like to think of it as a guarantee. Let’s throw together a simple illustration of a class that proudly wears this guarantee badge.

    class CsvRenderer implements ArrayTransformer {
        ...

        public function getTransformed(array $data = []) {
            // logic to turn $data into a CSV string.
        }

        ...
    }

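To make the badge idea a little more tangible, here is a minimal runnable sketch of the class above. The CSV logic is an assumption made purely for illustration (flat rows, no quoting or escaping), and the render() function is a hypothetical piece of client code that does not appear elsewhere in this chapter.

```php
<?php

interface ArrayTransformer {
    public function getTransformed(array $data = []);
}

class CsvRenderer implements ArrayTransformer {
    public function getTransformed(array $data = []) {
        $lines = [];
        foreach ($data as $row) {
            // Assumption: each row is a flat array of simple values
            // that need no quoting or escaping.
            $lines[] = implode(',', $row);
        }
        return implode("\n", $lines);
    }
}

// Hypothetical client code: it asks only for the badge,
// never for a specific concrete class.
function render(ArrayTransformer $transformer, array $data) {
    return $transformer->getTransformed($data);
}

$csv = render(new CsvRenderer(), [['id', 'name'], [1, 'Ada']]);
```

The type hint on render() is where the guarantee pays off: any object wearing the ArrayTransformer badge can be handed in, and anything else is rejected before the method body runs.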
Our CsvRenderer has no idea which objects will be engaging with it, and nor should it. You would have some extraordinary problems cropping up if it did. Nevertheless, objects of this class happily present the guarantee that the ArrayTransformer interface has been implemented and therefore, they present the getTransformed() method exactly as the interface guarantees. That’s of particular use for our object’s collaborators and consumers. When we write our code that needs to work with something that can transform arrays, we no longer need to test for specific class instances, we only need to look for that big, shiny badge. Let me show you something horrific.

    public function sendEmail($recipient) {
        if (get_class($recipient) == 'Student') {
            $toAddress = $recipient->student_email;
        } else if (get_class($recipient) == 'Staff') {
            $toAddress = $recipient->staff_email;
        } else if (get_class($recipient) == 'Admin') {
            $toAddress = $recipient->admin_email;
        }
        ...
    }

Ignoring the fact that the schema for this application is clearly awful, you should be able to see just how fragile this code is. What would happen were we ever to add another user type? It’s conceivable that, since this is clearly for an educational establishment, we would want to add a Parent (in the biological sense) or Guardian class. If we sent an instance of the Guardian class to this method, the $toAddress variable wouldn’t get set. This would obviously entail hunting through the entire application for conditional trees such as this one, just to add another elseif case for the new user type.

When our code looks like this, our application is almost certainly going to be buggy. If we were very careful and meticulous coders, we could certainly ensure that it still works though. Logically, it may be sound but it doesn’t take into account that we’re human and we can be prone to missing things. In reality, all that this particular piece of code is looking for is a way to get hold of an email address. The very thing that this sendEmail() method needs is a guarantee that the incoming $recipient parameter can provide it.

    public function sendEmail(Emailable $recipient) {
        $toAddress = $recipient->getEmailAddress();
        ...
    }

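For the type-hinted method above to work, something along the following lines has to exist. This is only a sketch: the constructors, the Mailer class that hosts sendEmail(), and the choice to return the address rather than actually send anything are all assumptions made so that the example can run on its own.

```php
<?php

interface Emailable {
    public function getEmailAddress();
}

class Student implements Emailable {
    private $email;

    public function __construct($email) {
        $this->email = $email;
    }

    public function getEmailAddress() {
        return $this->email;
    }
}

// A user type added later on. Nothing in Mailer has to change.
class Guardian implements Emailable {
    private $email;

    public function __construct($email) {
        $this->email = $email;
    }

    public function getEmailAddress() {
        return $this->email;
    }
}

// Hypothetical host class for the sendEmail() method.
class Mailer {
    public function sendEmail(Emailable $recipient) {
        $toAddress = $recipient->getEmailAddress();
        // Assumption: real delivery is out of scope here, so we
        // simply hand back the address that would have been used.
        return $toAddress;
    }
}

$mailer = new Mailer();
$sentTo = $mailer->sendEmail(new Guardian('guardian@example.com'));
```

Note that the Guardian class was introduced without touching sendEmail() at all, which is precisely the hunt through the codebase that the conditional version would have forced upon us.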
Now, as long as we’ve coded up an interface called Emailable that defines the getEmailAddress() method, we can send any old object to the sendEmail() method and it’ll just work, so long as it’s wearing the appropriate badge. No more hunting through the code base to look for potential problems every time we introduce a new class of a particular type, and more importantly, no more bugs cropping up because we accidentally missed one or more locations where we should have incorporated those changes. The end result is much cleaner, more readable code.

In short then, an interface is a public declaration of the methods that an implementing object provides. Such a public declaration is made purely for the benefit of the client code that will interact with such objects. As far as methods go then, it follows that an interface may only define methods that are intended to be public - client code doesn’t care, or rather shouldn’t care about an object’s private and protected methods simply because they’re not accessible. When an interface guarantees that
a particular method is included in an object, client code is only interested in the methods that it can actually invoke. It has no business knowing about the object’s internals. Again, an interface is a public declaration made for the benefit of an implementing object’s collaborators.

There are a couple of pretty important things to remember here though. Important as in “When is your spouse’s birthday” important. The first is that an interface never declares the actual implementation logic that goes into a method. The guarantee being offered only extends to declaring the method names that are provided, and the list of expected parameters for each of those methods. There is, of course, no guarantee that the named methods will behave as the method name might suggest. That particular responsibility is down to you and/or the members of your team. As such, the method names declared in an interface are always written down as a single line of code terminated with a semi-colon, like this:

    public function methodName($arg1, $arg2 ...);

That right there is called the method signature, which in turn consists precisely of the method name itself and a declaration of any parameters that the method requires. Whenever the opportunity arises the developer exhibiting PHP Brilliance will stick the appropriate type hints in there too, leading to a method specification that looks more like this one.

    public function setAdapter(AdapterInterface $adapter);

The second pretty important thing to note is that, whilst we are waiting for PHP7 to go mainstream, our interfaces cannot declare the data type of any values that might be returned from invoking the declared methods. What that means in reality is that even if an interface guarantees that a getEmailAddress() method will be present, the guarantee doesn’t extend as far as ensuring the implemented method will give you back an email address, or that the return value will even be a string. PHP7 does give us that ability to some degree but in the meantime, that responsibility is also squarely on the shoulders of you and your team.
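For readers who already have PHP 7 to hand, the following sketch shows roughly what that extended guarantee looks like. It assumes PHP 7.0 or later; the Emailable interface is reused from earlier in the chapter, whilst the Student constructor and the deliberately faulty BrokenRecipient class are hypothetical and exist only to show what happens when the declared return type is not honoured.

```php
<?php

interface Emailable {
    // PHP 7+: the guarantee now covers the return type as well.
    public function getEmailAddress(): string;
}

class Student implements Emailable {
    private $email;

    public function __construct(string $email) {
        $this->email = $email;
    }

    public function getEmailAddress(): string {
        return $this->email;
    }
}

// Hypothetical class that wears the badge but breaks the promise.
class BrokenRecipient implements Emailable {
    public function getEmailAddress(): string {
        return null; // not a string, so PHP throws a TypeError
    }
}

$address = (new Student('student@example.com'))->getEmailAddress();

$caught = false;
try {
    (new BrokenRecipient())->getEmailAddress();
} catch (TypeError $e) {
    $caught = true;
}
```

The language still cannot tell whether the string is a valid email address, so that part of the responsibility stays with you and your team.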
An abstract class is also an interface

You knew this already, of course. Especially if you were paying attention in the chapter on abstraction. You were, weren’t you? What we have here is an opportunity to reinforce that idea because we’ve just gone through the process of thinking of an interface as declared in code as being a guarantee.

Think back for a moment to the chapter on abstraction. There we spent a fair amount of time exploring the idea that an abstract class is a means of defining a new data type. This was in recognition of the fact that the native data types that PHP gives us aren’t quite good enough for our purposes. Yes, we could hold our user data in an associative array but that doesn’t adequately represent our notion of what a user actually is. It doesn’t bear the attributes and characteristics of a piece of data of type User quite simply because it bears the attributes and characteristics of a piece of data of type Array.

And that’s the key point here too. Now that we’ve spent all this time writing up coded interfaces to represent a guaranteed form of behaviour, we should be able to take the same concept of a guarantee and apply it to the various interfaces provided by abstract classes. The key distinction here though is that an abstract class can and does provide the actual implementation logic. The provided guarantee can be extended to include the logical implementation of the declared methods. Let’s take a look at how an earlier example might be adapted to suit our needs here:

    abstract class AbstractRenderer {
        // protected rather than private, so that child classes can read the data
        protected $data;

        public function setData(array $data = []) {
            $this->data = $data;
        }

        abstract public function getTransformed();
    }

Here we have declared a new data type called AbstractRenderer, which provides an interface to the rest of our application. This interface states that whenever client
code sees an object with the data type AbstractRenderer, it can be assured that there will be two methods available. The first one is the full logical implementation of a setter for the $data property. The second one, getTransformed(), will be a specialism provided by any child classes derived from this data type blueprint. We know this because the getTransformed() method is declared as being abstract. Let’s look at a concrete implementation then.

    class CsvRenderer extends AbstractRenderer {
        public function getTransformed() {
            // logic to turn $data into a CSV string.
        }
    }

When we spark up an instance of the CsvRenderer, we can see quite clearly (largely because it’s such a skinny piece of code) that the interface declared by the abstract parent is fully honoured in the derived child. Our guarantee is in place. We’re rock solid. The abstract parent has those all important attributes and characteristics laid out for us. Just as we can write code that knows how to deal with integers and arrays, we can also write code that knows how to deal with AbstractRenderers.

Unfortunately, now comes the point where some less capable developers might start to fall over. Let’s modify our abstract class to help illustrate the upcoming points.

    abstract class AbstractRenderer {
        // protected rather than private, so that child classes can read the data
        protected $data;

        public function setData(array $data = []) {
            $this->data = $data;
        }

        public function getTransformed() {
            return print_r($this->data, true);
        }
    }

A subtle difference has been introduced, and it’s one that isn’t as uncommon as we might hope. Instead of declaring the getTransformed() method as being abstract and therefore requiring a specific implementation in the derived children, what we’ve done here is provide an actual logical implementation of the method itself. The usual justification for this sort of approach is that, well, we’re setting up the default behaviour for the child classes. We know that code duplication is bad, so if most of our children perform this transformation, then we can set up a default method to avoid coding up the same concrete implementations in the child classes themselves. Obviously this is quite a silly thing to do for the case of an AbstractRenderer data type but bear with me please.

The notion of providing ‘default’ behaviour in the abstract parent introduces some complications into our code that we have to be very mindful of. The idea behind providing a default suggests that some, but not all, child classes will override this default behaviour in order to provide their own, more specialised behaviour. However, we have just changed the data type’s interface by explicitly stating that the getTransformed() method has a return value of type string. It’s not immediately obvious but it’s right there with the line

    return print_r($this->data, true);

Do bear in mind that we want our child classes to honour the attributes and characteristics of the data type definition. To do so means that all of our child classes must also return a string value from the getTransformed() method. This makes sense when you think of the interface presented by the abstract parent as being a guarantee. It wouldn’t be worth the (digital) paper it’s printed on if child classes can ignore said guarantee and return what they like from their overriding methods. This is a crucially important factor to bear in mind. If the guarantee that you want to provide should include a specific type of return value, then by all means write that specification into the abstract parent but by doing so, you lose the ability to require that a child class provides its own implementation. That may not be such a concern just so long as you remember that the abstract parent is now providing an extended
guarantee that includes return types as well. That guarantee is there to inform client code how this data type behaves. As such, it’s our responsibility to ensure that that guarantee is honoured in every class that we derive from the parent. Some developers either aren’t aware of this or simply choose to ignore it and since the language itself places no restriction on the types of values returned from method calls, all sorts of different types of data might get sent back. This in turn leads to client code that has to perform all sorts of data type detection on return values in order to know what to do with them. This shouldn’t be the case when the abstract class has already declared the return type in its own interface: in our case above as a string. Fortunately, the changes in PHP7 go some (but not all) of the way to mitigating this particular set of circumstances. I’ve included details of how this is achieved in the appendices should you wish to scoot ahead and take a look.
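As a taster of that mitigation, the sketch below writes the return type into the abstract parent so that the language enforces it in every child. It assumes PHP 7.0 or later, declares $data as protected so that the children can reach it, and uses a deliberately naive comma join in CsvRenderer; the DebugRenderer class is hypothetical and simply inherits the default.

```php
<?php

abstract class AbstractRenderer {
    // Assumption: protected, so that child classes can read the data.
    protected $data = [];

    public function setData(array $data = []) {
        $this->data = $data;
    }

    // Default behaviour. The ": string" part needs PHP 7 or later.
    public function getTransformed(): string {
        return print_r($this->data, true);
    }
}

class CsvRenderer extends AbstractRenderer {
    // An override has to declare a compatible return type,
    // otherwise PHP refuses to compile the class.
    public function getTransformed(): string {
        return implode(',', $this->data);
    }
}

// Hypothetical child that is happy with the default behaviour.
class DebugRenderer extends AbstractRenderer {
}

$csv = new CsvRenderer();
$csv->setData(['a', 'b', 'c']);

$debug = new DebugRenderer();
$debug->setData(['a', 'b', 'c']);
```

Client code can now treat the result of getTransformed() as a string for any AbstractRenderer it is handed, without any data type detection of its own.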
The curious case of the variable constants

This idea has always “prickled” me somehow, even though it’s evidently a very useful feature. As much of a stickler as I am, the notion of a constant that can have variable values just makes me, I don’t know, itchy somehow. At least I get to breathe a little easier when I remember that a class constant can only ever hold a fixed value that is known at compile time (a scalar or, as of PHP 5.6, an array of them). Let’s take a look at how variable constants are achieved with a reasonably realistic implementation.

    abstract class AbstractDataTable {
        const TABLE_NAME = null;
    }

    class UsersTable extends AbstractDataTable {
        const TABLE_NAME = "users";
    }

    class OrdersTable extends AbstractDataTable {
        const TABLE_NAME = "orders";
    }

Clearly, this is some part of an application’s persistence layer that looks like it’s going to provide a mapping between an object’s properties and a table in a database. The variability of the constant TABLE_NAME is achieved by overriding and redeclaring the constant in the child classes. Quite straightforward and really very useful. Of course, the constancy of the value is maintained within the class, and subsequently any instances created from it. You couldn’t, for example, write code that will modify the value of the TABLE_NAME constant at run time. If that were possible, we would, quite frankly, have to switch languages.
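One practical detail is worth sketching here: for the parent class to pick up the child’s value, the constant has to be read with static:: rather than self::, which relies on late static binding (available since PHP 5.3). The getTableName() method below is a hypothetical addition for the purposes of illustration.

```php
<?php

abstract class AbstractDataTable {
    const TABLE_NAME = null;

    // Hypothetical accessor. static:: resolves against the class the
    // method was called on, whereas self:: would always read the
    // null value declared in this parent class.
    public function getTableName() {
        return static::TABLE_NAME;
    }
}

class UsersTable extends AbstractDataTable {
    const TABLE_NAME = "users";
}

class OrdersTable extends AbstractDataTable {
    const TABLE_NAME = "orders";
}

$users = new UsersTable();
$orders = new OrdersTable();

echo $users->getTableName(), "\n";  // users
echo $orders->getTableName(), "\n"; // orders
```

The shared logic lives once in the parent, whilst each child contributes nothing more than its own value for the constant.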
The joyful introduction of the constant constants

However, you do also have the ability to create constant constants. Ones that cannot be changed by anything, either at compile time or run time. If you have a use case for a truly unchangeable constant constant, you can write it into an interface instead. To illustrate this idea, here’s an example in code.

    interface MathConstantsInterface {
        const PI = "3.1415926";
        const GIBBS = "1.85193705198246617036";
    }

    class MathConstants implements MathConstantsInterface {
    }

    print MathConstants::PI; // output - "3.1415926"

There. Sanity is restored through the provision of a truly unchangeable constant. The MathConstants class cannot override these constants with definitions of its own without causing the PHP interpreter to exit with a fatal error (from PHP 8.1 onwards, the interface constants must also be declared final to keep that protection). Therefore if what you need is a very definite hard coded value that’s accessible through your class definitions and is guaranteed to not be changeable in any implementing class, the provision of a constant inside
an interface is the way to go. It gives you the same constancy as using the define() function procedurally, but without the headaches that arise from polluting the global namespace.
Which witch is which?

It’s not an uncommon question: when should you use an abstract class and when should you prefer an interface? Thankfully, we already have a rather well fleshed out answer with which to determine the most appropriate route to take. When we think back to the idea that our objects should be separated into the knowers and the doers, half of that answer is already provided for us. As a general rule of thumb, the doers should be sporting an interface and wearing that eye-catching badge that says “Hey buddy, this is what I can do.”

Which leaves us with the other half of that question to answer for the knowers in our system. Well, since we’ve already determined that our knowers are the ones that act as the caretakers of our data, isn’t it then reasonable to suggest that they can automatically be considered as data types? And if that’s the case, then clearly shouldn’t we go down the route of providing an abstract class for them? The answer to these questions is a rather ambiguous yes and no at the same time. What you need to do here is to dispense with the knowing aspect for just a moment and actually consider the requirements that you’re faced with. We have already considered how providing an abstract class is a means for specifying a new data type, but would this particular knower benefit from being defined as a data type itself?

Let’s take a look at a rather common use case in the PHP universe. If you’re working within an MVC-like framework, then your main source of knowers will be down within the model layer. This is by definition what a model layer is supposed to represent: the data that resides within our application. The only other residents of our model layer will be the objects that encapsulate our application’s business logic within the context that they operate with and upon the data within the model.
It doesn’t take too much thought to be able to determine that the units of business logic floating around in there can already be classified as doers, and as such we have already determined that they’re more likely to require an interface in preference to an abstract parent.
Some of the frameworks out there would like you to think of the model layer as the database access layer. If a developer discovering MVC for the first time happens to choose one of these ‘model equals storage’ frameworks to facilitate his or her learning then there’s every chance that that idea will become enshrined within their head. The key problem with this type of framework is that it tends to place unnecessary restrictions on your storage solutions. When you correctly separate the model layer (data) from the persistence layer (the code that handles the storing of that data), you are better equipped to find the right storage solution for each particular element of data. Users? Well, they suit an RDBMS such as MySQL perfectly well. Searchable blog posts made by those users? We’d be better served by storing them in a document database such as MongoDB or Elasticsearch. What about our users’ friends and followers? Clearly, they belong in a graph database such as Neo4j. If we can foresee a need to scale our application beyond a few thousand users, we should aim for a solution that has the data layer and the storage layer properly separated. You can’t do that with a framework whose base model is geared specifically for reading and writing rows to and from an RDBMS.
However, it’s not uncommon for the framework to expect you to create model classes by extending a base model provided by the framework itself. Is the framework’s base model an appropriate data type definition? Rarely could this question be answered in the affirmative for the simple reason that the base model tends to be loaded with convenience methods. To take a case in point, a recent framework popularity survey⁶ conducted by Sitepoint.com shows that the Laravel framework is the clear winner by quite a significant margin. Yet the Eloquent ORM implementation offers up a base model class that’s absolutely chock-full of convenience methods. Now don’t get me wrong, Laravel is not a unique offender, but when a developer uses this framework to create their own models, they’re importing over 3,000 lines of convenience methods into their model classes before they’ve written a single line of code themselves. Can such a developer remember every method that’s defined in such a base class and therefore avoid accidentally providing an overriding method with the same name? It is certainly very unlikely that a Laravel developer can recite the entire list of methods provided. Furthermore, it’s entirely unreasonable to expect them to do so.

⁶http://www.sitepoint.com/best-php-framework-2015-sitepoint-survey-results/

To provide a case in point, there’s a toJson method in there; a clear cut convenience method if ever there was one. Why would a model object need to provide a data formatting method when the responsibility clearly lies with a formatter instead? The answer appears to be as a convenience when rendering model data in response to an API call or an AJAX request. But if a model object truly required a toJson method to complete its definition as a data type, why isn’t there also a toXML method? Or a toYaml method as well?

Now don’t get me wrong. I don’t want you to start thinking that I’m “hating” on Laravel in particular, nor do I want to alienate any fans of the product. The majority of frameworks out there take a very similar approach and it should be borne in mind that these very same frameworks provide a huge range of benefits in terms of the problems that they’ve already solved. Routing? Check. Caching? Check. A consistent approach to templating? Also check. The advantages of not reinventing the wheel are well documented. This is to say nothing of the fact that the framework code is out there in the open, readily scrutinised by anyone wishing to do so. In turn, this leads to the faster identification of bugs and the faster fixing of those bugs. On its own, this is one very distinct benefit of community curated code.

However, a developer that wants to exhibit PHP Brilliance isn’t one that loses control of their interfaces. Isn’t one that accidentally overrides a method in the base class because the base class is simply too large to memorise. It’s important to remember that our client code is going to rely on those very same interfaces to guarantee a certain mode of operation. To bring ourselves back to the point, if our model classes are already extending a framework’s base model we can no longer make the decision as to whether we should provide an abstract parent class or not.
The interface, as cluttered as it might already be, is already defined for us. We have surrendered control. Does this really have to be the case? Happily, all is not lost and better yet, we’ve already stumbled across the solution by looking at a more domain model oriented approach. You might recognise at least some of the influence in this code from the first part of the book:

    class UserDataTransport extends WhateverFrameworkAbstractModel implements StorageInterface {
        ...
    }

Yes, we’ve wrapped the framework’s base model class in an arbitrary UserDataTransport class and in doing so, we’ve regained control of the decision making process for our models and their interfaces. Look.

    class UserModel {
        private $userDataTransport;

        public function __construct(StorageInterface $userDataTransport) {
            $this->userDataTransport = $userDataTransport;
        }
    }

What is this StorageInterface that we’re type hinting for? Frankly, anything that we want or need it to be - it gives us the opportunity to pass in anything that fulfils the role of carrying the actual data values and providing some sort of hook to trigger the storage of that data. What we lose is the instant access to a curmudgeonly collection of convenience methods presented on our model’s own interface but what we (re)gain is complete control of the interface itself. This is crucial to achieving PHP Brilliance. Our objects simply cannot adopt a laser-focussed intent when they’re based on extending bloated abstracts, but they can do exactly that when we abstract away the bloated abstractions themselves.

Now at last, we can achieve what we set out to do - decide for ourselves whether our model objects would benefit from a declared abstract data type. And why not, I ask? If your application is going to deal with lots of different types of user, you’re at liberty to define a User data type and create the required specialisations. You find you have a need to declare different types of product? Go for it. Your own social network might be full of diverse but readily identifiable data types.
Nevertheless, it pays to remember that the data in your models will still need to be persisted between each request. The finer details of how you implement this though are beyond the scope of a chapter dedicated to the exploration of interfaces. To this end, I’ll just leave you with something that might be a starting point for the StorageInterface.

    interface StorageInterface {
        public function setStorageAdapter(StorageAdapterInterface $adapter);
        public function getStorageAdapter();
        public function create(array $data = []);
        public function read($id);
        public function update($id, array $data = []);
        public function delete($id);
    }

Doing just this has another important side effect: The storage adapters are going to be readily mockable, leading to unit test suites that can really fly.
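To give a flavour of that substitutability, here is a rough sketch of a stand-in that keeps everything in memory. Quite a lot of it is assumed: the text never specifies what StorageAdapterInterface contains, so it is left empty here, and the InMemoryStorage class, the return values of its methods and the register() method on UserModel are all hypothetical. It also swaps out the StorageInterface implementation itself rather than an adapter, which is the same trick applied one level up.

```php
<?php

// Assumption: the adapter's methods are not specified, so it stays empty.
interface StorageAdapterInterface {
}

interface StorageInterface {
    public function setStorageAdapter(StorageAdapterInterface $adapter);
    public function getStorageAdapter();
    public function create(array $data = []);
    public function read($id);
    public function update($id, array $data = []);
    public function delete($id);
}

// Hypothetical test double: records live in an array, not a database.
class InMemoryStorage implements StorageInterface {
    private $adapter;
    private $records = [];
    private $nextId = 1;

    public function setStorageAdapter(StorageAdapterInterface $adapter) {
        $this->adapter = $adapter;
    }

    public function getStorageAdapter() {
        return $this->adapter;
    }

    public function create(array $data = []) {
        $id = $this->nextId++;
        $this->records[$id] = $data;
        return $id;
    }

    public function read($id) {
        return isset($this->records[$id]) ? $this->records[$id] : null;
    }

    public function update($id, array $data = []) {
        if (!isset($this->records[$id])) {
            return false;
        }
        $this->records[$id] = array_merge($this->records[$id], $data);
        return true;
    }

    public function delete($id) {
        unset($this->records[$id]);
    }
}

class UserModel {
    private $userDataTransport;

    public function __construct(StorageInterface $userDataTransport) {
        $this->userDataTransport = $userDataTransport;
    }

    // Hypothetical method, present only to exercise the storage.
    public function register($name) {
        return $this->userDataTransport->create(['name' => $name]);
    }
}

$storage = new InMemoryStorage();
$model = new UserModel($storage);
$id = $model->register('Ada');
```

Because UserModel only ever asks for the StorageInterface badge, a test can hand it this in-memory version and never touch a real database.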
Putting a name on our spaces.

We’ve had namespaces for six years, why aren’t you using them already? Actually you have, we all have, whether we’ve been aware of it or not. Prior to the arrival of the namespace keyword back in 2009, every bit of code ever written ended up in one regardless: the global (or root) namespace. Namespaces are a common feature of many programming languages and for good reason too, since they lend themselves well to giving us the power to properly, intelligently and logically organise our code into sensible units.
Namespaces are to code what directories are to files

Not having the ability to define our own namespaces led to us facing two competing problems. The first of these problems stems from the fact that you can’t declare more than one class or interface of the same name within the same scope. This of course makes perfect sense - you couldn’t reasonably expect the PHP interpreter to reliably instantiate an object from the correct PDO class if you’d also provided your own version. This is perhaps the most overt representation of something known as name collision. If the language itself provides a named class within the global namespace, you’re denied the opportunity for using the same name in your own classes.

A much more insidious manner in which this problem presents itself arrives when you also start importing third-party libraries of code into your projects. Consider what would happen if you defined your own User class within the global namespace. Seems reasonable, right? But then you proceeded to add in a third-party library that came with its own User class? You would have to agree that the name “User” is a pretty popular choice, but now your application would be broken. PHP would simply refuse to run it due to the presence of the two competing class names. To fix this issue, you might be tempted to rename your class to MyUser. Ugh!
I subtitled this section with an analogy to directories and files. It’s appropriate now to bring that comparison into play. When it comes to mentoring team members about namespaces, we can start with the root of a file system. On a Windows based machine, that’s normally referred to as “C:”, but since I’m a Linux type of guy I’m going to stick with the terribly simple and more *nix like “/”. The root of the filesystem is directly comparable to the global namespace in our code. We can ask our developers which they would find preferable: To drop every file that they create into just this one root location? Or would they prefer to make use of directories to impose some sanity and order on the layout of the codebase?

It’s a rhetorical question of course. Anyone with a passion for music might very well have hundreds if not thousands of mp3s on their machine but they won’t all be saved into the same ‘/home/$username/Music’ folder. In all likelihood, they will be organised into folders. First by the name of the artist in question, then by album title, giving rise to a folder structure that might just look something like this.

    /home/thunder/Music/
        Bauhaus/
            Crackle/
        Iron Maiden/
            Powerslave/
        Metallica/
            Master of Puppets/

This is namespacing at work directly within the filesystem - the logical gathering together of related materials. When the filesystem analogy is considered, it’s hard to argue against the benefits of sensibly organised folders. More significantly, we already know from experience that we simply cannot save two files with the same name in the same directory, a fact which only strengthens the analogy. Still, we haven’t always had the namespace keyword to help us out, which put us squarely into the path of the second problem. If we can’t use the same name inside the same scope, what must we do instead?
    class My_Very_Long_Class_Name_That_Is_Hopefully_Unique
    {
        ...
    }
Sorry Zend but nothing displays this particular issue better than the code in the 1.x series of Zend Framework. Let’s say that we needed to create a connection to our MySQL database, a fairly common use case. Achieving this would mean having to create a new instance of the Zend_Db_Adapter_Pdo_Mysql class. It works, for sure, but using long names for our classes leads to code that is cumbersome to work with and difficult to read. Back in the day, it was a necessary measure but the same can’t be said for today.

Both of these problems are readily solved with the application of a custom namespace, which gives us the ability to readily package up and organise our classes into sensible and logical units. Exactly in the same manner as organising files into directories. To illustrate this idea, let me perform a direct translation of that Zend class into something appropriately namespaced.
    namespace Zend\Db\Adapter\Pdo;

    class Mysql
    {
        ...
    }
Clearly then, with the provision of an appropriate namespace, the class name itself becomes shorter, more succinct and easier to comprehend. However, I don’t wish to be seen as promoting the use of an old framework for my examples so for future examples, I’ll switch up to a more contemporary offering.
    /*
     * This file is part of the Symfony package.
     *
     * (c) Fabien Potencier
     *
     */

    namespace Symfony\Component\Console;

    class Application
    {
        ...
    }
This is a stripped down representation of the Application class file within Symfony’s Console component, but you didn’t need me to tell you that, did you? Everything I said about this code can be readily deduced from the class name itself and the namespace that it’s coded under. I find this to be one of the best examples of the kind of clarity that can be achieved with appropriately namespaced code. To build a command line tool around Symfony’s console component, you need only create an instance of the Application class, configure that instance and then invoke the run() method. Now you could use the fully qualified class name for your new statement, which would look something like this:

    $myApp = new \Symfony\Component\Console\Application();
You might notice that I’ve included a leading ‘\’ in that reference to Symfony’s console application class. This is purely to make sure that we’re referencing it by the “absolute” namespace path. Without the leading ‘\’, the name is resolved relative to the current namespace, a potential source of much confusion.
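To make the difference between relative and absolute resolution concrete, here is a minimal sketch. The Vendor\Console and MyCompany\MyApp namespaces are hypothetical, and the ::class constant is used only because it reveals the name that PHP resolved without needing the class to exist.

```php
<?php
// Hypothetical vendor code living in its own namespace.
namespace Vendor\Console {
    class Application {}
}

// Our own code, written inside a different namespace.
namespace MyCompany\MyApp {
    // No leading backslash: the name is resolved relative to the
    // current namespace, so PHP builds
    // MyCompany\MyApp\Vendor\Console\Application.
    $relative = Vendor\Console\Application::class;

    // Leading backslash: the name is taken as-is from the global root.
    $absolute = \Vendor\Console\Application::class;

    echo $relative, PHP_EOL; // MyCompany\MyApp\Vendor\Console\Application
    echo $absolute, PHP_EOL; // Vendor\Console\Application

    var_dump(class_exists($relative)); // bool(false)
    var_dump(class_exists($absolute)); // bool(true)
}
```

Only the absolute form points at the class that actually exists, which is exactly the confusion that the leading backslash guards against.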
But using the fully qualified class name is hardly any better than going back to the really long class names illustrated in the ZF1 example above. Never fear though, usability is restored through the power of the use keyword.
    use Symfony\Component\Console\Application;

    ...

    $myApp = new Application();
There. Clarity has returned to our coding once more. The intent of the line $myApp = new Application(); is clear without requiring us to mentally parse the namespace path. This is because the use keyword imports that definition for us. However, whilst we’ve solved the long class name problem, we’ve just reintroduced the name collision problem. Using the word “Application” as a class name isn’t unique to the Symfony Console component; just like User, it’s a pretty popular choice. Indeed, there’s every possibility that we might want to use the very same word ourselves. Look.
    namespace CompanyName\AppName;

    use Symfony\Component\Console\Application;

    class Application
    {
        private $app;

        public function __construct(Application $app)
        {
            $this->app = $app;
        }
    }

    $myApp = new Application(new Application());
Even if PHP allowed you to do this, you would have a devil of a job trying to decipher what was supposed to be going on in there. All sense of clarity has been lost again because you could only assume that the author’s intent here was to instantiate an instance of his own Application class by passing an instance of the Symfony
console Application class to the constructor. Or maybe that should be the other way around? Trying to decipher that code would be pointless though since we would be trying to redeclare the Application class within the current scope. Quite rightly, PHP will choke and die when we try to do that. No matter though, we can fix things by aliasing the imported class and thereby freeing us up to declare our own version of the Application class. Aliasing is achieved by adding the as keyword like this.
    namespace CompanyName\AppName;

    // Format: "use $fullClassPath as $localAliasName"
    use Symfony\Component\Console\Application as sfApp;

    class Application
    {
        private $sfApp;

        public function __construct(sfApp $sfApp)
        {
            $this->sfApp = $sfApp;
        }
    }

    $myApp = new Application(new sfApp());
There. Job done. Now we have not only solved our name collision issue but we’ve also clarified the intent. Granted, there’s not a lot going on in there but at least it doesn’t take much effort to determine that our Application instance is meant to compose an instance of Symfony’s console Application. I used the word “compose” in that last sentence quite deliberately as this is the part of the chapter where I pull the generic notion of namespacing into the realms of PHP Brilliance. The days of the monolithic “one framework does all” are coming to an end, at least in terms of building commercial applications. That’s not to say that we’re suddenly building everything from scratch again. Indeed not, as we’ve already determined that reinventing the wheel is a rather pointless waste of our precious time. Nevertheless,
we no longer need to select this framework or that framework to get our projects up and running. In recent years there’s been a steady but logical and accelerating shift towards using component libraries within an application’s design, and it’s easy to see why. Every web based application tends to have the same basic needs at its core. A very simple list might look like the following.

    Component           Provides
    Routing             For mapping request paths to the code responsible for generating the response expected under that path
    Database/Storage    Most applications need to store their data somewhere
    Templating          Simplify/standardise the process of generating response bodies
In real world applications the list of high level components can grow quite rapidly: caching, authentication, authorisation and configuration to name but a few of the extras that we typically come across. When we consider that no single framework currently offers the best of breed solutions for all of our needs, it becomes harder to justify using a complete, monolithic framework at all. The developer exhibiting PHP Brilliance won’t be happy with just “making do” when the power to mix and match the “best of breed” components is quite literally at their fingertips. If you need the best of breed routing solution in your application, it really is as simple as issuing the following command.

    composer require nikic/fast-route
If you hadn’t already guessed, this is precisely why I called out the “compose” word earlier. The power of Composer⁷ is such that you can slot together your own MVC-like framework, one that uses the best of breed components for every high level requirement, in a matter of hours.

⁷https://getcomposer.org/
This of course completely destroys the argument that the all inclusive “one framework does all” approach brings any benefits with it beyond any kind of headstart offered through the developers’ own familiarity with one. For sure, I’m not suggesting that you immediately start rewriting any existing applications to switch from a monolith to a component library, not least because there’s always a certain fragility that creeps into the code whenever a developer starts out learning something new. What I am recommending though is that you do explore creating your own application in a component library fashion - simply by using fastroute alone you may surprise yourself. Nevertheless, I should also point out that the term “Monolithic” is used to refer to those frameworks that not only seek to provide all that is needed within the one package, but also make it difficult for the developer to swap out, say, the templating engine for a better one. Many of the more popular frameworks these days are also going down the route of supporting componentized libraries managed by Composer. In any event, it should always come down to selecting the best tool for the job.
What has Composer got to do with namespaces?

Quite a lot, as it turns out. With the ever increasing popularity and burgeoning use of Composer as the dependency manager of choice within the PHP universe, it becomes harder and harder to ignore the manner in which it promotes the use of the PSR-0 and more recently the PSR-4 standard for autoloaders. I won’t go into those standards here since they are explored in greater detail later on in the book. Suffice it to say here that the stringent requirements as set down in the standard actually make it far easier for us poor developers to impose some sort of coherent order on our application’s codebase. This applies both to the code itself and to how it’s set out on the file system that holds it. Let’s take a look at an example.
    namespace Vanqard\Controller;

    use Vanqard\Service\BlogService;
    use Vanqard\Controller\Interface\ControllerInterface;

    class PageController implements ControllerInterface
    {
        ...
    }
This is obviously a very lightweight example that I’ve concocted here just to help with making the few points that follow. Our first consideration then is the namespace itself. You’ll notice that I started that with ‘Vanqard’. For PSR-4 compliance, the first element of a namespace must always be what is referred to as the vendor namespace. This is a superficial but very effective way of imposing a degree of separation between your code and any libraries that you might pull in via Composer. Remember that we’re very keen to a) use namespaces in our own code to avoid polluting the global namespace and simultaneously b) avoid any name collisions that might possibly occur should we ever find ourselves with code that has identical namespace and class declarations. Picking our own unique vendor namespace makes it easy to achieve both goals. A common practice would be to use either your github/bitbucket user name or indeed, a suitable representation of your company’s name as the vendor namespace but if you’re stuck for one and don’t have either of these, you could just as easily visit Packagist⁸ and register a new account under a username of your own choosing. If the username that you pick is still available then it’s automatically a good choice for your own vendor namespace.
⁸https://packagist.org

With the vendor namespace choice out of the way, our next PSR-4 compliant consideration needs to be that we’re required to store the PageController class inside a ‘.php’ file with a filename that perfectly matches the classname itself. This covers both the spelling and the capitalisation being used. In our case here, the filename becomes PageController.php. We do need to honour the mandatory file extension of ‘.php’ too, but since I don’t think I’ve seen anything other than ‘.php’ out there in the wilds for a very long time, it’s hardly worth mentioning. As you can see, the minimum requirements that Composer imposes upon us to meet PSR-4 standards are quite readily achievable. Not only that, they’re pretty sane too! The only thing left to do is to tell Composer’s autoloader how to find our properly namespaced classes. This is achieved by adding an autoload entry to Composer’s config file, composer.json. Our simple example above would result in an entry that looks very much like this.

    {
        "require": {
            ...
        },
        "autoload": {
            "psr-4": {
                "Vanqard\\": "src/"
            }
        }
    }
All that we’ve really done here is provide a simple key/value pair that tells Composer’s PSR-4 compliant autoloader that classes beginning with the ‘Vanqard’ namespace can be found under the “src/” directory relative to the composer.json file itself. However, our work isn’t quite over yet. We’ve only told Composer about our vendor namespace. To remain compliant (and therefore functional!) we still need to consider any sub-namespaces left over after we’ve stricken off the vendor prefix. In our example above, our namespace is “Vanqard\Controller”. Once we’ve stricken the “Vanqard” prefix from the list, the only thing left is a single sub-namespace i.e. “Controller”. PSR-4 tells us that this should map directly to a directory underneath the directory that we mapped to the prefix in composer.json. Again, the spelling and capitalisation are important here too. That leads us to a filesystem layout that looks something like this.
    $PROJECT_ROOT/
        composer.json
        src/                    // <- This dir mapped to the "Vanqard" prefix
            Controller/
                Interface/
                    ControllerInterface.php
                PageController.php
            Service/
                BlogService.php
            Auth/
                Strategy/
                    Acl.php
As you can see, following even the most basic application of the PSR-4 standard leads to a filesystem layout that is not only neat and tidy but also very logical too. For the benefit of further illustration, I also include the path for a class that would be referred to as “Vanqard\Auth\Strategy\Acl”, simply to show how each sub-namespace (“Auth” and “Strategy”) here corresponds to an identically named directory. Adopting this scheme will mean that the problem of autoloading classes will already be taken care of. As always, not reinventing the wheel is a very smart thing to do. There are many more options that you can supply to Composer via its composer.json file, but we will cover those later on in the book.
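The mapping from namespace to directory described above can be sketched as a small function. This is only an illustration of the idea and not Composer’s actual implementation: the function name resolvePsr4Path is made up for this example, and it simply tries prefixes in the order given, whereas a real autoloader would prefer the longest matching prefix.

```php
<?php
/**
 * Translate a fully qualified class name into the file path that a
 * PSR-4 style autoloader would look for.
 */
function resolvePsr4Path(string $class, array $prefixes): ?string
{
    foreach ($prefixes as $prefix => $baseDir) {
        $length = strlen($prefix);

        // Skip prefixes that this class name does not start with.
        if (strncmp($class, $prefix, $length) !== 0) {
            continue;
        }

        // Strike off the prefix, then turn each remaining namespace
        // separator into a directory separator.
        $relativeClass = substr($class, $length);

        return rtrim($baseDir, '/') . '/'
            . str_replace('\\', '/', $relativeClass) . '.php';
    }

    return null; // No registered prefix matched.
}

// Same mapping as the "autoload" entry in composer.json.
$prefixes = ['Vanqard\\' => 'src/'];

echo resolvePsr4Path('Vanqard\\Controller\\PageController', $prefixes), PHP_EOL;
echo resolvePsr4Path('Vanqard\\Auth\\Strategy\\Acl', $prefixes), PHP_EOL;
```

The two calls print src/Controller/PageController.php and src/Auth/Strategy/Acl.php, matching the layout shown in the directory listing.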
Whilst I’ve presented only a very simple strategy here, a more common use case would be to “extend” the namespace prefix that we use to provide a project level degree of separation as well. If the vendor namespace represents you (or your company), then the first sub-namespace can be used to indicate the particular project that you’re working on. A small modification to the example above would end up looking like this.
    namespace Vanqard\MyApplication\Controller;

    use Vanqard\MyApplication\Service\BlogService;
    use Vanqard\MyApplication\Controller\Interface\ControllerInterface;

    class PageController implements ControllerInterface
    {
        ...
    }
The only thing that has changed there is that I’ve inserted “MyApplication” into the namespace declarations. Should we move all of our code into a new “MyApplication” folder? Well, we could if we wanted to. Doing so would make sense if we anticipated that this particular project directory would end up containing multiple sub-projects. Otherwise, there would be no need to reorganise the filesystem layout for the project. We just need to tell Composer that the namespace prefix has been changed.

    {
        "require": {
            ...
        },
        "autoload": {
            "psr-4": {
                "Vanqard\\MyApplication\\": "src/",
                "Vanqard\\MyPackage\\": "lib/mypackage/"
            }
        }
    }
For the sake of illustrating how we might accommodate a new sub-project, I’ve also added a second entry to the autoloading specification; one that maps the namespace “Vanqard\MyPackage” to a new directory. Even if you don’t use Composer to pull in and manage outside dependencies, just following these namespace guidelines for your own code and making the appropriate edits to composer.json makes autoloading super easy and entirely hassle free. With a free and simple autoloading solution already available for your projects, simply following the standards that it requires will start our projects off nicely on
the road to a high degree of cleanliness and logical separation, very much along the lines of a properly organised mp3 collection. I can’t automatically provide us with the “correct” directories and their corresponding sub-namespaces obviously but our exposure to frameworks makes it highly likely that we know most of the popular choices already.
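Composer generates its own, far more capable autoloader, so nothing like the following needs to be written by hand. Purely as a sketch of what happens behind the scenes when a prefix such as “Vanqard\MyApplication\” points at “src/”, this example builds a throwaway directory in the system temp folder (an assumption made only to keep the example self-contained) and registers a minimal loader with spl_autoload_register().

```php
<?php
// Build a throwaway project layout in the system temp directory so that
// the sketch has a real file to load.
$root = sys_get_temp_dir() . '/psr4_demo_' . uniqid();
mkdir($root . '/src/Service', 0777, true);
file_put_contents(
    $root . '/src/Service/BlogService.php',
    "<?php\nnamespace Vanqard\\MyApplication\\Service;\n\nclass BlogService {}\n"
);

// The prefix now includes the project name, yet still points at src/.
$prefixes = ['Vanqard\\MyApplication\\' => $root . '/src/'];

spl_autoload_register(function (string $class) use ($prefixes): void {
    foreach ($prefixes as $prefix => $baseDir) {
        $length = strlen($prefix);
        if (strncmp($class, $prefix, $length) !== 0) {
            continue; // Not one of ours, try the next prefix.
        }

        // Remaining sub-namespaces become directories under the base dir.
        $file = $baseDir . str_replace('\\', '/', substr($class, $length)) . '.php';
        if (is_file($file)) {
            require $file;
        }
        return;
    }
});

// No require statement in sight: first use of the class triggers the load.
$service = new \Vanqard\MyApplication\Service\BlogService();
echo get_class($service), PHP_EOL;
```

Note that the file still lives at src/Service/BlogService.php. Lengthening the prefix changed the namespace, not the directory layout.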
Conclusion

If I could condense this chapter down into just five words, the message would be “Namespace everything that you write”. The very real danger of name collisions happening within the global (root) namespace becomes a complete non-issue when all of your own code resides inside a namespace of its own. You really are better off leaving the global namespace to the language itself.

You might already have noticed that I don’t often promote the idea of lazy coding. Indeed, taking shortcuts is anathema to the PHP Brilliance manifesto. However, when it comes to a choice between trying to create your own router package or choosing one that’s already of a proven standard, the optimum approach must be to use the pre-built one just so long as you can ascertain that it is fit for your purposes and of a high enough standard. Employing Composer to manage those dependencies for you makes achieving quality relatively effortless, and this in particular includes making it easy to keep your dependencies up to date with the latest bug fixes and security patches as provided by the respective authors.

Conclusion? Namespace all the things and let Composer manage the autoloading for you.
Expressing good traits.

Back in 2012, London was host to the Olympic Games and PHP 5.4 arrived, towing behind it a truck loaded with problems and dynamite. This truck of trouble had one word painted on its side. That word? Traits.

Hang on a second, didn’t you refer to traits as being some of the juicy new “Good Stuff” not so long ago?

Yes, I did and traits do still qualify as being some of the juicy new “Good Stuff”, which we’ll get onto shortly. A truck loaded with problems and dynamite only becomes an issue if you’re not ready for them when you open the door. Hence the reason for using quite incendiary language - my aim is quite specifically to put you on your guard. In other words, it’s time to don those snake proof gloves again because, baby, this one’s gonna bite you.

First of all, we need to be clear on what traits are. Declared in a similar fashion to a class, a trait provides the means by which a closely related set of functionality can be coded up into a single unit and then, at compile time, “copied and pasted” into a regular class definition before the class definition itself is parsed. Without even worrying about the actual mechanics, this is how you need to remember it happening. In this way, traits provide a very powerful and convenient mechanism for supporting code reuse. They help us to keep our codebases DRY and that in itself automatically means we are avoiding unnecessary code duplication. The basic premise is that you define the reusable code inside the trait itself - this is the ‘copy’ part of the operation. Then you proceed to tell the PHP interpreter where that reusable code should be inserted through the use of the use keyword - correspondingly the ‘paste’ part of the operation.
I know that we just looked at the use keyword in the previous chapter on namespaces. use has two discrete modes of operation. Outside of a class, it’s used for importing a namespace, class, interface or function. Conversely, when use is used inside of a class declaration, it’s expressly for importing the contents of a trait at compile time. It may help to avoid confusion if you can hang on to the notion that use is used to import something, at least.
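The two modes of the use keyword can be seen side by side in the following sketch. The Timestamps trait and its touch() method are hypothetical, invented here only to give the keyword something to import.

```php
<?php
namespace Vanqard\MyProject\Persist {
    // Hypothetical trait, standing in for any reusable unit of code.
    trait Timestamps
    {
        public function touch(): string
        {
            return 'touched ' . static::class;
        }
    }
}

namespace Vanqard\MyProject\Model {
    // Mode one, outside the class: import a name into the current scope.
    use Vanqard\MyProject\Persist\Timestamps;

    class UserModel
    {
        // Mode two, inside the class: paste the trait's contents in here.
        use Timestamps;
    }

    $user = new UserModel();
    echo $user->touch(), PHP_EOL;

    // Lists the traits that were pasted into the object's class.
    print_r(class_uses($user));
}
```

The first use merely makes the short name Timestamps available; it is the second one, inside the class body, that actually gives UserModel its touch() method.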
Let’s proceed with a simple example to see how they work, and then proceed to explore them a little further.
    namespace Vanqard\MyProject\Model;

    class UserModel
    {
        public function save()
        {
            ... // trigger storage
        }

        ...
    }
Our starting point then is a rather anaemic looking UserModel, though obviously I’m only showing the one method that we are particularly interested in here - the save() method, which might be invoked on a model object in order to trigger the persistence of its data to whatever storage system we are using for this model. It’s reasonable to assume though that the save() method will be common to all of our model classes. Should we then make it part of an abstract base class? It could be argued that somehow triggering storage is a requirement common to all models. This would only be the case though when you can readily and comfortably apply the notion of a single “model” data type to all of the models. For many small to medium sized projects where the need for scale is relatively low, this might be appropriate. Yet it’s also possible to interpret the save() method as a convenience method. We should know by now that that’s not the PHP Brilliance way of doing things. We know that
adding convenience methods to an abstract base isn’t the way to go because code reuse via inheritance is a terrible thing. Of course, since we’re in the Traits chapter, we should look at how that is achieved with a trait. First of all, we’ll start with our modified model class.
    // File: src/Model/UserModel.php
    namespace Vanqard\MyProject\Model;

    use Vanqard\MyProject\Persist\Mysql as MySqlStorageTrait;

    class UserModel
    {
        use MySqlStorageTrait;

        ...
    }
Already, we have a few things to examine here. Outside of the class{} definition itself, I’ve used the use keyword to import the trait into the current scope and then aliased the trait so that we can refer to it locally as MySqlStorageTrait. Inside the class definition itself, I’m telling the PHP interpreter to copy the contents of the trait identified by the alias MySqlStorageTrait into the location where I’ve used the use keyword. There’s an important distinction to make here - the line containing the use keyword will be replaced by the entire contents of the trait that it refers to before any of the code is executed. This makes sense when you think about it. PHP needs to know what the final structure of a class will look like before using the finished definition of the class internally. I know that it trips a developer up occasionally, so it’s worth pointing out that this prohibits us from using a variable (such as $trait) to identify the actual trait to use. Anyway, let’s go on to look at the trait.
    // File: src/Persist/Mysql.php
    namespace Vanqard\MyProject\Persist;

    trait Mysql
    {
        public function save()
        {
            ... // trigger storage
        }
    }
Again, this is a pretty simple example. As you can see, the trait only contains the save() method that was previously located in the UserModel class. If we bear in mind that it’s the contents between the opening and closing curly braces that will be copied and pasted into the target class, we should be able to deduce that the ‘finished’ class definition that PHP sees will in fact be identical to the first example that didn’t reference the trait in the first place. Why is this advantageous? The immediate benefits for this example are threefold.

Firstly, we are now theoretically able to provide the trait’s functionality to any class of our choosing, simply by adding the use MySqlStorageTrait; statement at an appropriate location within the target class. Win.

Secondly, because the reusable code is held in a single location, it is much easier to maintain. If we need to make changes to the internal logic of the trait, or even provide additional functionality (a delete method, perhaps?), this is relatively easy to do. Since we have a single, authoritative source of the code, we have successfully avoided code duplication and no longer need to search for every instance where this code occurs. Win.

Lastly, it makes compile time switching of storage strategies very simple. Assuming that you’ve prepared the appropriate migration scripts in the background, swapping out MySqlStorage for, say, MongoDbStorage for just one specific model becomes very easy to do. Win.

Is this better than writing mysql and mongo specific convenience methods into a base class? For sure. Is this a wise thing to do? Absolutely not.
Huh? Having just presented three seemingly positive aspects to traits, I now seem to be trying to trip you up by rejecting the idea. What’s that all about then? Primarily, a trait is there to present reusable code to objects with cross-cutting concerns. In other words, objects that otherwise bear no discernible relation to each other now have the opportunity to implement a common method without the developer resorting to actual copy and pasting or making the mistake of creating a common base class to put the desired method(s) into. This is precisely what they are intended for and I’m only presenting the “switchable strategy” case here because I’ve actually encountered this approach out there in the wilds. There are in fact much better ways to implement switchable strategies and we’ll get onto those when we hit the chapters concerning architecture and application design. In the meantime, let’s just take a moment to double check that those snake-proof gloves are on nice and tight.

The key problem that we’ve introduced with those examples above, and it’s certainly true of all trait usage, is that we’ve modified the interface of the consuming class without having provided any indication that we’ve done so. Having gone to such lengths to extol the benefit of representing a conceptual interface as a form of guarantee, with the simple application of a trait we’ve now proceeded to provide unguaranteed functionality. If one of our goals is to ensure that our codebase remains robust and resistant to bugs, what we need here then is the provision of an interface. For the purpose of providing appropriate illustrations, I’m going to switch to a more commonly cited example.
    interface Loggable
    {
        public function log($msg, $level);
    }

    trait LoggingTrait
    {
        public function log($msg, $level)
        {
            ... // write to log
        }
    }

    class AnythingThatNeedsLogging implements Loggable
    {
        use LoggingTrait;
    }
I’ve excluded the namespace declarations here just to preserve clarity, but as you should be able to see, our panic is over. The class that’s consuming the LoggingTrait now also guarantees that it has Loggable functionality by implementing the corresponding interface. This then should be our golden rule: For every trait that we create, we should also provide a corresponding interface definition that covers the methods defined within that trait. Any object that subsequently consumes a trait should also implement the corresponding interface to maintain the integrity of its guarantees.
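To see what that guarantee buys us in practice, here is a minimal runnable sketch of the golden rule. The Guaranteed and Unguaranteed classes and the record() function are hypothetical names introduced for this illustration, and the trait simply echoes its input rather than writing to a real log.

```php
<?php
interface Loggable
{
    public function log($msg, $level);
}

trait LoggingTrait
{
    public function log($msg, $level)
    {
        // Stand-in for real log writing: just echo the entry.
        echo "[{$level}] {$msg}", PHP_EOL;
    }
}

// Follows the golden rule: consumes the trait AND implements the interface.
class Guaranteed implements Loggable
{
    use LoggingTrait;
}

// Ignores the golden rule: log() exists, but nothing promises that it does.
class Unguaranteed
{
    use LoggingTrait;
}

// Client code can only rely on what the interface guarantees.
function record(Loggable $target, $msg)
{
    $target->log($msg, 'info');
}

record(new Guaranteed(), 'order placed');

try {
    record(new Unguaranteed(), 'order placed');
} catch (TypeError $e) {
    echo 'Rejected: Unguaranteed is not Loggable', PHP_EOL;
}
```

Both classes end up with an identical log() method, yet only the one that implements Loggable can be handed to client code that type hints against the guarantee.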
Damn those leaky traits

Since we can’t actually specify an interface directly on a trait, we’ve seen how we need to provide an independent but very specific interface declaration for that trait. I’ve even posited that as a golden rule for the explicit reason that failing to do so is to open the door to some rather buggy scenarios down the line. Nevertheless, it’s possible for a trait to have a direct influence on the structure of any class that consumes it. This is what I mean by a “leaky” trait.

When a trait is fully introspective and concerns itself solely with its own contents, then we can consider it safe to use. Its boundaries are well defined. Its entire mode of operation is encapsulated within the scope of its own curly braces. When this is so, we have ourselves a non-leaky trait. However, since a trait is naturally composed of plain old PHP, a less well informed developer can start popping holes in our previously watertight hull, causing the trait’s concerns to dribble out all over the place. How?

The first leaky approach that we’ll inspect is achieved by declaring a trait’s method abstract. By marking a method as abstract, a trait is requiring that
any consuming class must also provide a concrete implementation of that abstract method. Returning to our LoggingTrait example, we might end up with something looking like this.

    trait LoggingTrait
    {
        private $logger;

        public function log($msg, $level)
        {
            $filename = $this->getLogFileName();
            $this->logger->setFileName($filename)
                         ->log($msg, $level);
        }

        abstract public function getLogFileName();
    }
With this example, we’ve specified that any class wishing to consume the LoggingTrait must also provide a getLogFileName() method because the code in our trait now depends on it. This is our first type of leaky and admittedly, it’s not a bad leak. Our trait is expressing an external dependency which must be provided for if the trait code is to work properly. Since the demand is an explicit one, we can accommodate it by modifying the Loggable interface accordingly and all should be well. Any class that wishes to consume the trait should also implement the interface provided for the trait.

However, we’ve gotten ourselves into a bit of a tangle. We should now be clear on the idea that an interface is the public expression of a guarantee for the specific benefit of any client code encountering it. Should getLogFileName() be part of the consumer’s public guarantee? In this specific instance, probably not. For your own applications, it should be decided on a case-by-case basis backed by the knowledge of the guarantee that your consuming classes need to provide. In the above example, we could fix this by making getLogFileName() protected (PHP 8.0 and later also allow an abstract private method in a trait) and subsequently omitting it from the interface.

Our next example of leakiness is one that we cannot guard against through an interface though. It happens when the code in a trait refers to a property that is supposed to be present in the consuming class.
    trait LoggingTrait
    {
        private $logger;

        public function log($msg, $level)
        {
            // Leaky: assumes that the consuming class defines $logFileName
            $filename = $this->logFileName;
            $this->logger->setFileName($filename)
                         ->log($msg, $level);
        }
    }
Even though PHP is quite happy to let us do this, we absolutely must not allow this sort of thing to happen for quite obvious reasons. Not for properties and not for class constants either. Sure, we can add a constant definition to the trait’s corresponding interface but since we are now well aware that constants that are defined in interfaces are of the “constant constant” variety, we are actually wasting our time entirely. The consuming class wouldn’t be able to change it, so it might as well be written into the trait directly. In any event, our trait has become far too concerned with the things that lie beyond its boundaries.

If it can’t be covered in the corresponding interface, it shouldn’t be included in the trait’s code. Methods are safe since they do have the opportunity to appear in the interface. Constants are sometimes safe as long as we remember that, when declared in the interface, they become immutable. When a trait references a property that is defined outside of itself, it is never safe.

Our next leaky scenario is one that bridges the gap between the previous leaks and the more general issue of method overrides. It’s also pretty nasty.
use Vanqard\MyProject\Logging\SimpleLogger;

trait LoggingTrait {
    private $logger;

    // No leading backslash on the type hint, so the imported class is used
    public function __construct(SimpleLogger $logger) {
        $this->logger = $logger;
    }

    public function log($msg, $level) {
        $this->logger->log($msg, $level);
    }
}
“What the heck!”, I hear you say, and quite rightly so I might add. Yes, even though I think that this is something of an oversight, PHP’s trait system will allow you to attempt to provide your own custom constructor to any consuming class that uses the trait. If the consuming class accepts the constructor then you really are in a pickle, since a class’ own constructor plays such a fundamental role in the set up of any objects that are instantiated from that class. This, to my mind, isn’t just a leak - it’s a clear cut case of firing torpedoes at the rest of our application.

When you think about it logically, a trait should have no role in governing the actual setup of the objects that might consume it. Such a notion steps well beyond the boundaries of its own concerns, especially when you recall that a trait is intended to provide common functionality to objects that otherwise have nothing in common.

You may well have noticed that I was being a bit cagey about whether the constructor actually succeeds in getting pasted into the consuming class though, and for good reason as it turns out. This is the point where we switch from being leaky to skating on thin ice.
How do I work out who gets what?

Whenever you import a trait into a consuming class it’s usually because the trait represents a certain chunk of functionality that’s required by two or more unrelated classes. As we’ve already seen, traits provide us with a convenient approach to creating reusable code without resorting to inheritance. This part we already understand.

Where things start to get a bit slippery is when we realise that a method declared within a trait will only be successfully imported into a consuming class when that consuming class hasn’t already declared its own method with the same name. Or to put it another way, only those methods that aren’t already in a consuming class will get imported from a trait. This is why I was a bit cagey earlier with our example of a LoggingTrait that included a definition for a constructor. The trait’s version of the constructor would only ever be imported into classes that don’t already declare their own. How slippery is that?

It’s perfectly apparent that the rest of the trait’s code relies on that logger property but going down the route of specifying a constructor that requires a logger instance simply isn’t appropriate, if only because from the trait’s own perspective, it has no idea which of its methods will end up being successfully imported and which ones will fail. It should certainly start to become apparent now why I was suggesting the application of snake-proof gloves at the beginning of this chapter. Traits are bitey little critters, that’s for sure.

Clearly then, we now have a need to start considering precedence. How do we determine which methods would be successfully imported, and which methods wouldn’t? Well, to begin with I’ve already covered the circumstance whereby a class that already defines a particular method won’t then be able to import an identically named method from a trait. Our first order of precedence lies with the consuming class, which is exactly as it should be.
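The constructor case is easy to check for yourself. What follows is only a minimal sketch: the QuietReport and PlainReport classes and the $origin marker property are invented for illustration, and the trait here is a cut-down stand-in rather than the LoggingTrait from earlier.

```php
<?php
// Hypothetical trait that tries to supply a constructor.
trait LoggingTrait
{
    public $origin = 'unset';

    public function __construct()
    {
        $this->origin = 'trait constructor';
    }
}

// Declares its own constructor, so the trait's version is never imported.
class QuietReport
{
    use LoggingTrait;

    public function __construct()
    {
        $this->origin = 'class constructor';
    }
}

// Declares no constructor, so the trait's version is pasted in.
class PlainReport
{
    use LoggingTrait;
}

$quiet = new QuietReport();
$plain = new PlainReport();

echo $quiet->origin, PHP_EOL; // class constructor
echo $plain->origin, PHP_EOL; // trait constructor
```

The same trait therefore behaves differently depending on what its consumer already declares, and nothing in the trait's own file gives that away.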
This does pose something of a problem as far as the trait itself is concerned given that the trait has no guarantee that all of its code will be imported wholesale. However, the consuming class’ ability to retain full control over its own interface is paramount.

The second order of precedence then lies with the trait itself. When none of the trait’s methods are explicitly defined in the consuming class, the trait will be applied in its entirety. This might prompt the unwary developer (or the developer with a very definite set of needs) to start providing some vague guarantee that all of a trait’s methods will be successfully imported. How? By pseudo-namespacing the method names themselves.

use Vanqard\MyProject\Logging\SimpleLogger;

trait LoggingTrait {
    private $logger;

    public function loggingTraitLog($msg, $level) {
        $this->loggingTraitGetLogger();
        $this->logger->log($msg, $level);
    }

    public function loggingTraitGetLogger() {
        ... // lazy load logger if not available
        return $this->logger;
    }
}
It’ll work (in most cases) but it sure is ugly. PHP Brilliance doesn’t make its mark on the world by leaving behind a trail of ugly code. This is just as bad as employing Really_Long_Class_Names in order to avoid name collisions. PHP Brilliance remembers that the code that we write is written for the benefit of one particular type of audience only: ourselves and other coders. By adhering to a policy of creating clean, logical, coherent code that remains easy to read we remain on the sunny path to a bug-free future. Or in other words, more pub time!

When you’re faced with considering the option to “namespace” a trait’s methods, it is most certainly time to consider moving the trait’s code into its own dedicated class. Doing so means that instances can be brought into being and then suitably injected into the objects that were originally intended to consume the trait. The application of a namespace-like prefix to method names is a symptom of a need for encapsulation. In this case, go right ahead and encapsulate.
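As a rough sketch of what that encapsulation might look like, the logging code moves out of the trait and into a class of its own, and an instance is then injected into the consumer. The bodies of SimpleLogger and PageController below are assumptions made up for this illustration, not the classes used elsewhere in the book.

```php
<?php
// Hypothetical stand-in for the logging code that used to live in the trait.
class SimpleLogger
{
    private $lines = [];

    public function log($msg, $level)
    {
        $this->lines[] = "[{$level}] {$msg}";
    }

    public function getLines()
    {
        return $this->lines;
    }
}

// The consumer receives a logger instead of importing one via a trait.
class PageController
{
    private $logger;

    public function __construct(SimpleLogger $logger)
    {
        $this->logger = $logger;
    }

    public function render()
    {
        $this->logger->log('Rendering page', 'info');
    }
}

$logger = new SimpleLogger();
$controller = new PageController($logger);
$controller->render();

print_r($logger->getLines());
```

No prefixes are needed because the logger's method names live behind their own object, well away from the consumer's interface.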
Nevertheless, we are still in the chapter on traits and we are still in the middle of considering how the precedence hierarchy affects the successful (or otherwise) importation of a trait’s methods. Since we have arrived at the trait level, we now need to consider the possibility of two traits providing identically named methods. Consider the following example.
trait LoggingTrait {
    private $loggingAdapter;

    public function getAdapter() {
        return $this->loggingAdapter;
    }
}

trait SmsTrait {
    private $smsAdapter;

    public function getAdapter() {
        return $this->smsAdapter;
    }
}
Here we have two independently defined traits that both provide the identically named getAdapter() method. If we then turn our attention to the consuming class, we can start to consider what might happen if we tried to import both of these traits.
class PageController implements ControllerInterface, Loggable, SmsEnabled {
    use LoggingTrait, SmsTrait;
    ...
}
We know this is going to fail, even if we’ve specified our trait specific interfaces correctly. The problem here is that with both traits specifying a getAdapter() method, the PHP interpreter simply can’t tell which one it’s supposed to import. Both are independently successful candidates but side by side, the two traits have equal precedence. Consequently, we will be required to employ some rather dubious looking code in order to indicate which one to use.
class PageController implements ControllerInterface, Loggable, SmsEnabled {
    use LoggingTrait, SmsTrait {
        // The right-hand side of insteadof names the losing trait only
        LoggingTrait::getAdapter insteadof SmsTrait;
    }
    ...
}
Whilst this satisfies the interpreter’s requirement by explicitly instructing it which of the two conflicting methods it should import, there’s clearly quite a problem going on here. It should be a reasonably safe assumption that each trait is reliant upon its own version of the getAdapter() method, yet the modifications here are telling the interpreter to favour the version in the LoggingTrait insteadof the version in the SmsTrait. If you can’t hear the alarm bells ringing at this point, it may be time to go and clean out your ears.

I think it’s a reasonably safe assumption that any appearance of the insteadof keyword in our code is an indicator of a flaw in our design more than anything else. insteadof feels “kludgey”, to be blunt. Whenever you encounter this kludgey keyword in the middle of a use declaration, you need to go and look at the trait that contains the unsuccessful method. Are you satisfied that that method is entirely optional to the trait’s operation? This needs to be considered not just for the case at hand, but in all cases. If another team member favours a different method over this one, will that lead to a bug? Of course, we might be tempted to go down that route of pseudo-namespacing our trait’s methods to avoid the possibility of any collision but as we’ve already seen, when encapsulation is called for, encapsulation should be used.

Finally, we’ve arrived at the last level in our precedence hierarchy. When a trait consuming class is itself derived from a parent class via the miracle of inheritance, any methods declared in the parent class will be overridden by any identically named methods in any trait that a child class consumes. This is actually perfectly logical when you remember that a trait is nothing more than an automated copy and paste mechanism. The end result is just the same as if you’d coded the overrides into the child class directly.

However, since the overriding methods aren’t immediately visible to the developer (they’re “hidden” behind the use keyword), caution should certainly be exercised to ensure that a trait’s methods don’t accidentally override any methods declared by the parent, not least because it’s the parent that presents that all important interface guarantee to the rest of our code. In other words, a trait can mess with your data type definitions and you wouldn’t necessarily know about it until something goes wrong.
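The relationship between parent, trait and consuming class can be sketched in a few lines. The names used here (BaseController, DescribingTrait, TraitWins and ClassWins) are hypothetical and exist only to make the ordering visible.

```php
<?php
class BaseController
{
    public function describe()
    {
        return 'parent';
    }
}

trait DescribingTrait
{
    public function describe()
    {
        return 'trait';
    }
}

// No method of its own: the trait's describe() replaces the inherited one.
class TraitWins extends BaseController
{
    use DescribingTrait;
}

// Declares describe() itself: the class beats both the trait and the parent.
class ClassWins extends BaseController
{
    use DescribingTrait;

    public function describe()
    {
        return 'class';
    }
}

$traitWins = (new TraitWins())->describe();
$classWins = (new ClassWins())->describe();

echo $traitWins, PHP_EOL; // trait
echo $classWins, PHP_EOL; // class
```

Nothing in the body of TraitWins hints that the parent's describe() has been replaced, which is exactly the visibility problem described above.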
The Yo-Yo problem… horizontalised

I haven’t exactly painted a rosy picture in support of traits, have I? Yet they do provide us with a very powerful means of achieving code reuse. We already know that our applications should be DRY, not WET. We already know that the duplication of the same pieces of code throughout our application’s code base only ever leads to a maintenance nightmare in the future. Even though I’ve taken us through quite a nightmarish landscape, our considerations of the problems that traits bring with them are not over yet, although I promise that this is the very last one!
It’s something that I’ve already alluded to briefly and it’s all about the question of visibility. When a developer is looking at the code for a particular class and that class consumes a trait, it follows that the developer no longer has the complete picture in front of them. Not unless they have both files open side-by-side. Bearing in mind that our code is written for a specifically human readership, we need to make sure that they’re comfortable with the idea that employing the use keyword inside a class means that they also have to consider a trait’s code as if it were inlined within the class in front of them.

This of course is very much like the need to check for parents in an inheritance hierarchy, but whereas a parent class requires the developer to look “up” so to speak, a trait requires a developer to look sideways. Which in turn leads to the “horizontalisation” of the Yo-Yo problem. You see, even though I haven’t even mentioned it yet, it’s perfectly possible for one trait to consume another trait, which in turn can consume yet another. As a result, we can end up with a stack of traits bulging out of the sides of our classes like some cancerous tumour.

If the Yo-Yo problem describes a situation where an inheritance hierarchy has so many levels that a developer needs to track up and down the class tree in order to decipher the flow of control through an object, the newly horizontalised Yo-Yo problem occurs when multiple traits are chained together and thus leave the developer with the difficult job of trying to work out which methods will successfully make it into the final class.
trait Foo {
    public function FooMethod() { }

    public function FooYouTooMethod() { }
}
Here’s the first trait in our proposed chain. It’s pretty straightforward so we’ll not dwell on it for long.
trait Bar {
    use Foo; // Bar trait consumes Foo trait

    public function BarMethod() { }

    public function FooMethod() { }
}
Now here comes our first oddity. The Bar trait consumes the Foo trait from before but overrides the FooMethod() to provide its own version.

trait Baz {
    use Bar; // Baz trait consumes Bar trait

    public function BazMethod() { }
}

The last trait in our chain, the Baz trait, is also very straightforward.

class AwesomeThing extends AwesomeBase {
    use Baz; // target class consumes Baz trait

    public function doStuff() { }
}
By consuming the Baz trait in our target class, we as developers need to know what kind of stuff is going to end up being pasted inside the class at compile time. This means we have to check the contents of the trait that we’re consuming. Opening up that file in our editor is going to result in us needing to open up the Bar trait, which in turn leads to opening up the file for the Foo trait. After all that, we have to somehow mentally collapse that structure down until we’ve ascertained which versions of which methods are going to end up inside AwesomeThing.

With such a simple structure, we shouldn’t have too much difficulty in spotting that the Bar::FooMethod() is likely the only cause for surprise. But when you start putting real code in there… Except our AwesomeThing class seems to be inheriting from AwesomeBase. Oh dear.

The long and the short of it is this: don’t even think of chaining traits together in this manner. If you do have a definite need for traits, try to ensure that the hierarchy is only two levels deep: the consuming class and the consumed trait. Given that we have the ability to use as many traits as necessary, if those traits are each only a single level deep and if each trait has a sensible name that clearly implies its intent, then the developer that encounters those traits in a class should at least be able to infer what’s going on. If they can’t, they need to go and look. Checking a trait to see what’s going on should not result in the need to mentally solve a “Towers of Hanoi” type puzzle game just to work out what the end result will be.
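If you’d rather let the interpreter do the collapsing, a sketch along the following lines will report which version of each method survives. It assumes two changes to the chain above, both made purely for illustration: each method returns a string naming its origin, and the AwesomeBase parent is left out so that the snippet runs on its own.

```php
<?php
trait Foo
{
    public function FooMethod() { return 'Foo::FooMethod'; }
    public function FooYouTooMethod() { return 'Foo::FooYouTooMethod'; }
}

trait Bar
{
    use Foo; // Bar trait consumes Foo trait

    public function BarMethod() { return 'Bar::BarMethod'; }
    public function FooMethod() { return 'Bar::FooMethod'; } // overrides Foo's version
}

trait Baz
{
    use Bar; // Baz trait consumes Bar trait

    public function BazMethod() { return 'Baz::BazMethod'; }
}

class AwesomeThing
{
    use Baz; // target class consumes Baz trait

    public function doStuff() { }
}

$thing = new AwesomeThing();
$origins = [];

// Call every imported method and record where its code came from.
foreach (get_class_methods($thing) as $method) {
    if ($method !== 'doStuff') {
        $origins[$method] = $thing->$method();
    }
}

print_r($origins);
```

The entry for FooMethod reports Bar::FooMethod, confirming that the version in Foo never reaches the class.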
Summary

On the whole, this may well have seemed quite a negative chapter, illustrating as it does just how many pitfalls crop up from using traits. Even so, let this knowledge be a means of empowering our wisdom on the matter. Let it put us at the distinct advantage of having enough knowledge and foresight to avoid trait related bugs creeping into our applications.

Whenever we learn something as a developer, we enrich our mental toolbox. Knowing just a few of the things in here will help drive better judgement calls when it comes to using traits. In just the same way that all of the preparations for the Millennium Bug made the Millennium Bug a non-event, we can avoid the pain of time spent fixing the bugs in our own code when those bugs never occur in the first place.

Conclusion? Traits should be used sparingly and used wisely. Keep the focus of their intent both self contained and laser sharp. Then reap the rewards from a carefully managed but powerfully effective system of code reuse. High five for the win!
Finding Closure.

Closures are funny little things. Not funny as in baby-kitten-on-youtube funny. More funny as in “If I feed this thing after midnight, is it going to wreck my apartment” funny. We’re going to look at them anyway because we’re going to be seeing a lot more of them in the future.

To start with though, you might be wondering why I’m even including them in a book that’s ostensibly about object oriented programming. This may especially be the case when you consider that they are also commonly referred to as anonymous functions. Heck, they even look like regular procedural functions too. Just, you know, without a name. The reason why I’ve dedicated a chapter to them should become clear in due course. In the meantime we need one of those “back to basics” examples.

$myFunc = function($name) {
    return "Hey {$name}";
};

echo $myFunc('Joe'); // Output: "Hey Joe"
An anonymous function then is just that: a function without a name. The immediate benefit to be derived from this is that they do not clutter up whatever namespace that they happen to reside in with one of those clunky, old-fashioned function names attached to them. This is especially pertinent to the global namespace, where many named userland functions tend to reside. That having been said, the fact that we are perfectly capable of declaring custom named functions within a namespace of our own choosing, MyVendor\MyApplication\Function for example, makes the cluttering of the global namespace a thing of the past anyway. So what’s so darned good about them then?
Portable, disposable functions

One potential benefit of assigning a function to a variable is that that function now becomes portable. Rather than creating your regular libraries of functions in files quite remote from where you actually use them, you now have the ability to set up a function exactly where it’s needed, pass it around as much as is necessary and then destroy it when you’re done.

That last part is also significant. When taken in context of a single web-based request, named functions written into required files exist for the entire duration of the request. Anonymous functions exist only for as long as you need them to and will be subject to PHP’s garbage collection once the variable that’s carrying them has gone out of scope. Of course, you may also manually destroy such a function by calling unset on the variable that it’s assigned to, like this:

unset($myFunc);
It is precisely for this reason that anonymous functions are often referred to as “throw away code”. My own preference is somewhat different to this though. When taken in this most basic sense, I prefer to treat them like macros. I certainly feel that this more appropriately conveys the utility that they provide.

“Macros are used to make a sequence of computing instructions available to the programmer as a single program statement, making the programming task less tedious and less error-prone.” - Wikipedia.org⁹

Doesn’t this better portray what can be done with such beasties? Within the body of an anonymous function, you can code up the appropriate logical processes that you need and then simply execute that “macro” at the point that it’s needed. This is precisely why the portability aspect is of particular interest. The ability to create a function in one location, assign it to a variable and then pass that variable around is of significant value.

⁹ http://en.wikipedia.org/wiki/Macro_%28computer_science%29
You might already be wondering why this part is even relevant given the fact that we’ve also looked at how they can be created in precisely the location that they’re needed. Indeed, you might already have arrived at the conclusion that if such a function, or at least the logical processes that it carries, were to be required in more than one place then surely an ordinary, regularly named function would more likely serve our purposes better. Perhaps.
Another “use” case

One distinct advantage that anonymous functions have over their more familiar named brethren is the ability to import variables from the parent scope at the point that they are defined. This aspect is another key feature to pay close attention to.

Normally, variables inside a function only live as long as it takes for the function to execute. By default, the values that are supplied as parameters to a function call are copies of the values that were available at the point where the function was invoked. The exceptions are variables passed by reference with the ampersand prefix, and objects, which behave as if passed by reference anyway (it is the object handle that gets copied).

In the bad old days, accessing a variable that was declared outside of the function’s own internal scope meant using the global keyword or accessing the $GLOBALS array itself. Doing so would make that variable available inside the function.
$driver = "mysql";

function showDriver() {
    global $driver;
    echo "Driver is set to: {$driver}";
}

showDriver();
// Output: "Driver is set to: mysql"
This is of course horrendous. Leaving variables, potentially crucially important ones, lying around in the global namespace is a recipe for disaster since they offer absolutely zero protection for the values that they are holding on to and may be changed on a whim. When an application runs to thousands of lines, attempting to track down the one that inadvertently changes the value of a global can lead to a significant loss of valuable pub time.

Which brings us back to the use keyword again. We’ve already seen how the definition of a class, interface, function or trait might be imported into our current namespace through the application of the use keyword at the top of the file. We’ve also seen how the actual contents of a particular trait can be imported into a class’ definition through the application of the use keyword within the consuming class. Now, we also get to apply the use keyword to functions as well, which adds a whole new dimension to the usability (!) of our anonymous functions. The basic syntax looks like this:

$config = array(
    "driver" => "mysql",
    "dsn" => "mysql:host=localhost; port=3306",
    "user" => "root",
    "pass" => "secret"
);

$getConfig = function() use ($config) {
    return $config;
};
Let’s begin the process of examination by looking at the use clause itself. This clause allows us to import variables into the function as they exist within the function’s parent scope. This is important. Having declared the config array here in the same scope as the function itself, it means that this array can be imported into the function. Once it’s in there, you can then proceed to send the function to any place that it’s needed, simply by passing the $getConfig variable. With a statically declared function residing in another file, this is a great deal harder to achieve.
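The phrase “as they exist” deserves a quick demonstration, since the import happens when the function is defined rather than when it is called. This is a minimal sketch; the single 'driver' key and the 'sqlite' value are assumptions chosen only to make the effect visible.

```php
<?php
$config = array('driver' => 'mysql');

// $config is copied into the closure here, at the point of definition.
$getConfig = function () use ($config) {
    return $config;
};

// Changing the original afterwards does not affect the imported copy.
$config['driver'] = 'sqlite';

$captured = $getConfig();

echo $captured['driver'], PHP_EOL; // mysql
echo $config['driver'], PHP_EOL;   // sqlite
```

Whoever ends up holding $getConfig sees the array as it stood when the function was created, no matter what happens to the original afterwards.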
What we have here then is a powerful way of transporting other variables, mostly objects, to other parts of your application. It helps you to avoid situations where you might otherwise resort to accessing global variables or reaching out to, say, a registry singleton.

It works both ways though. Rather than simply parcelling up a variable by importing from the parent scope and then sending it somewhere else, we can use this function() use() structure to have our parent scope variables affected by remote operations. What do I mean by this? Let’s look at another example.

class MessageQueue {
    private $messages = [];

    public function addMessage($message) {
        $this->messages[] = $message;
    }

    public function getMessages() {
        return $this->messages;
    }
}
Here we’ve defined a really rather simple class that will allow us to collect messages passed in via calls to the addMessage() method.

$mq = new MessageQueue();

$queueMessage = function($message) use ($mq) {
    $mq->addMessage($message);
};
In this fragment of code, we’ve instantiated a new MessageQueue object and, by using the use construct, we’ve imported that instance into the anonymous function, which is subsequently assigned to the $queueMessage variable.
Now, whenever we pass this variable around, the client code that receives it will be able to add messages to our MessageQueue instance, simply by invoking the function with an appropriate message parameter.

// $queueMessage is the closure itself, so we invoke it directly
$queueMessage('Here comes the $_POST array');

array_walk($_POST, function($value, $key) use ($queueMessage) {
    $message = "POST element {$key} set to {$value}";
    $queueMessage($message);
});
Rather cheekily I’ve jumped ahead of myself and included another anonymous function in there as a callback to an array_walk() over a $_POST array. Nevertheless, I hope you can see how we’ve managed to somehow encapsulate a connection back to the original MessageQueue instance by importing it into an anonymous function with the use keyword. Just like function parameters, variables passed in via the use() clause are by default copied in. For a function to affect the original variable, it needs to be passed in by reference. For objects, that happens automatically. But for scalars and arrays, you’ll need to remember the ampersand (&) prefix.
$name = "Jane";

$addLastName = function($lastName) use (&$name) {
    $name = $name . ' ' . $lastName;
};

$addLastName('Doe');

echo $name; // Output: "Jane Doe"
As we can see, this can be quite a powerful technique for having a “central” object affected by various disparate sub-systems simply by passing an anonymous function to those very sub-systems.
Of course, for simple one-liners such as this one, we would be much more likely to pass the original object itself. However, when you mesh this technique with the idea that such a function can be treated as a macro, that is a repeatable series of logical steps, then the value of these things should become more apparent. At the appropriate time, each sub-system can invoke that macro with any relevant, locally held parameters. Speaking of which…
Anonymous functions are callable.

All of PHP’s functions, including userland ones created by developers, are naturally callable. We would soon find ourselves struggling to do anything useful with the language if this wasn’t the case. What this section is really about is functions that can be used as callbacks, but “Callback-able functions” makes for an awkward sub-heading and doesn’t entirely convey the right meaning. I made a reference earlier to using an anonymous function as a callback. Let’s just take a moment to go over this idea.
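A minimal sketch of such a callback, assuming an input array holding the integers 1 to 5 and a $value parameter taken by reference so that array_walk() can modify the array in place:

```php
<?php
$values = array(1, 2, 3, 4, 5);

// The element is taken by reference so the change is written back to $values.
$doubler = function (&$value, $key) {
    $value = $value * 2;
};

array_walk($values, $doubler);

print_r($values);
```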
/* Output:
Array
(
    [0] => 2
    [1] => 4
    [2] => 6
    [3] => 8
    [4] => 10
)
*/
This is painfully simple but it will serve to illustrate the idea perfectly well. The anonymous function assigned to the $doubler variable is short and to the point. As a “macro”, it doesn’t do a great deal. Nevertheless, we can see that it was invoked five times, once for each element of the input $values array and its effect is visible in the array after the array_walk() function has completed. There’s no reason at all why we couldn’t have encapsulated a much more complex process into the function and fed that to array_walk() instead. Nevertheless, we’ve gone for a very simple one in this example. Given that the portability aspect is irrelevant here, we can dispense with the variable assignment and simply write the anonymous function directly into the location where it’s needed. Very much like this:
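A minimal sketch of that inline form, assuming the same five-element input array as before:

```php
<?php
$values = array(1, 2, 3, 4, 5);

// The callback is written in place and never assigned to a variable.
array_walk($values, function (&$value, $key) {
    $value = $value * 2;
});

print_r($values);
```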
/* Output:
Array
(
    [0] => 2
    [1] => 4
    [2] => 6
    [3] => 8
    [4] => 10
)
*/
Bonus! The function only exists for as long as it’s needed and is disposed of shortly after the array has been “walked”. This though is the point: Whenever you have an iterative process that requires a callback to each iteration, an anonymous function will be one of those candidates that fits the bill. It just needs to be callable.
Procedurally, this means that a test with the is_callable() function returns true, whereas in an object oriented sense it should satisfy the requirements of a “Callable” type hint. We can satisfy both of these conditions with a relatively short piece of code to illustrate these facts.
$myFunc = function() {
    echo "Hey Joe!";
};

var_dump(is_callable($myFunc));
// Output: bool(true)

// We can typehint for "Callable" too
function isCallable(Callable $func) {
    echo "Yes, it passes the typehint test";
}

isCallable($myFunc);
// Output: "Yes, it passes the typehint test"
So far then, we’ve seen a few of the benefits that anonymous functions make available to us as developers. With the ability to assign them to variables, we are able to pass them around as our needs dictate. We can create them precisely where they are needed, rather than in a relatively “distant” file. We can import variables local to the function where it is originally defined and then make those variables available to wherever the function ends up. Since they’re also disposable, we can destroy them once we’ve actually finished using them. The same can’t be said for their named cousins. After the relevant file has been required, those named function definitions, all of them, are held in memory for the duration of the request whether they are actually needed or not. We have also seen how they are callable and as such, can be used in any location where we are expected to provide a callback. They satisfy the requirements of being typehinted as “Callable” and return true from the is_callable() function.
One thing that we haven’t really considered yet is this: why on earth are they included in a part of the book entitled “Extending your object oriented brain”? The reason isn’t actually “Because they’re actually objects”, even though that part is true.
Anonymous functions are objects

Whenever you create an anonymous function and assign it to a variable (the portable characteristic), PHP turns it into an object for you. Not a plain old vanilla stdClass type object. Rather, an instance of the Closure class. There. Now we’ve pulled this thing back in line with the title of the chapter itself.

The Closure class is specific to these particular beasties. You cannot create your own instances in code by writing a statement such as $c = new Closure();. Nor can you derive child classes from it since the class is marked final internally and therefore cannot be extended. Not that you would actually want to do anything like this. If you really wanted to create your own callable object, you’re perfectly at liberty to do so. When closures originally made their appearance in PHP, they brought with them a new magic method: __invoke(). We’ll look at how this is relevant shortly, but first of all let’s concentrate on the structure of the Closure class rather than jumping ahead too soon.

Whenever the interpreter encounters an anonymous function being assigned to a variable, it will turn that function into a Closure instance. In doing so, your “function-that-is-now-an-object” picks up an interesting method along the way: bindTo(). The bindTo() method gives you the extraordinary power to brutally savage and molest the private properties of other objects.

I threw that in there just in case you were going to sleep on me. It’s true, nonetheless. The bindTo() method allows you to create a clone of the current closure, but one with its active scope bound to another object. The effect of this is that if the closure in question contains a reference to $this, the scope that $this refers to can be changed dynamically. As a result, $this can be made to reference the internal scope of a completely different object. Look.
class SecretValue {
    private $secret = 'nobody-knows';
}

$shhh = new SecretValue();
Our exceptionally limited interface here ought to be protecting the value of the $secret property quite adequately. In other words, there’s no way to get to it from the outside. Without the appropriate methods being put into place, there should be no way to read from or write to that property. However, by the power of the bindTo() method, we can pop it right out simply by binding a closure containing a reference to $this. Like, er, this.

$closure = function() {
    return $this->secret;
};
So far, nothing startling. Other than the reference to $this inside the function, it’s very much like the examples that we’ve been examining previously. But when you get to the binding part, it’s a different matter.

$boundClosure = $closure->bindTo($shhh, $shhh);

echo $boundClosure(); // Output: "nobody-knows"
The bindTo() method accepts two parameters. The first parameter is the object that the closure should be bound to. The second parameter is optional and when provided, identifies the new scope inside of which the bound closure should operate. When we assign the same object as the new scope parameter, we are effectively attaching the closure to the object as if it were a method of that object. This is despite the fact that the $boundClosure is created outside the class itself. Does this make sense? It might help to flip the idea on its head a little, although I should point out that what follows next is technically incorrect but if it helps encourage understanding, then we can be excused. Maybe.
In any case, let’s get this fallacious statement out of the way then. Once we’ve bound the closure to an object with the optional scope parameter included, the content of the closure - that is, the logic that we’ve provided - behaves like an unnamed method inside the target object itself. We can trigger that unnamed method simply by invoking the closure from the outside. That trigger was pulled on the last line of the previous code block. echo $boundClosure() is how we caused the function’s internal logic (effectively just return $this->secret;) to pop that theoretically inaccessible private property right out for us. Of course, it can work the other way too, which is why I introduced this section with the notion of being able to brutally savage private properties.

$closure = function($newValue) {
    $this->secret = $newValue;
};
$boundClosure = $closure->bindTo($shhh, $shhh);
$boundClosure('Not very secret after all');
Writing new values to an object’s private properties is terribly easy to achieve with these things. On the one hand, it’s rather a natty little technique for extracting the data out of objects without having to clutter their interface up in order to do so. On the other hand, it’s a rather sneaky technique and terribly difficult to debug. As a result, it’s not an activity that we want to engage in without having a bloody good reason to do so. A reason so compelling that it makes the ensuing loss of valuable pub time worthwhile.
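To make the role of that optional second parameter a little more concrete, here is a minimal sketch that binds the same closure twice: once with a scope and once without. It assumes PHP 7 or later, where an out-of-scope access to a private property surfaces as a catchable Error; the variable names $withScope, $withoutScope and friends are mine, purely for illustration.

```php
<?php
// Same shape as the SecretValue class used in this chapter.
class SecretValue
{
    private $secret = 'nobody-knows';
}

$shhh = new SecretValue();

$closure = function () {
    return $this->secret;
};

// Bound object AND scope: the closure behaves as if it were
// a method declared inside SecretValue.
$withScope = $closure->bindTo($shhh, SecretValue::class);
$readWithScope = $withScope();

// Bound object only: $this is $shhh, but the closure keeps its original
// (class-less) scope, so the private property stays out of reach.
$withoutScope = $closure->bindTo($shhh);
try {
    $readWithoutScope = $withoutScope();
} catch (Error $e) {
    // Assumption: PHP 7+, where this failure is thrown as an Error.
    $readWithoutScope = null;
}

var_dump($readWithScope);    // the private value
var_dump($readWithoutScope); // null, because the read was refused
```

The scope argument can be either an object or a class name, which is why passing $shhh a second time in the earlier example has the same effect as passing SecretValue::class here.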
Objects are callable too. Or at least, they can be.

I made a brief mention earlier on in this chapter that you can create your own callable objects. This is true. The magic method __invoke(), which arrived with closures, can also be implemented in a class of your own devising.
When you write up the code for this particular method in one of your classes, you make the objects of this type callable. As a result, wherever you can employ a closure as a callback, you can substitute an object instance that also supports the __invoke() method. Whether you take this approach is up to you of course. If a closure on its own doesn’t quite cut it, you can always pass your own object instance to something like the array_walk() function.
Although you’d probably still be better served by providing a closure that bridges the array_walk process with the MegaArrayTransformer anyway.
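As a rough illustration of a callable object standing in for a closure, the sketch below passes an instance to array_walk(). The ItemLabeller class and its data are entirely hypothetical and are not the transformer mentioned above; any class that defines __invoke() would serve.

```php
<?php
// Hypothetical callable class; the name and behaviour are made up for this sketch.
class ItemLabeller
{
    private $prefix;

    public function __construct($prefix)
    {
        $this->prefix = $prefix;
    }

    // array_walk() hands over each value (by reference here) and its key.
    public function __invoke(&$value, $key)
    {
        $value = $this->prefix . $value;
    }
}

$items = array('a' => 'tea', 'b' => 'biscuits');

// The object is accepted anywhere a callback is expected.
array_walk($items, new ItemLabeller('item: '));

print_r($items);
```

The one thing the object brings that a bare closure does not is a named type with its own constructor and state, which may or may not be worth the extra class.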
Building your own objects

Nevertheless, the notion of seeding existing objects with new, but unnamed, methods can be quite an attractive proposition. Albeit in very restricted circumstances. The technique that we’re about to look at has only the one vaguely justifiable use case that I know of. If you do know of more, I’d be very keen to hear of them.

In an attempt to honour the correct definition of the Model-View-Controller pattern (MVC), it’s the controller’s job to ask for a model, which it then presents to the view to allow the view to update itself. The stop-start nature of the HTTP protocol makes it difficult to follow the pattern correctly, which has resulted in the view layer being treated as “the bit that builds an HTML page for sending to the browser” (ignoring Ajax for the moment). Nevertheless, we still have the opportunity to honour the first part of the process. For a GET request, our controller’s going to go to the model layer (possibly via a service) and retrieve the model it needs to pass to the view. In the event that our view is going to display data from multiple sources, this presents us with a bit of an issue.
It isn’t the controller’s job to retrieve that data from different model sources and then mangle it together into a unit that the view can access. In spite of this, there are literally hundreds of examples presented across the web where this very practice is shown as the correct way: the action method in the controller collects the data and then prepares it for the view. We can illustrate this with a not-too-uncommon example. Let’s say that we needed a page to display customer details, along with any orders that they had placed and the corresponding status of the invoices attached to those orders. Doing it the wrong way might look like this.

// In the controller
$customerDetails = $this->modelService->fetch('Customers', $customerId);
$orderDetails = $this->modelService->fetch('Orders', $customerId);

$orderIds = [];
foreach ($orderDetails as $order) {
    $orderIds[] = $order->getOrderId();
}

$invoiceDetails = $this->modelService->fetch('Invoices', $orderIds);

$this->view->render(array(
    'customer' => $customerDetails,
    'orders' => $orderDetails,
    'invoices' => $invoiceDetails
));
Now I’m not saying for one moment that everybody does it this way but I have certainly seen this approach taken on a fairly regular basis. The problem with having a controller method mangle model data into a view consumable unit like this is that we are inevitably setting ourselves up for code duplication. It’s a DRY violation in waiting. Why? Our assumption here is that this particular controller method will be the only way to trigger the rendering of this data. If this is truly the case, now and forever always, then we can get away with it.
But only until the credit controllers in the finance department make a request to have the same information emailed to them on a monthly basis. Or head office wants to be able to grab this data as a JSON payload from an API call. Consequently, it would make much more sense for the model layer to perform the data preparation and return a single usable unit that contains all of the relevant information. The question is, does this requirement warrant the creation of a whole new model object just to carry read-only data to a rendering process? That’s not necessarily a question that we can answer in these pages. However, it doesn’t prevent us from exploring some possible alternatives.

The first alternative is indeed to go ahead and build a specific CustomerOrdersInvoices model object, which can be populated with the relevant data and passed back to the controller/API call/cron job. The problem with this approach is that we would need a dedicated class for each unit of complex data. That is potentially quite a lot of coding and really only makes sense when we need to take advantage of a properly defined interface and the logical controls that we can impose upon it. In a read-only scenario, that is often less important since the when and how of rendering that data is not a responsibility that our model layer should be concerning itself with.

Another alternative to consider then is the ability to build up a dynamically created object using closures instead of hard-coded methods. This concept starts with an appropriately reusable skeleton class such as this one.

class ViewData {
    private $methods = array();

    public function addMethod($methodName, callable $func) {
        $this->methods[$methodName] = $func;
    }

    public function __call($methodName, $args) {
        // Check that the name was registered before testing callability
        if (isset($this->methods[$methodName]) && is_callable($this->methods[$methodName])) {
            return call_user_func_array($this->methods[$methodName], $args);
        }
    }
}
With such a skeleton object, we can load it up with as many “public” methods as required. For our Customer/Orders/Invoices unit, we might employ some code that looks a little like this.

// Inside the model layer
$viewData = new ViewData();

$viewData->addMethod('getCustomerName', function() use ($customerModel) {
    return $customerModel->getName();
});

// Attach each invoice to the order that it belongs to
array_walk($orderCollection, function($order) use ($invoiceCollection) {
    foreach ($invoiceCollection as $invoice) {
        if ($invoice->getOrderId() === $order->getId()) {
            $order->setInvoice($invoice);
        }
    }
});

$viewData->addMethod('getOrders', function() use ($orderCollection) {
    return $orderCollection;
});

return $viewData;
For brevity, I’ve left it to your imagination as to how we acquire the $customerModel, $orderCollection and $invoiceCollection variables. Also for brevity, the closures here simply return the original model objects. For a read-only consumer, the return statements could quite readily be fashioned into sending back scalar values or arrays instead. “But isn’t that simply duplicating code anyway?”, I hear you shout. We would certainly be right to question an approach that involves creating what are little more than proxy methods to the underlying sources of our data. In the case of
extracting the customer’s information from the $customerModel instance, virtually every closure that we created would be one of these proxies. Fortunately, we don’t have to duplicate the code at all. We can use reflection for simple return values that don’t require the same sort of mangling as the array_walk() process conducted on the $orderCollection and $invoiceCollection objects. Doing so lets us pluck existing methods from other objects and build our ViewData instance with them.

$reflectedMethod = new ReflectionMethod($customerModel, 'getName');
$viewData->addMethod('getName', $reflectedMethod->getClosure($customerModel));
That’s just two lines of code for every method that we wish to lift from the $customerModel instance and add to our ViewData instance. The first line generates the reflected method, whereas the second line provides us with an active closure whose scope is bound to the original model. In this way we can dynamically build our new object with methods drawn from a variety of source models. Since each closure retains its original scope, it behaves precisely as if we’re invoking the method on the original object itself. Without the need for processing the return data, we can simply cherry-pick the relevant getters from the models that we’re working with and present a purely “read only” composite object back to the controller. In any case, once we’ve populated that $viewData instance with the appropriate methods it can be returned to the controller and consequently passed to the view ready, willing and able to provide the elements of data required to complete the page. It’s the implementation of the magic __call() method that allows this thing to pretend that it’s a regular object.
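public function __call($methodName, $args) {
    // Check that the name was registered before testing callability
    if (isset($this->methods[$methodName]) && is_callable($this->methods[$methodName])) {
        return call_user_func_array($this->methods[$methodName], $args);
    }
}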
The end result? The view is able to access the required data by invoking the individual closures as if they were public methods.

Account Number: <?= $model->getAccountNumber(); ?>
Customer Name: <?= $model->getCustomerName(); ?>

<?php foreach ($model->getOrders() as $order): ?>
    // render order details as desired.
<?php endforeach; ?>
Obviously, you’d adjust the output generation to your own requirements. All that we are looking at here is how the invocation of the closures resembles the calling of object methods.
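Pulling the pieces of this section together, the following is a minimal, self-contained sketch of the skeleton class being fed a reflected method. The Customer class and the name 'Ada' are hypothetical stand-ins for a real model, and throwing a BadMethodCallException for unknown names is my own assumption rather than part of the skeleton shown earlier.

```php
<?php
class ViewData
{
    private $methods = array();

    public function addMethod($methodName, callable $func)
    {
        $this->methods[$methodName] = $func;
    }

    public function __call($methodName, $args)
    {
        if (!isset($this->methods[$methodName])) {
            // Assumption: failing loudly beats quietly returning null.
            throw new BadMethodCallException("No such method: $methodName");
        }
        return call_user_func_array($this->methods[$methodName], $args);
    }
}

// Hypothetical model, standing in for $customerModel.
class Customer
{
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}

$customerModel = new Customer('Ada');
$viewData = new ViewData();

// Lift the existing getter; the resulting closure stays bound to $customerModel.
$reflectedMethod = new ReflectionMethod($customerModel, 'getName');
$viewData->addMethod('getCustomerName', $reflectedMethod->getClosure($customerModel));

echo $viewData->getCustomerName(), PHP_EOL;
```

Note that the composite object can expose the getter under a different name ('getCustomerName' rather than 'getName'), which is handy when two source models happen to share a method name.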
And it’s back to the macros again.

A chapter on closures wouldn’t be complete without making mention of the most likely circumstances in which you’re going to encounter them in the wild. In recent years, much has been made in the PHP universe of the idea of Dependency Injection Containers, and less frequently but no less importantly about the art and science of routing. If you’ve come into contact with either of these concepts of late, you will have been hard pressed not to notice that they also use closures quite extensively. As a result, these two concepts provide an ideal way for us to examine closure usage within real world contexts. Our first example is a routing one and comes directly from the home page of the Slim microframework¹⁰.

¹⁰ http://www.slimframework.com/
<?php
$app = new \Slim\Slim();
$app->get('/hello/:name', function ($name) {
    echo "Hello, $name";
});
$app->run();
This sort of approach to routing is becoming ubiquitous in the PHP framework world, and for good reason too. It’s a very effective, immensely flexible approach for mapping that macro-like function against a URL pattern, thereby allowing for the automatic execution of the enclosed code based on the result of matching patterns against the incoming request path. Despite the simplicity of the code above it does do an excellent job of illustrating the relevant aspects that we should consider here. On the one hand, we have the relevant pattern that will trigger the execution of the macro. On the other hand, we have the macro definition itself, which will be triggered whenever the pattern is matched. Of course, the body of this macro, this closure, is a painfully simple one. A little, subtle modification will also serve to introduce the next idea quite admirably.

$app->get('/blog/view/:postId', function ($postId) use ($app) {
    $service = new BlogService();
    $controller = new BlogController($app, $service);
    $controller->renderPostAction($postId);
});
Ok, perhaps not so subtle but at least our callable has now been fleshed out with code that is a little more relevant, and in doing so helps us to consider it as being more like a macro. Within the body of our callable are two lines of code that result in the instantiation of a BlogService object and a BlogController object. Given the fact that the contents of this callable will not be executed until the callable itself is invoked, it should be easy to see how we are delaying the instantiation of the service and controller objects until the very last moment.
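The deferred execution described above can be seen in a toy route table. This is a sketch only: TinyRouter is a made-up class that matches paths exactly, and it makes no claim to resemble how Slim works internally.

```php
<?php
// A toy route table; an assumption for illustration, not Slim's implementation.
class TinyRouter
{
    private $routes = array();

    public function get($path, callable $handler)
    {
        // The closure is stored here, not executed.
        $this->routes[$path] = $handler;
    }

    public function dispatch($path)
    {
        if (!isset($this->routes[$path])) {
            return null;
        }
        // Only now does the body of the matched closure run.
        return call_user_func($this->routes[$path]);
    }
}

$log = array();
$router = new TinyRouter();

$router->get('/blog', function () use (&$log) {
    $log[] = 'blog handler ran';
    return 'blog page';
});

$router->get('/shop', function () use (&$log) {
    $log[] = 'shop handler ran';
    return 'shop page';
});

$response = $router->dispatch('/blog');

// Only the matched closure has run; the '/shop' body was never touched.
print_r($log);
```

Anything instantiated inside the '/shop' closure would never have been created for this request, which is the whole point of wrapping the work in a callable.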
When used in this way, closures very positively help us support the notion of lazy loading. For sure, this approach will cause us to expend a fair bit of time in the early stages of setup and configuration but the benefits to be gained will compensate us for our early efforts admirably. Assuming that we’re using a cogent autoloading strategy, creating closures to facilitate the instantiation of objects only at the point that they are needed spares us the cost of wasted server resources and provides for maximum efficiency in handling an end user’s request and delivering the response. This concept of lazy loading objects on demand is a key feature of Dependency Injection Containers. We will take a closer look at some of these during our exploration of dependency injection later on in this book. Nevertheless, for our purposes here let’s take a brief look at Pimple¹¹, the DI container from SensioLabs. Again, I’m going to borrow a little code directly from the homepage so that we can pick over it in similar fashion to the routing example provided by Slim above.
// define some services
$container['session_storage'] = function ($c) {
    return new SessionStorage('SESSION_ID');
};

$container['session'] = function ($c) {
    return new Session($c['session_storage']);
};

// get the session object
$session = $container['session'];
It’s barely worth noting, except purely for the purpose of avoiding confusion, that the Container class in Pimple implements the ArrayAccess interface, which is why we’re seeing closures assigned with the array-style square bracket accessors.

¹¹ http://pimple.sensiolabs.org/
In any case, we can still trace this code backwards to examine why this sort of closure based approach is desirable. Starting with the very last line, we’re assigning the $session variable here with the content of the ‘session’ key stored in the container. With Pimple, the first time you access one of the container’s keys, the callable stored against it is invoked, yielding its return value. In this case, it’s an instance of the Session class. The truly relevant part at this stage is that the corresponding closure doesn’t simply return the new instance in isolation but configures that instance with the value returned from another one of the closures; this time it’s the one keyed against ‘session_storage’. Internally, this session storage closure instantiates and returns a configured storage object. This is where we start to reap the rewards from the effort of setting up the container in the first place. By laying down the configuration, with all its incumbent closures in one location, we can get to appreciate the power and simplicity of the final line in that code snippet again. $session = $container['session'];
Far away from the original configuration and setup, our $session variable is being populated with a properly configured Session instance, one that is carrying all of its dependencies and ready to be used. Prior to the execution of this line, the session objects didn’t exist in memory and the class definitions hadn’t been parsed. This results in performance gains that are of great benefit to us if we’re hoping to construct something that is more than a basic “tin pot” website. Nevertheless, and more to the point, this single line of client code has no idea what’s going on in the background, nor does it need to know such details. This simple detail makes it possible for us to change the implementation of sessions and their storage quickly and easily. As long as the new session code continues to honour the interface of the previous implementation, none of the session’s collaborators will be broken by the change.
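For the curious, the behaviour described here (closures stored against keys, invoked on first access, with the result kept for later) can be approximated in a few lines. This is a sketch under assumptions: TinyContainer is a hypothetical class, not Pimple’s actual code, and it targets PHP 7.1 or later because of the void return types.

```php
<?php
// A toy container; an assumption for illustration, not Pimple's implementation.
class TinyContainer implements ArrayAccess
{
    private $factories = array();
    private $instances = array();

    public function offsetSet($id, $factory): void
    {
        $this->factories[$id] = $factory;
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($id)
    {
        if (!array_key_exists($id, $this->instances)) {
            // First access: run the closure, handing it the container itself.
            $this->instances[$id] = $this->factories[$id]($this);
        }
        return $this->instances[$id];
    }

    public function offsetExists($id): bool
    {
        return isset($this->factories[$id]);
    }

    public function offsetUnset($id): void
    {
        unset($this->factories[$id], $this->instances[$id]);
    }
}

$created = 0;
$container = new TinyContainer();

$container['session_storage'] = function ($c) use (&$created) {
    $created++; // counts how many times the storage is actually built
    return new ArrayObject(array('id' => 'SESSION_ID'));
};

$container['session'] = function ($c) {
    $session = new stdClass();
    $session->storage = $c['session_storage']; // dependency resolved via the container
    return $session;
};

$before  = $created;               // nothing has been built yet
$session = $container['session'];  // builds the session and its storage
$again   = $container['session'];  // served from the stored instance

var_dump($before, $created, $session === $again);
```

The client code at the bottom never mentions the storage at all, which is the property that makes swapping implementations so painless.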
Summary

As I suggested at the start, it’s probably wise not to feed these critters after midnight.
On the one hand, we’ve seen how they can be used to gain full read/write access to even the most fervently defended private properties of another object. Not doing this is a particularly wise stance to take. On the other hand, we have also considered how we can create entirely new objects populated with methods that have been cherry-picked from other real, live object instances and where those methods still retain their original scope. For the purposes of creating a read-only composite object from a variety of models, this is something of a boon. However, and as previously noted, such a technique has a very limited use case in real world applications. We certainly couldn’t present such a promethean construct to client code for further processing since the thing doesn’t come with its own interface. Such a lack of guarantee as provided either by an abstract base class or an implemented interface would mean having to resort to duck typing. With one of these “Frankenstein’s Monsters” to hand, even that would be painful.

The key benefits that can be derived from the use of closures then include the three that have been highlighted here. They’re portable, which means they provide a convenient means of passing around pre-constructed logical sequences, with or without data elements that were present in their parent scope, and which can subsequently be executed far away from the point of their inception. They are also disposable, which quite simply means we can dispense with them quite readily after they have been used. Finally, due to their “save it for later” macro style construction, we get to lazily load only the resources that we actually need in order to serve a particular request. With an appropriate autoloading scheme in place, such as that which is provided by Composer, this also means that our application will only ever require the files for the classes that get used, and none of the files for the ones that do not.
Talking points.

We have arrived at the end of Part Two and in doing so, we are reaching the end of the foundational aspects of PHP Brilliance. Before we move on to the fascinating topics of programming principles, paradigms and design patterns, there are still a few things that we need to tie up. So here we are again at another “Talking points”.
So, which is it? Contract or guarantee?

Way back in the chapter on interfaces, I expressed a preference for the word guarantee. This is despite the fact that the rest of the world calls them contracts. Just to be clear, I’m not about to make a stand on the issue and declare that the rest of the world is wrong. That would be just plain silly. Nevertheless, I think there is some value in exploring these ideas in here and away from the interface chapter itself. As I mentioned previously, it’s largely a matter of preference. The end result should be the same: we are looking for an enforceable agreement.

In normal circumstances, a contract refers to something in which two or more parties come to a mutual agreement. For instance, you enter into a contract of sale whenever you visit a store to exchange cold, hard cash in return for the latest all-singing, all-dancing techno-gadget. Or whenever you pay for yet another Minecraft content pack download. It’s a contract because each party gives something and receives something back, based on those mutually agreeable terms. You agree to provide money and receive a product. The store agrees to provide a product and receive money.

That being said, the concept of the contract in this case stems from the Design By Contract¹² approach laid down by Bertrand Meyer in 1986, and subsequently thoroughly documented in his book Object-Oriented Software Construction.

¹² https://archive.eiffel.com/doc/manuals/technology/contract/
Design By Contract (DbC) is an idea worth reading up on, even though it’s clearly geared towards creating software in Eiffel. However, following DbC to the letter leads to violations of the Liskov Substitution Principle (coming up in Part Four) and therefore leads to application code that is not type safe. Just so that you know.
In contrast to this, a guarantee is typically a one-sided affair, a unilaterally made promise to honour the terms laid out in the guarantee itself. When the terms of the guarantee are written out, it’s quite often done so without knowing who the interested parties are going to be. At the end of the day, the question of whether it should be considered a contract or a guarantee is rather a moot one. The contract term applies perfectly well when you consider the two parties involved in a logical process, the object and the client code that is collaborating with it. The proposal of the term guarantee, being a much more unilaterally made arrangement, is offered with regard to the intended audience, the readers of our code.

When a white goods manufacturer ships a brand new refrigerator, they will include a written guarantee (warranty) with it. Such a guarantee is, in theory, written as a unilaterally made agreement to fix or replace the appliance in the event of a failure. If the appliance works perfectly well during the guarantee period, then that guarantee is unlikely to be taken up. In other words, whoever purchases that refrigerator has the option to read the terms and conditions set out in that guarantee and in doing so will hopefully understand how a contract of service may be entered into with the manufacturer should the need ever arise. The manufacturer has no idea who will end up reading this document, nor indeed whether they are able to decipher the jargon and legalese that peppers the small print.

In essence, this is the idea that I’m putting forward. In the process of mentoring colleagues, I want to suggest that these interfaces are there to lay down the terms and conditions of engagement. To be read and understood prior to writing the code that will utilise these things, which is the point where the contract analogy becomes more appropriate. Contracts for code, guarantees for coders.
At the end of the day, it doesn’t amount to a hill of beans, as long as the underlying concept is understood within our respective teams.
Duck typing

I made a brief reference to the concept of duck typing in the previous chapter so I think it’s appropriate to bring it in here for a little look-see, especially given that we’ve literally just gone through that guarantee/contract malarkey. As you might have gathered already, I’m much more in favour of the guarantee style approach since we’re less likely to run into problems further down the line that way. Regardless, duck typing is the act of checking for what an object can do in order to see if it can indeed do what you want it to. What does that even mean? Rather than testing for the presence of a guarantee either through type hinting or testing with the instanceof operator, duck typing relies on a check to see if a method or property exists prior to invoking that method or reading that property. The notion is based on a quotation by the nineteenth century poet, James Whitcomb Riley, who once said:

“When I see a bird that walks like a duck, swims like a duck and quacks like a duck, I call that bird a duck.”
What this leaves us with is the idea that we are simply checking for the existence of a method, a characteristic, to determine whether that object is useful to us. To transpose Mr Riley’s quotation into code, it is effectively the same as saying that if an object provides a quack() method, then the object can be treated as if it’s an instance of a Duck regardless of whether it is or not. In code, this might look like:
public function doStuff($param) {
    if (is_callable(array($param, 'quack'))) {
        return call_user_func(array($param, 'quack'));
    }
}
Since we’re not type hinting in the method signature, we’re testing for that quack() method prior to invoking it. In this regard, as long as the thing can quack we don’t actually care what it is. There’s danger in these here hills though, danger that comes our way as another form of name collision. Admittedly, with a quack() method this will hardly ever be the case since quacking is most definitely an unambiguously specific activity. Nevertheless, invoking $employee->fire() will most likely have a different set of outcomes compared to invoking $gun->fire(). The more common the method name, the greater the opportunity for things to go wrong. How often do you see a run() method? Or a doStuff()?

Nevertheless, we will certainly encounter duck typing from time to time, especially in class constructors where it’ll tend to take the form of testing a constructor param to see whether it’s an object or a string representation of the desired object’s class name. Like this.

class MyProcessor {
    private $adapter;

    public function __construct($adapter) {
        if (is_object($adapter)) {
            $this->adapter = $adapter;
        } else if (is_string($adapter) && class_exists($adapter)) {
            $this->adapter = new $adapter;
        } else {
            throw new \Exception("Invalid adapter param");
        }
    }
}
This isn’t the approach that we’re going to take though. It isn’t the way to achieve PHP Brilliance. There’s a little too much reliance on the right thing coming in as a parameter, and you’ve already had to suffer me squawking once about throwing exceptions in constructors and likening the process to sticking your head in the waste disposal. I promise, we’ll be discussing Instantiaphobia, which includes this whole constructor business, in very short order - coming up in Part Three no less.
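The name collision mentioned a little earlier is easy to demonstrate. In this sketch the Gun, Employee and Weapon types and both functions are hypothetical, invented only to contrast a duck-typed check with a type-hinted guarantee; it assumes PHP 7 or later, where a failed type hint raises a catchable TypeError.

```php
<?php
// Hypothetical guarantee for things that are meant to be shot.
interface Weapon
{
    public function fire();
}

class Gun implements Weapon
{
    public function fire()
    {
        return 'bang';
    }
}

// Same method name, very different meaning, and no Weapon guarantee.
class Employee
{
    public function fire()
    {
        return 'P45 issued';
    }
}

// Duck typed: anything with a callable fire() gets through.
function shootDuckTyped($thing)
{
    if (is_callable(array($thing, 'fire'))) {
        return $thing->fire();
    }
    return null;
}

// Type hinted: only objects carrying the guarantee are accepted.
function shootGuaranteed(Weapon $weapon)
{
    return $weapon->fire();
}

$duckResult = shootDuckTyped(new Employee()); // happily "fires" the employee

try {
    $hintedResult = shootGuaranteed(new Employee());
} catch (TypeError $e) {
    // Assumption: PHP 7+, where the mismatch is thrown as a TypeError.
    $hintedResult = 'rejected';
}

var_dump($duckResult, $hintedResult, shootGuaranteed(new Gun()));
```

The duck-typed version is not wrong as such; it simply has no way of knowing that the wrong kind of fire() has turned up.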
Before we finish off this part of the book, it’s appropriate to bring back that dynamically composed Frankenstein’s Monster of an object that we looked at previously. It is, after all, the very reason that we’re reviewing duck typing here in the “Talking Points”. Our skeleton class, the one that has scavenged closures out of other models’ methods, implements the magic __call() method, the very thing that lets us trigger those closures in the first place. However, the presence of the magic __call() method within a class means that whenever we test an instance to see whether it has a particular callable method (i.e. duck typing), we will always get a boolean true response back quite simply because we can invoke any method name on a class that has this magic method in place. Whether the anticipated effect is achieved or not is entirely another matter. Consequently…

$viewData = new ViewData();
$viewData->addMethod('greet', function($name) {
    echo "Hello $name";
});

// 'pocahontas' was never added, yet this test still passes
if (is_callable(array($viewData, 'pocahontas'))) {
    return $viewData->pocahontas();
}
On the other hand, the structure of our skeleton class means that the method_exists() function will always return false. Like this
if (method_exists($viewData, 'greet')) {
    // not executed.
    $viewData->greet();
}
Therefore we have to go back to editing the skeleton class itself in order to support duck typing. We do this by adding a specific method to reliably give us that boolean response that we are looking for.
class ViewData {
    private $methods = array();
    ...

    public function hasMethod($methodName) {
        return (
            array_key_exists($methodName, $this->methods)
            && is_callable($this->methods[$methodName])
        );
    }
}

// Consequently, to duck type
if ($viewData->hasMethod('greet')) {
    $viewData->greet('Joe');
}
This is better of course, since it provides us with a little extra safety when handling such an awkward beastie.
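To see the three tests side by side, here is a compact, runnable version of the skeleton with hasMethod() in place. It is a sketch rather than a definitive implementation: the isset() check inside hasMethod() is a simplification that leans on addMethod() already insisting on a callable, and the $checks array exists purely to collect the results.

```php
<?php
class ViewData
{
    private $methods = array();

    public function addMethod($methodName, callable $func)
    {
        $this->methods[$methodName] = $func;
    }

    public function hasMethod($methodName)
    {
        // addMethod() only accepts callables, so presence is enough here.
        return isset($this->methods[$methodName]);
    }

    public function __call($methodName, $args)
    {
        if ($this->hasMethod($methodName)) {
            return call_user_func_array($this->methods[$methodName], $args);
        }
        return null;
    }
}

$viewData = new ViewData();
$viewData->addMethod('greet', function ($name) {
    return "Hello $name";
});

$checks = array(
    // __call() makes every method name look callable, real or not.
    'is_callable greet'      => is_callable(array($viewData, 'greet')),
    'is_callable pocahontas' => is_callable(array($viewData, 'pocahontas')),
    // method_exists() only sees methods declared on the class itself.
    'method_exists greet'    => method_exists($viewData, 'greet'),
    // hasMethod() reports what was actually added.
    'hasMethod greet'        => $viewData->hasMethod('greet'),
    'hasMethod pocahontas'   => $viewData->hasMethod('pocahontas'),
);

var_dump($checks);
```

Only hasMethod() tells the truth about both names, which is why the skeleton needs it if anyone is going to duck type against the thing.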
Moving on

Our attention has been concentrated on the finer details and small print for a little too long now. It’s time to zoom out a bit, grab ourselves a nice cup of tea and head into Part Three to start picking over patterns, principles and paradigms.
Brain Check

Welcome to the very last chapter of Part Two. This is the part where we test ourselves on the material that we’ve covered in this part of the book by asking ourselves, “Could I give a lightning talk on each one of these?” It’s a checklist and a guide for highlighting areas of further reading.

Interfaces
Interfaces allow us to express a guarantee to an object’s collaborators that the methods described are available in the implementing object. An abstract parent class also provides an interface that we can type-hint on, but in this case the guarantee is not present and the onus is upon the developers to ensure that child classes honour the parent’s interface.

Namespaces
Namespaces provide us with a means to package our code into cohesive, logical, well organised units or modules and avoid naming clashes with other, third party code that we might bring in. The autoloading standards PSR-0 and PSR-4 require name-spacing our code but provide great convenience and compatibility with other packages that follow the same standards.

Composer
Composer has become the package manager of choice in the world of PHP application development, supplanting even PEAR by providing both ease of use and extraordinary power when it comes to managing an application’s dependencies.

Traits
Traits were added to PHP in order to provide a mechanism for code reuse to classes that otherwise bear no relationship to each other. Traits are applied to a consuming class in a system-level copy-and-paste like procedure but care should be exercised when using them since not all of a trait’s methods will be imported when the consuming class already provides identically named methods.

Closures
Closures are also known as anonymous functions since they can be declared without a function name. Instead, the function that a closure represents is applied directly to a variable’s value. Internally, closures are represented as instances of the Closure class, with the functionality bound to the __invoke() magic method. Userland created classes may also provide the __invoke() magic method to allow them to be treated as callbacks. Both types may be type-hinted as, or tested for being, “callable”.

Duck typing
Duck typing is a form of testing a variable’s abilities rather than type-hinting for a particular interface or class name. In object oriented code, this generally means testing the variable for the presence of specific methods and/or properties before attempting to use those methods or properties.
Standing on Principles

“You believe any of this voodoo bullshit, Blair?” - Childs
Building on bedrock There. We have successfully navigated the foundational stuff. Don’t you feel better for it? I know I do. Even if the benefits of having done so are not yet apparent, they should do soon. Irrespective of whether it has been quite time since you last considered things like encapsulation and inheritance or not, or even encountered them in a book, what we’ve actually been doing is getting that bedrock in place. Our goal has been to ensure that we have a solid foundation upon which to build those glittering palaces of software magnificence. The relevance of the first two parts of this book will become all too apparent within the forthcoming pages. In this part of the book, we step beyond those foundations and start putting into place what will eventually become the subliminal techniques that masters of their craft somehow exhibit naturally. There comes a part in a craftsman’s career where conscious thought is no longer required when it comes to applying a hammer blow to the head of a chisel or to the stroke of a brush against canvas. Experience dictates how hard or how fast the action should be. When such a craftsman finds himself stopping to guage the speed, or strength, or direction of the action, that’s the point where they fluff it. Here then we’ll be considering the principles that will guide the software craftsman’s hand, with just enough repetition to encourage them to become more “muscle memory” than conscious thought. Over the coming pages we’re going to explore the principles, patterns and paradigms that are necessary for attaining PHP Brilliance. Only the essential ones though. It isn’t the intention for this to become a pattern catalogue, not least because there are already plenty of excellent examples readily available for the developer who is interested in such. Nevertheless, those that are presented here share exceedingly close ties to either the material that we have already covered or the material that is yet to come. 208
In other words, there will be plenty of those circular references that I hinted at at the start of this book. Let’s get started then.
Ghostbusters

Remember Joe from Part One? As likeable a chap as he was, we had to let him go. No matter how likeable he was, the company couldn’t be seen to be supporting the torrenting of that kind of material, frankly. In his place, we have Lizzie. She’s ace. However, she got a bit too excited about the possibilities presented in the Closures chapter and decided it would be a good idea to build a DataMapper type of thing that would pull the data out of models and save the changed bits back into the database. It’s an entirely commendable attitude of course, but it’s also the reason why we’re starting the real content of this part of the book with an anti-pattern. All of that excitement that bubbled up with the closures stuff needs to be put in check somehow.

The anti-pattern in question is called The Poltergeist. It’s also known as the Gypsy Wagon, but I think Poltergeist is the more apt name for it since it describes a phenomenon where “something turns up, causes things to happen and disappears again”. In itself, this is a very apt description of the wrong kind of use that closures can be put to and it manifests itself in objects that have no real substance, objects whose sole apparent purpose is to make things happen in other parts of the system and then go away again.

Our consideration of the poltergeist is doubly important since we also spent a good bit of time considering how we might divide the objects of our system into the knowers and the doers. It’s around about this time that it becomes appropriate to reclassify those knowers and doers, as doing so will help us achieve an understanding of how these poltergeists are none of those.

The knowers of our system are more appropriately divided into two distinct camps. The first, larger camp is the one that we can label as the Entities. These are the key players, the models that curate our application’s data, providing both the properties that retain that data and the methods that allow the data to be manipulated in an appropriate fashion.
The second, much smaller camp of knowers consists of the value objects. These are the little blighters that allow us to transport complex variable structures around the system from one process to the next. They neither operate on the data that they hold nor perform processes because of it.

This leaves us then with the doers. These are the critters that will perform the processes that we ask of them, the ones that are more properly known as Services. As a result, we are left with entities, services and value objects instead of simply knowers and doers. Poltergeists have no home in any of these camps, which is why we need to banish them and perform the necessary rites in order to prevent their appearance.

Lizzie’s notion of using a closure to link an entity’s data with the storage process, whilst entirely commendable, is an example of a poltergeist in action, albeit an exceedingly skinny one. Nevertheless, what we are left with is an object that has no discernible purpose other than to create a rather transient bridge between a model’s data and the save mechanism. This in itself provides us with a process that is difficult to trace should an error arise but that isn’t itself the issue here. We would, in fact, be much better served by creating a more explicit bridge between the model object and the storage. As we’ve seen previously, this can be achieved by injecting the model with a data transport object, allowing the model to function properly and with access to the information that it requires and still providing the link back to the mapper as we desire.

This is a key characteristic of a poltergeist; it represents some form of control logic for making things happen, yet as an object itself it has no real substance. To take a look at another concept that we’ve already encountered and consider how it might display poltergeist-like behaviour, think back to when I suggested we remove the register() method from the User model. We might be tempted to create a single, unifying object whose sole purpose is to co-ordinate the entire user creation process. However, there’s a chance that this could lead to some poltergeist-like behaviour creeping into our code. Let’s take a look.

We will begin with the kind of code that less experienced developers are prone to writing: putting the process logic into the body of a controller action.
    // inside controller
    if ($_POST['accept_terms']) {
        $user = new User();
        $user->setEmail($_POST['email']);
        $user->setPassword($_POST['password']);
        // ...

        if ($user->save()) {
            $email = new Email();
            $email->setSubject('Please verify...');

            if ($email->send()) {
                $this->flashMsg('Congrats');
                $this->redirect('/login.php');
            } else {
                // something about failed email
            }
        } else {
            // something about failed user save
        }
    }
Looks familiar? I know many PHP developers who have done this sort of thing at least once in their coding careers, even when the registration process involves more steps than simply creating a user record and sending an email. We have already seen how this kind of data mangling inside a controller method leads to duplicated code; an entirely avoidable situation.

What would Lizzie do? Her first attempt might be to encapsulate the entire process into a single object, giving rise to a controller method that looks rather more like this.

    // inside controller
    $accepted = filter_input(INPUT_POST, 'accept_terms');

    if ($accepted) {
        $reg = new UserRegistration();
        if ($reg->addUser($_POST)) {
            $this->flashMsg('Congrats');
            $this->redirect('/login.php');
        } else {
            $errors = $reg->getErrors();
            // show errors
        }
    }
From the perspective of our controller, this is a vast improvement. Instead of being cluttered with lots of messy ifs, some of which are even nested, we now have a single call to the addUser() method on our UserRegistration() instance. The assumption here then is that the UserRegistration object co-ordinates the entire process of validating user data, hashing the password, creating the user record and any related rows and finally sending that welcome email. At first glance, this looks entirely satisfactory. After all, what we have now is an object that we only need to throw an appropriately structured array of data at in order to get a user account created and the end result is a leaner, meaner, much more succinct controller. Better yet, we seem to be honouring the “Tell, Don’t Ask” principle even though we haven’t even read about that yet. So, a step in the right direction but have we finished this particular journey yet? It all depends on that UserRegistration class and its addUser() method. If all that has happened is that we’ve moved those nested ifs into a method, then we really haven’t achieved anything.
    class UserRegistration {
        public function addUser(array $data) {
            $user = new User();
            $user->setEmail($data['email']);
            // ...

            if ($user->save()) {
                $email = new Email();
                $email->setSubject('Please verify...');
                // ...

                if ($email->send()) {
                    return true;
                } else {
                    return false;
                }
            } else {
                return false;
            }
        }
    }
Furthermore, if all that we have done is put the entire process behind a method call then we’re actually limiting ourselves to an all or nothing approach. We throw all of the necessary data at the method; if it works, then that’s great. If it doesn’t, then what? Try it all again? Even without a fully functioning crystal ball, we can be reasonably certain that at least one of our users will need that welcome email sending again, particularly if we’re asking them to verify their email address in order to activate their account. With this in mind, we can at least ascertain that we need to break the process down into two parts. Additionally, since we’ve already considered that these doers are in fact service candidates, then it’s going to be entirely reasonable for us to ask Lizzie to implement the two processes as method calls on appropriately named service classes. This might lead us to something looking like this.
    class UserRegistration {
        public function addUser(array $data) {
            $userService = new UserService();
            $user = $userService->addUser($data);

            if ($user instanceof User) {
                $emailService = new EmailService();
                $emailService->send($user, 'welcome.email.tpl');
            }
        }
    }
Apparently, we now have UserService and EmailService classes. Great. However, if a poltergeist is characterised by an object that carries little or no internal state and whose apparent purpose is merely to trigger method calls on other system objects, then this is exactly what we have here. Admittedly, this is a highly contrived route that we’ve taken in order to arrive at a poltergeist but it’s actually not that uncommon for our less experienced brethren to simply embody that sequence of button pushes into an object that appears to ‘automate’ things for us. Crafting an object with the explicit intent of automating a
complicated process seems like a perfectly valid approach but if that object brings nothing of its own to the table, then the chances are it’ll be a poltergeist.

Fortunately, fixing the code that we have here is quite simple and straightforward. Like any instance of the Poltergeist anti-pattern, it can be banished by refactoring the logic out of the body of the beast and into either the caller or the callee. This is an enduring characteristic of the Poltergeist - the fact that they have no real substance of their own is evident in their behaviour as some sort of bridge between a piece of code that needs something doing and the target code that will do what is required.

Banishing our own poltergeist here is simply a matter of moving those service calls back into the controller. The services themselves are providing the actual functionality that is needed, we just need our controller to push the buttons instead.

    // inside controller
    $accepted = filter_input(INPUT_POST, 'accept_terms');

    if ($accepted) {
        $userService = new UserService();
        $emailService = new EmailService();

        try {
            $user = $userService->addUser($_POST);
            $emailService->send($user, 'welcome.email.tpl');
            $this->flashMsg('Congrats');
            $this->redirect('/login.php');
        } catch (UserServiceException $use) {
            // Error handling
        } catch (EmailServiceException $ese) {
            // Error handling
        } catch (Exception $e) {
            // Unknown error handling
        }
    }
With these changes in place our poltergeist has been exorcised from our code base, allowing us to delete the UserRegistration class quite safely. Furthermore, the addition of the two service classes provides us with easy access to the two discrete processes of creating user accounts and sending the welcome/verification emails. This is significant in a number of ways.
Decoupling the account creation process from the controller here makes it possible to create users outside of the web context. What would the benefit of this be? If we are rolling this out to a large organisation, or we have a task to migrate users from a legacy application to our shiny new one, the ability to create user accounts in bulk is going to prove most advantageous to us. Generating user accounts from an uploaded CSV file becomes a cinch.

The benefit of providing an independent email service should be immediately apparent. Nevertheless, the fact that it has now been separated from the process of account creation means that we will have the opportunity to send a different email template to user accounts that were created in bulk - one that is eminently more appropriate. Further, for the singular web user, we can offer the facility to resend the welcome email independently of the form post that created their account in the first place.

Even though we have yet to look at the “Tell, Don’t Ask” principle (that’s coming up shortly) the elimination of this particular poltergeist has led to us satisfying that principle’s requirements. This is not an uncommon side effect of poltergeist elimination and as such, we’ll be returning to our controller/service example again shortly.

In the meantime, we have another question to answer. Given that we moved the user account creation logic into a UserService class, why isn’t this service now a poltergeist in its own right? To find the answer to this question, we need to consider the characteristics that allow us to identify a poltergeist in its own right. These are:

1. Transient and stateless objects.
2. Single operation classes that are used only to trigger other operations or sequences of operations.
3. Temporary, short lived objects that provide only control-like operations.

Our rather short-lived and limited use UserRegistration object satisfies all three of these definitions. That it was stateless was perfectly apparent - the object had no properties and no methods to manage those properties. It also provided only a single method that was used to control the account creation process and the sending of the welcome email.
In contrast, our UserService provides us with plenty of scope to build out more user related methods on the service interface. It is, after all, called a UserService and not a UserRegistrationService, which would have put us back into poltergeist territory again. The more generic UserService name allows us to flesh out this particular doer with more user related operations yet still avoid having the object lose its focus.
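To make that a little more concrete, here is a minimal sketch of how such a service might grow. Treat it as an illustration under stated assumptions rather than the book's actual implementation: the User class is pared down to an email address, the private array stands in for a real database, and findByEmail() and addUsersFromCsv() are hypothetical names invented for the example. It shows a service that keeps some state and offers several related operations, with bulk creation reusing addUser().

```php
<?php
// Hypothetical sketch: User, UserService and the in-memory "storage"
// are stand-ins invented for illustration, not the book's real classes.

class User {
    private $email;

    public function __construct($email) {
        $this->email = $email;
    }

    public function getEmail() {
        return $this->email;
    }
}

class UserService {
    /** @var User[] keyed by email; stands in for a real database */
    private $users = array();

    // Creates a single account; the same call serves web and bulk contexts.
    public function addUser(array $data) {
        if (empty($data['email']) || isset($this->users[$data['email']])) {
            throw new InvalidArgumentException('Missing or duplicate email');
        }
        $user = new User($data['email']);
        $this->users[$data['email']] = $user;

        return $user;
    }

    // A second user related operation: the interface has room to grow.
    public function findByEmail($email) {
        return isset($this->users[$email]) ? $this->users[$email] : null;
    }

    // Bulk creation reuses addUser(); failing rows are collected and returned.
    // Assumes one record per line with the email in the first column.
    public function addUsersFromCsv($csv) {
        $failed = array();
        foreach (explode("\n", trim($csv)) as $line) {
            $row = explode(',', $line);
            try {
                $this->addUser(array('email' => trim($row[0])));
            } catch (InvalidArgumentException $e) {
                $failed[] = $line;
            }
        }

        return $failed;
    }
}

// Usage: the third row is a duplicate, so it comes back as a failure.
$service = new UserService();
$failed  = $service->addUsersFromCsv(
    "lizzie@example.com\njoe@example.com\nlizzie@example.com"
);
```

Unlike the UserRegistration object, this service carries state of its own and its methods all concern the same subject, which is roughly what separates a doer from a poltergeist.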
When is a facade not a facade? “When it’s a poltergeist.” is certainly one answer that we might give. If you’re already familiar with the Facade pattern, you’ll know that it’s a pattern that is designed to provide a simplified interface to a complex operation. That indeed is the commonly accepted definition of the design pattern in question. It’s also my belief that it’s this definition and variations thereof that tend to lead developers to create poltergeists in the first place. The notion of hiding a complicated multi-stage process behind a single, simplified method call is naturally appealing. Correctly though, a facade isn’t just a push button provider but a coherent collection of methods presented as the interface on a particular package or set of related logical processes. The service classes that seemingly sprang into being during our earlier code explorations are justifiably candidates for implementations of the Facade pattern as it is entirely reasonable to expect their interfaces to grow as we proceed to develop the application. If we have built it correctly, the EmailService, whilst only presenting the single send() method thus far, provides a clear means by which client code can trigger the dispatch of an email at any point during program execution. And yet, the complexities involved in constructing an email, including the parsing of a template, remain hidden from the client code. There’s a very distinct benefit to be gained from the use of Facades such as this: we can vary the actual logic that is employed in the despatch of an email without having any effect whatsoever on the client code that needs to send one. For an application that’s “going large” this makes it supremely easy to switch from sending emails directly, to implementing a database backed queue system for email to piping them through a queueing system such as beanstalkd, RabbitMQ or even Apache Kafka.
This is a key consideration and we’ll be taking a more detailed look at this when we hit the architectural concerns addressed in Part Five. I bet you can’t wait!
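As a taster, the idea of varying the despatch logic behind an unchanging send() call might be sketched like this. Everything here other than the EmailService name and its send() method is an assumption made for illustration: the EmailTransportInterface, both transports and the array of templates are hypothetical, and send() takes a plain email address rather than a User to keep the example short. The queued transport merely pushes JSON onto an array, where a real one would talk to beanstalkd, RabbitMQ or similar.

```php
<?php
// Hypothetical sketch: the transport interface and both transports are
// invented for illustration; only EmailService::send() comes from the text.

interface EmailTransportInterface {
    public function deliver($to, $body);
}

// "Sends" immediately; here it just records what would have gone out.
class DirectTransport implements EmailTransportInterface {
    public $sent = array();

    public function deliver($to, $body) {
        $this->sent[] = array('to' => $to, 'body' => $body);
    }
}

// Defers delivery by pushing onto a queue; a worker would drain it later.
class QueuedTransport implements EmailTransportInterface {
    public $queue = array();

    public function deliver($to, $body) {
        $this->queue[] = json_encode(array('to' => $to, 'body' => $body));
    }
}

class EmailService {
    private $transport;
    private $templates;

    public function __construct(EmailTransportInterface $transport, array $templates) {
        $this->transport = $transport;
        $this->templates = $templates;
    }

    // The client-facing call stays the same whichever transport is in use.
    public function send($to, $template) {
        if (!isset($this->templates[$template])) {
            throw new InvalidArgumentException("Unknown template: $template");
        }
        // Template parsing is reduced to a single placeholder swap.
        $body = str_replace('{email}', $to, $this->templates[$template]);
        $this->transport->deliver($to, $body);
    }
}

// Usage: identical client code, two very different delivery mechanisms.
$templates = array('welcome.email.tpl' => 'Welcome, {email}!');
$direct    = new DirectTransport();
$queued    = new QueuedTransport();

$directService = new EmailService($direct, $templates);
$directService->send('lizzie@example.com', 'welcome.email.tpl');

$queuedService = new EmailService($queued, $templates);
$queuedService->send('lizzie@example.com', 'welcome.email.tpl');
```

The client code never learns which transport it was given, so switching from direct sending to a queue becomes a wiring decision rather than a rewrite.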
Summary We need to guard against the rise of poltergeists within our system. By their very nature, they introduce unnecessary complexities into our object graph. Recognisable by their control-like modes of operation, they are insubstantial and transient and as the anti-pattern definition goes, they “turn up, cause things to happen and then disappear again”. Mastery of the “Tell, Don’t Ask” principle will help prevent these troublesome apparitions from cropping up in our codebases. We will get onto that one shortly. In the meantime, please remember that whilst a broom handle might very well aid us in closing the door, flicking the light switch and changing the channel of the TV, as far as our code is concerned we need it to get up out of the goddamn arm chair and go and do those things itself.
Favour Interfaces.

Perhaps the most commonly cited principle in the object oriented programming world is this one. “Program to an interface, not an implementation.” Despite this, we won’t be devoting much time to its consideration. Why on earth not? The reason for this is short and sweet - we should have nailed it already. Lizzie has. If you haven’t yet, you’re lagging behind. Go back to the chapters on abstraction and interfaces and then report back here when you’re done.

The reason why this particular principle has been nailed, is in the bag, stems from all of that discussion on interfaces and how they provide us with a guarantee of method availability. Not only that, of course, but it’s also in the bag, has been nailed, because of all that discussion on how we can use abstract classes to lay down the blueprint for new, custom data types. Within the scope of considering either of these things what we are really concentrating on is that very desirable goal of achieving type safety.

This becomes especially relevant when we reconsider the notion that we, as developers, understand how PHP’s own, native data types behave. Yes, I am going to repeat that bit again. If we as developers can achieve a fully comprehensive understanding of how integers, strings and associative arrays work, we can create application code that works reliably with them and consequently achieve consistent results. Multiplying integers together will give us an integer result (overflow aside, where PHP quietly hands back a float). Dividing an integer by a float will give us a float result.
Unshifting an array with a string value will give us an array result that contains the unshifted string in its first index. This is type safety at work. No developer in their right mind would try to divide a string by an array and expect a boolean result. Stuff just doesn’t happen that way.

It’s this very same notion that we should be able to apply to the custom data types that we create. The abstract classes and the interface definitions that we set down in code describe things that we can type hint for in method signatures. When we type hint for these things, we are relying on the same kind of knowledge that we can apply when operating on the native data types. Just so long as we continue to honour the guarantees that an interface declaration provides. And just so long as we continue to honour the interface that our abstract classes describe. When we do this, when we ensure that our custom data types behave consistently, then we are achieving type safety within our code and by extension, we eliminate the future possibility of bugs occurring as a result of us failing to do so.

This is what underpins the idea of “programming to an interface, not an implementation”. When we can hint for a particular type of data in a method signature, the code within that method can operate reliably in conjunction with those input parameters. But only if we take those interfaces and abstract classes seriously. The key concept in all of this then is that our client code, our method bodies, should be able to operate consistently with those input parameters irrespective of the actual, concrete identity of those parameters. In real terms, honouring this particular principle means not doing this:
    class AbstractDocument {
        private $db;

        public function __construct(Mysqli $db) {
            $this->db = $db;
        }
        // ...
    }

But instead, doing this:
    use MyVendor\MyApplication\Storage\StorageInterface;

    class AbstractDocument {
        private $storage;

        public function __construct(StorageInterface $db) {
            $this->storage = $db;
        }
    }
In the first example, we are erroneously specifying that any new Document instance must be provided a very specific Mysqli instance on the constructor. What makes this erroneous is the simple fact that it ties every possible document child class to using the very specific Mysqli object interface. This isn’t a mistake that our beloved Lizzie would make since she’s already learnt that the change requests from the business team come in thick and fast and you can bet your bottom dollar that one of those requests will be to make those document records searchable. As good as MySQL is at its core operation as a relational database management system, it’s far from being the best candidate for building a free text search system on top of.

This is where the power of the StorageInterface can become apparent. Whilst we are just starting out, we can readily create an object that implements this interface, providing, say, a save() method, a find() method and a delete() method which do little more than build the appropriate queries to execute against a Mysqli instance maintained as a local property. Further down the line, we can provide a new composite to the constructors of Document instances. Still implementing the StorageInterface means that the individual child classes derived from the AbstractDocument can continue to enjoy the convenience of simply invoking save() or delete() as required. In contrast though, when the save() method is invoked on this new storage composite, we might get to enjoy the ability to:

• Update the corresponding record in MySQL
• Create or update a document record in MongoDB
• Add the document’s title and permalink details to a tag map being maintained in a key/value store such as Redis or Memcache.

All of these things could be achieved without necessitating a single change in the code of the AbstractDocument inheritance hierarchy, a highly desirable outcome that is readily achieved when we adhere to this particular principle. In direct contrast to this, type hinting for a very specific concrete class such as a Mysqli instance will inevitably lead to us polluting the code in our Document child classes with Mysqli specific method calls - a highly undesirable outcome since it will entail a major refactoring effort to be able to add in other types of storage at a later date.
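The storage composite described above might be sketched along these lines. This is a guess at the shape rather than a definitive implementation: the method signatures on StorageInterface are assumptions, InMemoryStorage is a hypothetical stand-in for the Mysqli, MongoDB or Redis backed implementations, and Memo is an invented child class. The namespace from the earlier listing is left out so that the example runs as a single file.

```php
<?php
// Hypothetical sketch: the method list on StorageInterface and the
// implementations below are assumptions made for illustration.

interface StorageInterface {
    public function save($id, array $data);
    public function find($id);
    public function delete($id);
}

// Stand-in for a Mysqli-, MongoDB- or Redis-backed implementation.
class InMemoryStorage implements StorageInterface {
    private $rows = array();

    public function save($id, array $data) {
        $this->rows[$id] = $data;
    }

    public function find($id) {
        return isset($this->rows[$id]) ? $this->rows[$id] : null;
    }

    public function delete($id) {
        unset($this->rows[$id]);
    }
}

// Fans writes out to every backend; reads come from the first one.
// Assumes it is always constructed with at least one backend.
class CompositeStorage implements StorageInterface {
    private $backends;

    public function __construct(array $backends) {
        $this->backends = $backends;
    }

    public function save($id, array $data) {
        foreach ($this->backends as $backend) {
            $backend->save($id, $data);
        }
    }

    public function find($id) {
        return $this->backends[0]->find($id);
    }

    public function delete($id) {
        foreach ($this->backends as $backend) {
            $backend->delete($id);
        }
    }
}

// The document hierarchy only ever knows about the interface.
abstract class AbstractDocument {
    private $storage;
    private $id;
    private $title;

    public function __construct(StorageInterface $storage, $id, $title) {
        $this->storage = $storage;
        $this->id      = $id;
        $this->title   = $title;
    }

    public function save() {
        $this->storage->save($this->id, array('title' => $this->title));
    }
}

class Memo extends AbstractDocument {
}

// Usage: one save() call reaches both stores, and Memo is none the wiser.
$primary = new InMemoryStorage();
$search  = new InMemoryStorage();
$memo    = new Memo(new CompositeStorage(array($primary, $search)), 42, 'Ghostbusters');
$memo->save();
```

Swapping the single store for the composite changed only the object handed to the constructor; nothing in AbstractDocument or Memo had to move.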
Summary

There’s very little to add to this. When we are comfortably and consistently recognising the characteristics and attributes of our abstract data types and badge-wearing interface implementers in preference to having to manoeuvre our code around concrete classes, we achieve the following things:

• Resilient code based on a consistent respect for type safety
• Reduced future maintenance requirements based on incoming change requests and therefore:
• More pub time.

Nevertheless, Lizzie would have us rewrite the order of preference as being “Ice-cream over interfaces over implementations”, not least because pistachio, chocolate and mint is apparently the dog’s doodahs.
And favour Composition too.

Lizzie may be getting a bit frustrated with my chapter titles. As a result, I may be forced to save my Ben & Jerry’s “Phpish food” flavour joke for another time. As a concession though, let’s just say that nothing beats pistachio, mint and chocolate flavoured ice-cream. And with that silliness out of the way, let’s proceed with another popular principle; the one that goes like this.

“Favour composition over inheritance”

This is one that we’ve encountered before, albeit rather briefly. Whilst we were, very wisely, banishing any thoughts of achieving code reuse through inheritance, we tripped over the idea of favouring composition over inheritance along the way. Naturally, this is the very spot where we will take a proper look at it, especially in conjunction with the design patterns that directly illustrate the concept.

This is good stuff. Good stuff that is rarely acknowledged, frequently ignored but no less important. The basic premise of this one is that we can achieve a much stronger, much more flexible object graph when we compose our finished objects from a variety of discrete, laser-focussed resources rather than attempting to build out complex inheritance hierarchies.

Yes, it does indeed mean that we have to consider the topic of inheritance again. I’m sorry about that. Looking on the bright side, since we’ve already done the topic almost to death, it does at least mean that we can get away with little more than a refresher. Remember if you will that the case against committing “Inheritance Abuse” has already been made, whether through inheriting from another class just to gain access
to a particularly desired method, or via creating a parent class to share that method. Or indeed through creating great, wobbly towers of “re-use through inheritance” and using method overrides to blank out the ones that we don’t want. The core issue here arises through the hard-coded dependencies that we naturally create via the extends keyword. As developers, we learn that hard coded dependencies between one class and another are a bad thing. They cause tight coupling to occur, which in turn leads to brittle code taking up residence within our applications. Not good. In the majority of cases, that is. The tight coupling that occurs between a parent class and any child classes that have been derived from it is of course a natural outcome of inheritance anyway. As long as we treat abstraction and inheritance in the correct manner, this will never be a problem for us. How so? In recognition of the fact that it’s the abstract parent that provides the data type definition that we are looking for to be present in all of the child instances. If we proceed to modify the behaviour of the parent, that those behavioural changes should subsequently and correspondingly become evident in all of the child classes is a desirable outcome. Unfortunately, we cannot say the same for an inheritance hierarchy based on code reuse. When the hierarchy has been constructed just for the benefit of gaining common access to useful methods, changes to those shared methods have to be assessed in terms of the corresponding effects on all of the derived subclasses. If ever there was a single most significant cause for the devastating loss of highly valuable pub time, it’s the inheritance hierarchy built for code re-use. Thankfully, we know better than to do that. 
Even when those popular MVC-like frameworks wish to sully the tender minds of our colleagues with their wishes to have them extend a base model class loaded with convenience methods, we can take measures to protect our application code from the perils that they represent. Think about this for a moment - if, in the process of building out our application, we ended up creating a hundred different model classes that are all based on a framework’s convenience-oriented base model the final outcome is that we have a
hundred model objects whose behaviour can change (read “break”) if the underlying base class changes. This isn’t such a far-fetched scenario. In the event that the framework’s base model class drops the save() method and adds an insertOrUpdate() method, we have a hundred broken model classes because of the dependency that exists between our models and the framework’s abstract base.

Now consider the effect of a strategy that would have us injecting “save-ability” into our model classes instead of building model classes on top of the “save-ability” present in an abstract base. If we were to wrap the framework’s base model with our own adapter class and then arrange to have that injected into new model instances, we protect our model layer against such devastating interface changes in that abstract base class. That’s not to say that our one hundred model classes won’t also be broken by such a change in the framework’s base model of course. The fix, however, will be a one-liner in our custom adapter. The alternative? Comb through our entire codebase looking for calls of the save() method to see whether they need to be changed or not.

Obviously, the framework that made such a monstrous change would be the framework that became exceedingly unpopular with its users and as such, is highly improbable in the real world. That being said, this scenario serves to make the point that we need at this stage. Inheritance, regardless of whether it’s employed in order to achieve code reuse or not, introduces fragility into our object designs quite simply because those hardcoded dependencies upon parent classes are completely unavoidable. A change to the parent class automatically means the effects of that change will ripple through the entire hierarchy to greater or lesser degrees.

Code reuse through inheritance is completely avoidable though. Don’t do it, ok? Just. Don’t. That’s why we are here. Now. At this point. To consider the fact that Composition is here to save the day.
Not only can we immediately dispense with any kind of inheritance-related fragility, we gain the ability to employ switchable behaviours at run-time. This is something that we’ll naturally explore when we get onto considering the design patterns that promote this idea of favouring composition and what it means for our application code.
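The adapter idea might be sketched roughly as follows. None of this comes from a real framework: FrameworkModel, PersistenceInterface, FrameworkPersistenceAdapter and Invoice are all hypothetical names made up for the illustration, and the framework class simply records what it was asked to write. The sketch picks up the story after the imagined upgrade that replaced save() with insertOrUpdate().

```php
<?php
// Hypothetical sketch: FrameworkModel stands in for a framework's base
// model; none of these class names come from a real framework.

// Imagine the framework renamed save() to insertOrUpdate() in an upgrade.
class FrameworkModel {
    public $written = array();

    public function insertOrUpdate($table, array $row) {
        $this->written[] = array($table, $row);

        return true;
    }
}

// Our own contract: the only persistence API that our models ever see.
interface PersistenceInterface {
    public function save($table, array $row);
}

// The adapter is the single place that knows the framework's method names.
class FrameworkPersistenceAdapter implements PersistenceInterface {
    private $frameworkModel;

    public function __construct(FrameworkModel $frameworkModel) {
        $this->frameworkModel = $frameworkModel;
    }

    public function save($table, array $row) {
        // The one line to change if the framework renames the method again.
        return $this->frameworkModel->insertOrUpdate($table, $row);
    }
}

// Composed, not inherited: "save-ability" is injected into the model.
class Invoice {
    private $persistence;
    private $total;

    public function __construct(PersistenceInterface $persistence, $total) {
        $this->persistence = $persistence;
        $this->total       = $total;
    }

    public function save() {
        return $this->persistence->save('invoices', array('total' => $this->total));
    }
}

// Usage: Invoice calls save() and never sees insertOrUpdate().
$framework = new FrameworkModel();
$invoice   = new Invoice(new FrameworkPersistenceAdapter($framework), 120);
$saved     = $invoice->save();
```

A hundred model classes built like Invoice would all survive the rename untouched, because only the adapter's save() body mentions the framework at all.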
First though, let’s just take a moment to get that composition idea written down on the page. Up above, we considered the possibility of wrapping a framework’s base model in a custom adapter and then injecting it into our model class. This is composition at work. The code that represents “storable behaviour” has been neatly carved away from our model classes, allowing us to treat the latter’s public interface as a blank canvas. How we proceed to fill in and flesh out that interface is entirely under our control. Which is exactly as it should be. Of course, it’s not just popular frameworks that this principle applies to. Imagine the state that we would be in if that were the case. Speaking of which.
The State Pattern As far as design patterns go, it’s the State Pattern itself that perhaps most overtly states (pun!) the case for favouring composition over inheritance. Like most patterns, this one is intended to direct the developer towards an attractive solution to a common problem. In this case, the common problem manifests itself in an object whose behaviour is expected to change in accordance with changes to one or more property values inside the object itself. In other words, we have an object whose behaviour reflects the object’s current status. Perhaps the most readily accessible example of this is the lifecycle of a customer order within an e-commerce application. I’ll grant you, it’s such a cliche! Nevertheless, it will serve our purposes to perfection. If you’ve ever shopped online, for physical goods at least, it’s entirely likely that you will already be aware that any such orders will have likely gone through a number of stages. For our examples here, we’ll limit these stages to just the four most likely candidates: Pending, Processing, Dispatched and Cancelled. Back in the days of yore when dragons still roamed English vales looking for damsels to distress and when I still had hair on my head, I might, maybe possibly, have written code to represent an order’s state that looks something like this.
    class Order {
        private $status;

        public function isDespatched() {
            return $this->status == "DISPATCHED";
        }

        public function canBeCancelled() {
            if ($this->status == "PENDING") {
                return true;
            } else if ($this->status == "PROCESSING") {
                return true;
            } else if ($this->status == "DISPATCHED") {
                return false;
            } else if ($this->status == "CANCELLED") {
                return false;
            }
        }
        // ...
    }
Of course, I’ve shortened this to just a couple of method styles that appropriately make the point. The first sign that the State Pattern is called for is when we start to see logic like this in our object methods. On the one hand, we’re reporting whether a property is of a particular value. On the other, we’re introducing unwieldy conditional structures that test a property’s value in order to decide how to report back to the caller. In other words, we’re checking the state of one or more object properties to determine how we should respond.

This represents a problem and it’s not one that can be addressed by adding constants to the class, although that is clearly called for anyway. When the business team comes rushing over to our desks to give us another possible state, we will have a real job on our hands in wedging that new state in there - one that will require careful handling to ensure the correct response is given for all methods of this type.

How do we solve this curmudgeonly mess? At first blush, an inheritance hierarchy
And favour Composition too.
228
seems to be the way to go, since that would provide us with the opportunity to create a child class for each order state to be represented.
abstract class OrderAbstract
{
    protected $status;

    abstract public function isDespatched();

    abstract public function canBeCancelled();
}
Again, I’m only referring to the two sample methods that we are interested in to illustrate the point. Naturally, an Order object would end up being much more involved than this. Nevertheless, here’s the corresponding child class made to represent a pending order.
class PendingOrder extends OrderAbstract
{
    public function isDespatched()
    {
        return false;
    }

    public function canBeCancelled()
    {
        return true;
    }
}
And another class just to illustrate another of the alternative implementations.
class DespatchedOrder extends OrderAbstract
{
    public function isDespatched()
    {
        return true;
    }

    public function canBeCancelled()
    {
        return false;
    }
}
This all seems to be a highly appropriate solution to the issue. The code involved is nice and clear and easy to read. When the business team comes rushing over to our desk the next time with a new order status to implement, couldn’t we just roll out another child class to represent that status? We could, but there are two key issues with this approach, ones which highlight the need to consider the code that we write in the context of a living, breathing, functioning application. In other words, on the page this looks fine and dandy. In the real world, we will be faced with problems.

How so? For starters, if we’re in the business of turning pending orders into despatched orders, we know that our orders are going to have to transition from one to the other. That means we’re guaranteed to have points in our order lifecycle where the status changes because if it doesn’t, we’re going to go bust. With an inheritance hierarchy to represent the order status, that means we would either have to re-hydrate a new order object with an updated status or set an incorrect status into our existing order object. Which should we choose? The inefficient route or the invalid route? The answer of course should be neither.

The other problem that will occur with an approach of this type arises from the fact that most of the details held within an order object should be cacheable. When we think in terms of cache-ability, we need to consider the anticipated lifetime of the data and our expectations of what will change when.
For an e-commerce system, we can reasonably assume that the customer’s billing address is very unlikely to change. The customer’s delivery address might change but it’s also quite unlikely. The list of items in the order? There’s a chance that items might be added or removed during the lifetime of this particular order, but in the majority of cases we wouldn’t expect a lot of change there. The order status, though, that’s guaranteed to change.

When you’re planning a cache for application data, the cacheable lifetime of that data is a key concern. For our e-commerce system, breaking out the things that are certain to change the most and being able to cache the rest will give us the greatest performance gains for minimal effort. When our order object’s classname is dictated by one of the properties that is guaranteed to change, we would be forced to hack around the issue to nullify that effect. Something that we really don’t need to do. Not when we have the State Pattern to hand, ready and willing to swoop in and save the day.

Since we’ve already identified the very thing inside our order object that is guaranteed to change, we can encapsulate those changeable behaviours into custom objects. Doing so will leave the rest of the order object in a very cacheable state indeed. Our first task then is to start creating the order states that we’ll need to represent the desired behaviour. Taking the previous examples, we will end up with something like the following.

abstract class OrderStateAbstract
{
    const ORDER_STATE = 'Undefined';

    protected $order;

    public function __construct(Order $order)
    {
        $this->order = $order;
    }

    public function getStateValue()
    {
        return static::ORDER_STATE;
    }

    abstract public function isDespatched();

    abstract public function canBeCancelled();
}
That’s our abstract class defined, with a single implemented method to return the status value that our child classes will provide.

class PendingOrderState extends OrderStateAbstract
{
    const ORDER_STATE = "PENDING";

    public function isDespatched()
    {
        return false;
    }

    public function canBeCancelled()
    {
        return true;
    }
}
And this represents our first order state object. As you can see, this is just as cleanly implemented as the individual order subclasses that we looked at previously. Note too that we’re taking advantage of those variable constants in an inheritance hierarchy to modify the ORDER_STATE value so that it appropriately represents the state that we want to embody. I’ll leave you to fill in the other state classes mentally. Moving on, we can look again at our principal object, the Order itself, and consider what changes are required to support this State Pattern implementation.
class Order
{
    private $status;
    private $orderState;

    public function setState(OrderStateAbstract $state)
    {
        $this->status = $state->getStateValue();
        $this->orderState = $state;
    }

    public function isDespatched()
    {
        return $this->orderState->isDespatched();
    }

    public function canBeCancelled()
    {
        return $this->orderState->canBeCancelled();
    }

    ...
}
The first thing to note with these changes is that we’re now modifying the Order object’s status property indirectly by injecting an appropriate order state instance and allowing that to govern the corresponding value. If we were previously providing a setStatus() method on our Order object’s interface, we can now safely remove it. The second thing to note is that all of the implemented status related methods now defer the responsibility for answering such questions to the injected state instance. The interface that the Order object exposes to its collaborators remains unchanged. Therefore, the code that used to work with the previous version will continue to work with the new version. No refactoring required. Bonus!

This last point does highlight a common issue with the State Pattern though. Since this is a form of delegate polymorphism, we have a natural requirement to implement the forwarders, the proxy methods, in order to make it work. That’s potentially a lot of method names in the composed behaviour that need to be duplicated in the composing class. The benefits outweigh the cost of this additional work though. With the rest of the order object having a low change expectancy, it’s supremely cacheable. When we pop the thing out of whichever caching mechanism we choose to employ, we just need to make sure it gets the appropriate order state instance. This is a huge plus for a busy application.

At the time of writing we’re on the verge of an exciting time with several projects attempting to crack the stateful PHP application nut. Instead of the boot->serve->shutdown lifecycle of a regular PHP script, great strides are being made in producing PHP application servers that are able to maintain context and state across multiple requests. We need to keep our eyes on the likes of Appserver.io and ReactPHP as these could very well negate the need for caching as we know it today.

http://appserver.io
http://reactphp.org/
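For anyone who would rather not fill in the remaining state classes mentally, here is one way the whole thing might hang together. Treat it as a sketch rather than the definitive version: the ProcessingOrderState, DispatchedOrderState and CancelledOrderState classes, the getStatus() accessor and the orderStateFor() helper are all names assumed purely for illustration. The helper stands in for whatever hands a freshly un-cached order its state instance.

```php
<?php
// Sketch only. Class names other than OrderStateAbstract, PendingOrderState
// and Order, plus getStatus() and orderStateFor(), are hypothetical.

abstract class OrderStateAbstract
{
    const ORDER_STATE = 'Undefined';

    protected $order;

    public function __construct(Order $order)
    {
        $this->order = $order;
    }

    public function getStateValue()
    {
        // Late static binding picks up the child class's constant.
        return static::ORDER_STATE;
    }

    abstract public function isDespatched();
    abstract public function canBeCancelled();
}

class PendingOrderState extends OrderStateAbstract
{
    const ORDER_STATE = 'PENDING';
    public function isDespatched() { return false; }
    public function canBeCancelled() { return true; }
}

class ProcessingOrderState extends OrderStateAbstract
{
    const ORDER_STATE = 'PROCESSING';
    public function isDespatched() { return false; }
    public function canBeCancelled() { return true; }
}

class DispatchedOrderState extends OrderStateAbstract
{
    const ORDER_STATE = 'DISPATCHED';
    public function isDespatched() { return true; }
    public function canBeCancelled() { return false; }
}

class CancelledOrderState extends OrderStateAbstract
{
    const ORDER_STATE = 'CANCELLED';
    public function isDespatched() { return false; }
    public function canBeCancelled() { return false; }
}

class Order
{
    private $status;
    private $orderState;

    public function setState(OrderStateAbstract $state)
    {
        $this->status = $state->getStateValue();
        $this->orderState = $state;
    }

    // Hypothetical accessor, added so the example can show the stored value.
    public function getStatus() { return $this->status; }

    public function isDespatched() { return $this->orderState->isDespatched(); }
    public function canBeCancelled() { return $this->orderState->canBeCancelled(); }
}

// Hypothetical helper: turns a stored status string back into a state object,
// for example after the rest of the order has come out of a cache.
function orderStateFor(Order $order, $status)
{
    $map = [
        'PENDING'    => 'PendingOrderState',
        'PROCESSING' => 'ProcessingOrderState',
        'DISPATCHED' => 'DispatchedOrderState',
        'CANCELLED'  => 'CancelledOrderState',
    ];

    if (!isset($map[$status])) {
        throw new InvalidArgumentException("Unknown order status: $status");
    }

    $class = $map[$status];
    return new $class($order);
}

$order = new Order();
$order->setState(new PendingOrderState($order));
var_dump($order->canBeCancelled()); // bool(true)

// The same Order instance moves on: no re-hydration and no invalid status.
$order->setState(orderStateFor($order, 'DISPATCHED'));
var_dump($order->isDespatched());   // bool(true)
var_dump($order->canBeCancelled()); // bool(false)
```

The point of interest is the last few lines. The order object itself never changes class; only the injected state does, which is exactly what keeps the rest of the order cacheable.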
The Strategy Pattern

Closely related to the State Pattern is this one, the Strategy Pattern. In some cases, they’re so closely related that it’s difficult to tell them apart. Imagine for a moment that Lizzie had an identical twin, Lottie. They look exactly the same. They dress exactly alike. The only way to tell them apart is when they’re eating ice-cream since Lizzie will be the one fiercely guarding the tub of pistachio, chocolate and mint. Lottie on the other hand is rather partial to Ben & Jerry’s “Phpish Food”. Ouch. Sorry.

Fortunately, the State and Strategy patterns aren’t quite as hard to tell apart as that. Again, we’re going to be injecting some encapsulated behaviour but this time, that behaviour will represent a custom algorithm that will operate on things, usually data, external to itself. This is a key part of the distinction between the State and the Strategy patterns. Whereas the State pattern provides for interchangeable behaviour based on some notion of the internal state of the implementer, the Strategy pattern provides for interchangeable behaviour based on some notion of how we want things to be processed by those behaviours.
Clearly we’re talking about doers here, in which case we’re going to need an interface. One that guarantees the availability of the method or methods that will be our hooks into the desired behaviour. This is rather convenient, given that the Strategy pattern is informally specified as being thus:

Define a family of related algorithms, encapsulate each one as its own object and then unify them all with a common interface and thus ensure that they are interchangeable.

There’s an interesting point arising from that paragraph above. Defining a family of related objects unified by a common interface makes it sound rather like they might be candidates for an abstract class and a set of related child subclasses. In certain circumstances, this may indeed be appropriate but only where you can determine that each encapsulated strategy will be managing its own object state. For the majority of use cases though, the state that each strategy will be operating upon will reside either inside the consuming object that the strategy has been injected into or otherwise, on data or objects that are entirely separate to the strategy and its consumer.

Before we get too bogged down on the theoretical, let’s get an example cranked out to make the ride a little smoother.
class ReportData
{
    private $data;

    ...

    public function getOutput($format = 'csv')
    {
        switch ($format) {
            case 'csv':
                $lines = [];

                foreach ($this->data as $row) {
                    $lines[] = '"' . implode('","', $row) . '"';
                }

                return implode("\n", $lines);
                break;
            case 'array':
                return $this->data;
                break;
        }
    }
}
No great shakes here - a class that defines an object that in turn appears to hold onto report data. Evidently this object has been emitted from a request passed into the model layer and again, we seem to have a conditional structure that determines the format that the report data will be retrieved in. It doesn’t matter whether the conditional structure is a stack of if/elseif clauses as we saw in the State Pattern earlier, or a switch statement as we have here. If we find ourselves selecting certain blocks of code to run on the basis of a conditional statement, we are almost certainly going to benefit from abstracting those code blocks out into separate objects representing injectable behaviours. Nevertheless, our example here illustrates how the report data may be returned either as a (poorly constructed) CSV string or as the raw data array. This despite the fact that the business team have been haranguing us for html tables, PDFs and JSON strings for the API. Are we going to add three more cases to the switch statement? No, of course we are not. Nor are we going to create a family of ReportData classes linked by an abstract parent class for precisely the same reasons that we saw with the Order example in the previous section. If this object represents report data it’s reasonable to assume that there was at least some cost in generating the result. Therefore, there are likely significant savings to be made from caching this data, even for a short time. Why? The business team are asking that the data be made available as HTML tables and as PDF/CSV. No matter how strangely their brains seem to operate sometimes, we can at least presume that their desires include displaying that data in the browser and then also providing links to downloadable files. 
These output format requirements are perfect candidates for being represented as injectable behaviours, with each desired output format being represented by a single, properly encapsulated algorithm for transforming the raw report data appropriately.
With this in mind, let’s first take a look at how we might modify our ReportData class.

class ReportData
{
    private $data;
    private $formatter;

    public function setFormatter(ReportFormatStrategy $formatter)
    {
        $this->formatter = $formatter;
    }

    public function getOutput()
    {
        if (isset($this->formatter)) {
            return $this->formatter->formatData($this->data);
        }

        return $this->data;
    }
}
What have we done here? First of all, we’ve vastly simplified the getOutput() method to just two simple choices that will fit every possible request for the data: it’s either a request for the raw data array or it’s a request for that data array to be formatted before it’s returned. The second thing to note is that our ReportData object now accepts a formatter via the setFormatter() method. Quite specifically, this method type hints for an object that’s sporting the ReportFormatStrategy interface. When providing the interface for a Strategy pattern implementation, it’s often very useful to suffix the interface name with the word ‘Strategy’. This makes the interface’s intent perfectly clear.
As a result, we now have a ReportData class that will be much more succinct, yet able to output its data in any format that the business team can dream up both now and in the future and without requiring us to modify the ReportData class any further.
We need to take a quick peek at the remaining code to see how this is achieved. First of all, the unifying interface will look like this:
interface ReportFormatStrategy
{
    public function formatData(array $data);
}
Well, that couldn’t be much simpler. What of the individual strategies?
class JsonFormatStrategy implements ReportFormatStrategy
{
    public function formatData(array $data)
    {
        return json_encode($data);
    }
}
Life is always so much simpler with JSON. What of the other formats?
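class CsvFormatStrategy implements ReportFormatStrategy
{
    public function formatData(array $data)
    {
        // Build the CSV in a temporary stream rather than on the file system.
        $csv = fopen('php://temp', 'r+');

        foreach ($data as $row) {
            fputcsv($csv, $row);
        }

        // Rewind and read everything back. This also copes with an empty
        // data set, where fread() with a zero length would fail.
        rewind($csv);
        $csvData = stream_get_contents($csv);
        fclose($csv);

        return $csvData;
    }
}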
The CsvFormatStrategy is now neatly encapsulated within its own class and we’ve also taken the opportunity to refactor the algorithm to do a much cleaner, more elegant job of the required task. We might be preaching to the choir here, but this is a much more elegant approach.

With just two output formatting strategies, it should be super clear that we have some eminently reusable code here, allowing us to turn any array into either a JSON structure for sending as the response body to an API call, or a CSV file that can be emitted to the browser without needing to go anywhere near the file system. This is a key feature of the Strategy Pattern itself. Since these discrete units of behaviour are perfectly self contained, that is to say that they have no knowledge of the objects that consume them, they can be reused extensively. If you think back briefly to the chapter on Inheritance, you may recall how we considered the possibility of moving a terribly useful method upwards into a parent class so that it could be shared? Now at least, it should be patently obvious how employing composition is so much more powerful a mechanism for code reuse than trying to fudge things with inheritance.

With the State Pattern and the Strategy Pattern we have seen two ways in which we can inject switchable behaviours into an object to modify the way that such objects act at run time. These two powerful techniques let us change the way in which our principal objects interact within our system without having to modify the objects themselves other than to make them receptive to the behaviours that we wish to inject into them. These two patterns are more concrete examples of what is known as the Delegation Pattern in which an object, instead of performing a particular task itself, delegates that task to a helper object that it consumes. It is precisely in this way that we can support the kind of delegate polymorphism that we encountered back in Part One.
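To see the Strategy pieces working together from the client’s point of view, here is a small sketch. It makes a couple of assumptions that are not in the original code: the ReportData constructor that accepts the data array, and the HtmlTableFormatStrategy class, which is a hypothetical third strategy added purely to show that a new format needs no change to ReportData.

```php
<?php
// Sketch only. The ReportData constructor and HtmlTableFormatStrategy are
// assumptions made for illustration.

interface ReportFormatStrategy
{
    public function formatData(array $data);
}

class JsonFormatStrategy implements ReportFormatStrategy
{
    public function formatData(array $data)
    {
        return json_encode($data);
    }
}

// Hypothetical extra strategy: one more format, zero changes to ReportData.
class HtmlTableFormatStrategy implements ReportFormatStrategy
{
    public function formatData(array $data)
    {
        $rows = '';

        foreach ($data as $row) {
            $cells = '';
            foreach ($row as $value) {
                $cells .= '<td>' . htmlspecialchars((string) $value) . '</td>';
            }
            $rows .= '<tr>' . $cells . '</tr>';
        }

        return '<table>' . $rows . '</table>';
    }
}

class ReportData
{
    private $data;
    private $formatter;

    // Assumed constructor; the original elides how the data gets in.
    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function setFormatter(ReportFormatStrategy $formatter)
    {
        $this->formatter = $formatter;
    }

    public function getOutput()
    {
        if (isset($this->formatter)) {
            return $this->formatter->formatData($this->data);
        }

        return $this->data;
    }
}

$report = new ReportData([['id', 'name'], [1, 'Lizzie']]);

// Same report, same data, different injected behaviour.
$report->setFormatter(new JsonFormatStrategy());
echo $report->getOutput(), "\n";
// [["id","name"],[1,"Lizzie"]]

$report->setFormatter(new HtmlTableFormatStrategy());
echo $report->getOutput(), "\n";
// <table><tr><td>id</td><td>name</td></tr><tr><td>1</td><td>Lizzie</td></tr></table>
```

Because neither strategy knows anything about ReportData, either one could just as easily be handed any other array from elsewhere in the application.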
We don’t always have to inject behaviours into our principal objects in order to favour composition over inheritance though. We can flip the injection concept on its head and inject the principal objects into wrappers to modify their behaviour instead.
The Decorator Pattern

Being able to change the behaviour of an object at run time isn’t merely restricted to the State and Strategy patterns. The third pattern for us to consider in our quest to favour composition takes the opposite approach by injecting that object into a wrapper in order to change its behaviour. How? To answer this question, we should start with the pattern’s definition itself. It goes a bit like this:

The Decorator Pattern allows for either the static or dynamic wrapping of objects in order to modify their existing responsibilities or properties.

I say “a bit like this” because it is possible to find a number of somewhat subtle but confusing variations on this particular theme. There are a few takeaways we can get down first though to help with our understanding of this particular beastie.

Firstly, we will be creating a class that will wrap our principal object of concern in order to provide modifications to its regular mode of operation. That means we’ll need to be able to inject our principal into the wrapper. Since our wrapper is going to depend on the injected principal, that should occur on the constructor.

Our second takeaway is that if we are modifying the way that the principal behaves, our decorator must also provide an interface that is identical to our wrapped object. This is where things can get confusing since the UML diagram for this particular pattern specifies that the decorators should subclass the principal and thereby use inheritance to acquire the parent class’ interface.

We’ll take a look at this technique first with the rather stereotypical pizza example, just so that we have the original definition down pat. After this though, we’ll look at an alternative interpretation of the pattern and consider the pros and cons of each. Here’s your pizza.
abstract class PizzaBase
{
    private $ingredients = [];

    public function addIngredient($ingredient)
    {
        $this->ingredients[] = $ingredient;
    }

    public function getIngredients()
    {
        return implode(', ', $this->ingredients);
    }
}

class MargheritaPizza extends PizzaBase
{
    public function __construct()
    {
        $this->addIngredient('cheese');
        $this->addIngredient('tomato');
    }
}

$pizza = new MargheritaPizza();

echo $pizza->getIngredients(); // output: cheese, tomato
Going by the UML definition of this pattern, we should now subclass this pizza base in order to start providing decorators. This means that first of all, we need a base decorator, so let’s do that now.
abstract class PizzaDecoratorBase extends PizzaBase
{
    // Protected rather than private so that concrete decorators can reach it.
    protected $pizza;

    public function __construct(PizzaBase $pizza)
    {
        $this->pizza = $pizza;
    }
}
Note how awkward this is starting to feel already. We’ve defined an abstract child class from an abstract parent class. Granted, our abstract decorator base class now possesses the required interface, which is precisely what we need to be doing, but creating an abstract parent-to-be as a child class of an abstract class shouldn’t gel with our PHP Brilliance oriented minds. If we are supposed to be considering abstract classes as defining the blueprint for a new datatype, we ought to be feeling a little uncomfortable about defining one new datatype from the blueprint for another. We will fix that shortly, but for now please do note how the base decorator type hints for its own parent class on its constructor. This is a key feature of the decorator pattern: the decorator and the object that it composes both present the same interface to the world. Now we need to create a few concrete decorators to modify our rather basic and slightly underwhelming Margherita offering.
class HamDecorator extends PizzaDecoratorBase
{
    public function getIngredients()
    {
        return $this->pizza->getIngredients() . ', ham';
    }
}

$pizza = new MargheritaPizza();
$decoratedPizza = new HamDecorator($pizza);

echo $decoratedPizza->getIngredients(); // output: cheese, tomato, ham
So far, everything is looking good. We’ve wrapped the original MargheritaPizza instance in a new instance of the HamDecorator and then called the getIngredients() method in order to see if the list has changed. Lo and behold, it has, since the decorator applies its own “transformation” to the return value of the principal’s own getIngredients() method. Note how those last lines of example code can be rewritten to look like the following:

$pizza = new HamDecorator(new MargheritaPizza());

echo $pizza->getIngredients();
The result is the same; we are still passing a new MargheritaPizza instance to the HamDecorator’s constructor. What this will help us to do though is illustrate another characteristic of the Decorator Pattern: Decorators must allow for recursive composition. This particular characteristic is the thing that distinguishes decorators from regular adapter implementations. Where adapters will generally wrap an object just once in order to modify and/or mediate method calls between client code and the object being “adapted”, decorators provide for unlimited, recursive composition in order to modify the outcome of a method invocation an unlimited number of times. We can see this in action with the OliveDecorator below.
class OliveDecorator extends PizzaDecoratorBase
{
    public function getIngredients()
    {
        return $this->pizza->getIngredients() . ', olives';
    }
}

$pizza = new OliveDecorator(
    new HamDecorator(
        new MargheritaPizza()
    )
);

echo $pizza->getIngredients(); // output: cheese, tomato, ham, olives
Consequently, we have within our grasp the means to build a pizza that is three storeys tall and takes years to eat, given that each decorator that we add to the chain of pizza construction will supplement the existing list of ingredients with its own additions. This whole process still relies on a slightly awkward inheritance hierarchy though. Note that in our last example, the OliveDecorator class extends the PizzaDecoratorBase class, which in turn extends the PizzaBase class.

The reasoning behind the inheritance approach is clear. If each decorator must support the same interface as the object being decorated, then putting the decorators into the same inheritance hierarchy ought to satisfy this requirement. The end result being that a ham-and-olive decorated pizza will satisfy a type hint for a PizzaBase instance in the same way that a plain old MargheritaPizza instance will. As far as the client code is concerned, a decorated pizza will behave in exactly the same way as an undecorated one in terms of properties and method availability precisely because both types ultimately share the same common ancestor.

This is unacceptable though. Unacceptable to developers with PHP Brilliance on their minds. Why? Because the abstract PizzaBase is there to define a new data type, a pizza. Constructing an instance of the MargheritaPizza is simply a case of using the new keyword. A different set of rules comes into play for constructing instances from the “decorator” branch of the family tree.

On top of this we have the hard-coded dependencies cropping up again. Each decorator class has a hard-coded dependency on the abstract decorator, which is as it should be of course, but the abstract decorator itself has a hard-coded dependency on the abstract PizzaBase class. Consequently, any changes to the ultimate parent in this familial hierarchy will ripple through not just the genuine child classes of the PizzaBase abstract, but also through the hierarchy of decorators too.
This might be desirable of course, but then again it might not. With adverse changes, the threat to our precious pub time is not merely doubled through having two abstract bases to inherit from. It’s actually squared. There’s a second, much more relevant issue at play here. Can we safely assume we’re building an application for a fast food place here? I think so. In which case, the business team may very well be on the cusp of realising that there’s more to life than pizza.
I realise that that very statement may be hard to fathom but it’s true. Some people eat salads! Some of those salad eaters actually like olives too. I’m sure you can see where this is going. The OliveDecorator that we created earlier isn’t exactly reusable in its current form given that it requires a PizzaBase instance on its constructor. Do we refactor? Do we create an AbstractFood as the new ultimate parent in our increasingly wobbly hierarchy of comestibles? Doing so would surely allow us to “decorate” any dish with extra olives. The problem that we’re faced with here though is that we would end up with a truly unwieldy inheritance hierarchy, one that we simply couldn’t reasonably expect any of our teammates to retain in their heads. End result? A distinct absence of pub time, sacrificed in the name of late night bug fixes.

Fortunately, this is all eminently fixable. Fixable by throwing out the original definition of the Decorator Pattern and reforging it with something a little more modern. One that is distilled from the original but without the baggage that the inheritance requirement trawls along behind it. If a decorator is required to present the same interface as the object that it decorates, we need only provide an interface that covers the required behaviour. In other words, ditch the inheritance hierarchy and employ a common interface in order to guarantee the required behaviour.

To illustrate how this works, I could go back over the fast food application and redo the examples in an interface oriented manner but I actually want to cover this in a much more appropriate fashion. And by more appropriate, I mean in a way that more succinctly demonstrates why the inheritance approach is flawed. Naturally then, we’ll start with a class definition for a Warrior.
class Warrior
{
    private $health = 100;

    public function takeDamage($damage = 0)
    {
        $this->health -= (int) $damage;

        if ($this->health < 1) {
            throw new Exception("Gosh darn it, I'm dead");
        }
    }

    ...
}
As per usual, we’re missing the bulk of the class that makes up our Warrior instance and only focussing on the relevant features: in this case, a method that allows our warrior character to take damage on the battlefield and a running tally of our character’s health. Of course, we’re here to decorate our warrior before sending him or her into battle not least because we want our side to be victorious over the enemy. Decorate how? With armour! First of all though, we need that all important interface. Based on our warrior class above, this will be an exceptionally simple one.
interface Damageable
{
    public function takeDamage($damage = 0);

    public function getHealth();
}
Next up, we need a confession. A confession that we’re not completely dispensing with inheritance since we’re still going to create a family of related decorators and we still need to honour the recursive composition capability that is an essential characteristic of such decorators. Let’s do that now.
abstract class ArmourDecorator implements Damageable
{
    protected $health = 0;
    protected $subject;

    public function __construct(Damageable $subject)
    {
        $this->subject = $subject;
    }

    public function takeDamage($damage = 0)
    {
        $this->health -= (int) $damage;

        if ($this->health < 1) {
            $this->subject->takeDamage(abs($this->health));
            $this->health = 0; // reset to zero required
        }
    }

    public function getHealth()
    {
        return $this->health + $this->subject->getHealth();
    }
}
As an abstract data type definition this one does everything for us. The constructor type hints for the required interface to allow us to achieve the recursive composition necessary for developing a family of decorators. Further to this, we have the implementations for both of the methods that the Damageable interface demands. This leaves us with some exceedingly light work to do in order to create our concrete decorators. First though, we need to put that decorate-able interface into the Warrior class itself and add the getHealth() method to satisfy the interface’s requirements.
class Warrior implements Damageable
{
    public function getHealth()
    {
        // The warrior wraps nothing, so it reports only its own health.
        return $this->health;
    }

    ...
}
Finally, we are in a position to start decorating our Warrior instance in an appropriate manner.
class BoiledLeatherArmour extends ArmourDecorator
{
    protected $health = 150;
}

class ChainMailHauberk extends ArmourDecorator
{
    protected $health = 200;
}

class OakShield extends ArmourDecorator
{
    protected $health = 75;
}

$decoratedWarrior = new OakShield(
    new ChainMailHauberk(
        new BoiledLeatherArmour(
            new Warrior()
        )
    )
);

echo $decoratedWarrior->getHealth(); // output: 525
Our end result? A Warrior instance “wrapped” in boiled leather, chain mail and an oak shield. Once we send this character out into battle each layer of armour takes as much damage as it can absorb, starting with the OakShield as the outer layer and working inwards.

Is our work complete then? Not quite. There’s a very clear and obvious problem here but before we get on to working our way through it, we should take a moment to consider what we have achieved. By divorcing the family of decorators from the principal subject to be decorated, we have successfully dispensed with the rather odd notion that an oak shield or a chainmail shirt is-a type of Warrior in much the same way that it seems odd to suggest that an olive topping is-a type of pizza. To put it another way, we have set our decorators free and improved their reusability.

The problem that we’ve introduced here though is a pretty obvious one. If we were to send our Warrior instance out onto the battlefield it wouldn’t be able to do anything except take damage and report its current, composite health status. Pretty rubbish as far as warriors go then. The reason for this is down to the fact that we created the interface purely for the methods of interest, of which there are just two. By decorating the warrior with armour we have effectively shut off access to the warrior’s other methods. Methods which presumably included providing the ability to fight back and hopefully defeat our enemies.

How should we fix this situation? We could proceed to extend the required interface so that it also included all of the Warrior class’ combat related methods but that would be messy. We would end up with a ChainMailHauberk that somehow knew how to fight, which is clearly wrong. Instead, we would need to approach the construction of a Warrior instance with a completely different strategy. There’s that word again. Strategy. Since our armour decorators are all defence related, our best approach is to employ the Strategy Pattern in order to create core defence strategies, say StandAndFight, HideAndSnipe or RunAway, which we can then proceed to decorate with the appropriate armour as desired.
The same approach could then also be applied to the notion of creating a fight strategy, with their corresponding decorators; Flaming perhaps? Or PoisonTipped? Both of these options could equally be applied to instances of Sword, Dagger or Arrow.
If the health of our warrior, its armour and its weaponry is also represented by the State Pattern, an eminently appropriate candidate, then we clearly have quite a complex construction process to consider when building out our Warrior instance. Fortunately, that’s the topic of the next chapter, Instantiaphobia.
Summary

In a chapter dedicated to helping us divest ourselves of the notion that inheritance is useful, we’ve taken in three common design patterns that neatly prove the case for abandoning inheritance as a means of creating reusable code. Unless you truly believe that an olive topping is fit for being considered a pizza in its own right, that is.

With the three design patterns that we’ve considered within this chapter, the core entities within our application will be able to support any desired modifications to their behaviour without once requiring a change to the core entity’s actual code. Each one of these patterns provides an insanely powerful way for us to remain compliant with the Open/Closed Principle, the second of the SOLID canon, to which Part Four is dedicated. We’re just a few chapters away from discovering how.
Tell, Don’t Ask

Of all the principles cruising the dark waters of the software development ocean, it’s this one, the Tell, Don’t Ask principle, that is perhaps the one least often seen amongst those wild currents. Indeed, if the principles that go into making up the SOLID canon are the luxuriously appointed Caribbean cruise liners, then this particular one might be better considered an Ohio-class guided missile submarine, gliding silently through dark waters unseen and unheard. For the most part out of sight and out of mind. Rarely does it surface, but when it does, you get to appreciate the power in those sleek lines, even though it doesn’t quite manage to carry the full complement of one hundred and fifty-four Tomahawk missiles.

Nevertheless, and despite the fact that this particular principle is so rarely spotted out there in the wilds, it’s one of the more powerful ones that the developer with burgeoning PHP Brilliance can add to their mental toolbox. With all of this hype and bluster, you might be tempted to think that it’s a principle that comes loaded with mysteriously arcane and forbidden knowledge. That it is somehow imbued with otherworldly powers; an inter-dimensional plane-shifting beast to be tamed with runes and incantations cast safely from within the confines of a circle of salt poured upon the floor.

With all due credit to those super smart people over at The Pragmatic Programmer, the reality though is nothing like this at all. Indeed, this rather aptly named principle is, in reality, super simple. Your code should be looking to tell objects to do things rather than asking objects for an element of their state and making a decision based on the values returned.

With this in mind, let’s start with a quotation from Alec Sharp, who, in his book “Smalltalk by Example”, introduces the idea rather succinctly with just a few words.

Procedural code gets information then makes decisions. Object-oriented code tells objects to do things. – Alec Sharp
The second sentence within that quotation bears the crux of the matter as far as this principle is concerned. Many a time-served web application developer will have seen the action methods of controller classes access the properties of model objects in order to either make a logic decision based upon the value of those properties or to actually perform some sort of operation on that object property directly. To illustrate this malfeasance, let’s consider the following code snippet.

    class OrderController {
        public function addOrderAction() {
            ...
            $userBalance = $userAccount->getBalance();

            if ($userBalance < $orderTotal) {
                throw new InsufficientFundsException();
            }

            $newBalance = $userBalance - $orderTotal;
            $userAccount->setBalance($newBalance);
            ...
        }
    }
As usual, I’m relying on you somewhat to fill in the initialisation gaps mentally so that the code example doesn’t become too cluttered. The real focus of this example comes in two parts. The first is a logic decision (or rather, a validation test) to determine whether the user’s balance is sufficient to meet the cost of the order that is apparently being placed.

    $userBalance = $userAccount->getBalance();

    if ($userBalance < $orderTotal) {
        throw new InsufficientFundsException();
    }
Here, we have extracted the balance value from the $userAccount object and then compared that value to the $orderTotal variable. In the event that the user’s balance is less than the order total, an InsufficientFundsException is thrown.
The second consideration within this example is the clear and direct manipulation of an object’s properties outside of the object itself.

    $newBalance = $userBalance - $orderTotal;
    $userAccount->setBalance($newBalance);
In this particular case, we are setting up a new variable called $newBalance as the result of subtracting $orderTotal from $userBalance, which we then proceed to pass back to the $userAccount object as the new balance value that it is meant to maintain. It only takes a moment of careful consideration to realise that this is a terribly flawed approach to programming. In both cases, we’re accessing a property value that really ought to remain private and personal to the owning object instance. In both cases, we’re flying in the face of what we ought to be understanding the notion of encapsulation to be. The first case in our example is pretty bad but the second case is unforgivable. Even so, this is such an extraordinarily frequent occurrence in PHP code that you might be tempted to imagine that it’s actually the right way to do things. Take a moment to consider controller methods that you yourself have written in the past; how many times have you asked a model layer for data elements and then either massaged them into a structure for a view file to use or even made some sort of logic decision based on the values retrieved? I have certainly done this very thing hundreds of times in the past. If we are meant to be telling our objects to do things, instead of asking them for details and performing the operations ourselves, then we should no longer see such violations in our controllers. Why would creating such violations be a bad thing? The answer to that lies with the fact that the code that we write into an enterprise application is rarely ever complete, finished and done. Those pesky little minions on the business team will most assuredly see to that. Consequently, whenever we expose the finer details of an object’s state, as indeed we do twice in the example above, we are also tightly coupling that consuming code to those same details of state.
In other words, we’re leaving ourselves vulnerable to bug-like scenarios, which will certainly crop up should we modify the innards of the UserAccount class at some point in the future. Consider the implications that would become apparent should the business team decide that we should be maintaining the user’s account balance as a penny value, rather than as pounds or dollars. Would you rather modify just a single object or, alternatively, have to search the entire codebase for all of the locations where your code is extracting and working with the balance value itself? The correct answer is of course that you would rather only have to modify the innards of a single class.

With this in mind, and before we get much further, let’s fix up our earlier example code so that the two violation cases are dealt with appropriately. Doing so will allow us to compare the broken code with the corrected version. Such a fix might look like this:

    class OrderController {
        public function addOrderAction() {
            ...
            try {
                $userAccount->payForOrder($order);
            } catch (InsufficientFundsException $e) {
                // handle exception
            }
            ...
        }
    }
What has happened here? Well, first of all, we’ve eliminated that piece of validation code from the addOrderAction() method. It never really belonged in there in the first place. Why not? For the simple reason that it’s not a controller’s job to have intimate knowledge of a model object’s state, in this case, the actual balance value that this UserAccount instance is responsible for maintaining. Only the model object itself, the UserAccount instance, should know how it works internally. The second case, that of actually modifying the instance’s balance value has also been removed from the controller.
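For readers who would like something concrete to hold on to, here is one way the receiving end of that call might be sketched. Everything inside it is an assumption made purely for illustration: the balance being stored as an integer number of pence, the hypothetical Order class and its getTotalInPence() method, and the empty InsufficientFundsException body are not taken from the original example.

```php
<?php

// Hypothetical exception type: the name appears in the text, the body is assumed.
class InsufficientFundsException extends Exception {}

// Hypothetical Order, reduced to the single detail that UserAccount needs.
class Order {
    private $totalInPence;

    public function __construct($totalInPence) {
        $this->totalInPence = $totalInPence;
    }

    public function getTotalInPence() {
        return $this->totalInPence;
    }
}

class UserAccount {
    // Assumed internal representation: an integer number of pence.
    private $balanceInPence;

    public function __construct($balanceInPence) {
        $this->balanceInPence = $balanceInPence;
    }

    // The validation test and the state change both live with the data
    // that they act upon, so no caller ever sees the balance itself.
    public function payForOrder(Order $order) {
        $total = $order->getTotalInPence();

        if ($this->balanceInPence < $total) {
            throw new InsufficientFundsException();
        }

        $this->balanceInPence -= $total;
    }
}

$userAccount = new UserAccount(1000);
$order = new Order(600);

$userAccount->payForOrder($order); // succeeds, leaving 400 pence

try {
    $userAccount->payForOrder($order); // 400 pence cannot cover 600
} catch (InsufficientFundsException $e) {
    echo "Insufficient funds\n";
}
```

Should the business team later insist on a different way of holding the balance, only the inside of this one class would need to change. The controller would carry on telling the account to pay, none the wiser.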
Keep this in mind for all of time: action methods have no business modifying model object properties directly. Action methods exist in order to cause things to happen, not to do the work themselves. Look again at the fix. We’ve replaced the balance validation lines and the balance modification lines with just a single method call:

    $userAccount->payForOrder($order);
We have also indicated that this method call may result in an exception being thrown through the provision of a try/catch block. Even without seeing the code behind that new method in the UserAccount class, we can deduce what is supposed to happen, and indeed, what sort of error might most commonly occur here. The fix addresses the precise point where we violate the Tell, Don’t Ask principle; the act of extracting object properties in order to work on or with them outside of the owning object instance.

Even though these sorts of coding crimes are perpetrated most commonly within the action methods of an MVC-like application’s controllers, you can certainly find evidence of such heinous activity occurring throughout an application’s codebase, MVC-like or not. Here’s another example of a very common violation.

    if ($user->isAdmin()) {
        echo "Hello Admin person. ";
        echo "Have fun administrating.";
    } else {
        echo "Hello user person. ";
        echo "Have a nice day.";
    }
How often have you seen something like this? Personally, I’ve committed this particular code crime countless times in the past. But not any more! In this last example, we’re not even retrieving a property value from the $user instance, but we are still making a logic decision based on its value. What would
happen to this rather binary construct if the business team wanders over to our desks and says this to us? “You know, we now need more than just users and administrators, we need a member role, a moderator role, an editor role and a publisher role”. Were we to stick with the original, but erroneous approach given above, we would need to add several else if () cases to the original if else. Or worse, replace the if else with a switch() statement. However, if we had the foresight to apply the Tell, Don’t Ask principle correctly in the first place, our code might look like this:

    echo $user->getWelcomeMessage();
Now the business team can spend all of its time dreaming up fancy new roles for our users and we won’t have to care. At least, not about which bits of view code we might have to modify in order to greet each user type properly. The logic decision based on the value of an object property has been removed from the places where it simply doesn’t belong. “Ah but”, I hear you say, “surely that’s just moving the if else into the user object, is it not? It’s not eradicating the problem entirely.” To which will come the rather non-committal reply of “no, not necessarily”. Without wishing to go too far off-topic, let’s take a moment to look into the matter. Even if we were to move the if else construct into the User class directly, we would see a marginal improvement to our code quality inasmuch as that logical decision would now be restricted to a single location in the code base. Remember, we do want to keep our codebase DRY and not have repetitions of code strewn across several locations. However, since we’re looking to build enterprise-grade applications with the natural flair that Team Brilliance exhibits, we wouldn’t be using that if else construct at all in this case. Two possibilities for avoiding the if statement come immediately to mind. The first possibility would be to create an inheritance hierarchy surrounding the User class itself, and thus entailing the creation of a BaseUser abstract from which we
would create the more specialised child classes such as Administrator, Member and Editor classes. Each of these child classes would have their own implementations of the getWelcomeMessage() method coded up accordingly. This might be fine if our requirements for how our user roles are to be implemented are exceedingly simplistic. The reality, of course, is that there is no such thing as “simplistic” when it comes to dealing with the business team. Which leads us to the second, and more likely, approach to coding up user roles. We create an inheritance hierarchy of role classes, with a BaseRole as the abstract parent and more specialised child classes such as EditorRole, PublisherRole and AdministratorRole. These role instances are then created and injected into our User instances whenever the User instances are themselves instantiated. The end result? In this case, the process of displaying a welcome message doesn’t utilise a single if() statement in order to display the desired message. The User instances are appropriately equipped with role instances that allow them to respond to the getWelcomeMessage() method call correctly and without any decision making to get in the way. Let’s get ourselves back on track though. If the key element of Alec Sharp’s quotation is “Object oriented code tells objects to do things” then what are we really doing if not creating object methods that read, write and manipulate an object’s state? Does this sound familiar? It certainly should given that we devoted rather a lot of time to examining encapsulation. Yes, this is one of those circular references that I mentioned previously. The Tell, Don’t Ask principle has a direct correlation with what we know about encapsulation and how it is “the bundling of properties and the methods that act upon those properties into a single logical unit”. It is also precisely the reason why Lizzie is currently hopping from one foot to the other whilst also pointing at her screen enthusiastically. 
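Before we find out what has Lizzie so excited, the role injection idea described above is worth a quick sketch. The class names BaseRole, AdministratorRole and EditorRole follow the text. The wording of the editor’s message and the shape of the User constructor are assumptions made only for illustration.

```php
<?php

// Abstract parent for the family of role classes.
abstract class BaseRole {
    abstract public function getWelcomeMessage();
}

class AdministratorRole extends BaseRole {
    public function getWelcomeMessage() {
        return "Hello Admin person. Have fun administrating.";
    }
}

class EditorRole extends BaseRole {
    // Assumed wording, invented for this sketch.
    public function getWelcomeMessage() {
        return "Hello editor person. Have fun editing.";
    }
}

class User {
    private $role;

    // The role instance is injected when the User is created.
    public function __construct(BaseRole $role) {
        $this->role = $role;
    }

    // No if, no else, no switch: the user simply tells its role to answer.
    public function getWelcomeMessage() {
        return $this->role->getWelcomeMessage();
    }
}

$admin = new User(new AdministratorRole());
$editor = new User(new EditorRole());

echo $admin->getWelcomeMessage(), "\n";
echo $editor->getWelcomeMessage(), "\n";
```

Adding a PublisherRole or a ModeratorRole under this arrangement would mean writing one new class. Neither the User class nor the view code that echoes the message would need to be touched.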
The reason for such suddenly frenetic and slightly out-of-character behaviour? Certain luminaries of the programming world frown on this particular principle, considering it to be superfluous and unnecessary in many cases, downright dangerous in some. Martin Fowler is one such luminary who argues some very valid points concerning this principle in his bliki post¹³.

¹³ http://martinfowler.com/bliki/TellDontAsk.html
To a certain degree, the naysayers are right to call this one out. If the Tell, Don’t Ask principle is only really saying to us that we should be doing encapsulation properly, then yes, the principle itself is an unnecessary addition to the catalogue of stuff that we as brilliant programmers are supposed to know.

If the Tell, Don’t Ask principle requires us to add a method to an object for every single operation we need to perform concerning the data that the object holds then there’s a very real possibility that our classes will become quite bloated over time. We should certainly be striving to avoid bloated classes wherever possible.

Lastly, if we were to get fanatical about the Tell, Don’t Ask principle, then we might start to eliminate all of the getter methods that we’ve been writing into our classes previously. In some quarters, getter methods are criticised for breaking the principle of encapsulation by exposing the inner state of an object. Fanaticism is not normally a good thing though and in this case, eliminating getters entirely would also severely restrict our ability to create collaborators for the objects within our system. Imagine trying to create a SalesReport object that, when fed appropriate SalesOrder instances, wasn’t able to get any information about those orders for the reports that it is trying to build. Clearly the SalesOrder instances need to collaborate with the SalesReport instance and that means the order objects have to be able to supply the details of the sales order that each one represents.

So where does that leave us, the members of Team Brilliance? The good news is that it leaves us in a very good place indeed. By adding this principle to our mental toolbox, we are backing up and reinforcing what we already understand about encapsulation. It isn’t always easy to keep the principle of encapsulation at the forefront of our minds whilst we are banging out enterprise-grade application code.
It is, however, much easier to catch ourselves when we inadvertently make a logic decision based on the values held within an object’s state. When we find ourselves performing a logic test on an object’s state in order to determine which block of code will be executed, we are almost certainly violating this principle and therefore by extension, we’re also violating the principle of encapsulation too.
Conclusion

The power of this particular principle comes not directly from its own content but instead, in the way that it helps us to remain compliant with the principle of encapsulation. If we’re doing encapsulation correctly, then it is highly likely that we are honouring the Tell, Don’t Ask principle. Conversely, if we are doing Tell, Don’t Ask correctly, then we are equally likely to be doing encapsulation correctly too. The key thing here though is that the Tell, Don’t Ask principle is much more likely to lodge in the front of our coding minds and give us pause when we trip up in our code. Hence the analogy to an Ohio-class guided missile submarine. You may not always see it, but you’ll know it’s there when it fires on you!
Instantiaphobia

You might have noticed that I’ve name-dropped this particular principle several times already. Well, the good news is that we’ve finally arrived at the chapter that will be discussing the principle of Instantiaphobia. Granted, this is a homemade principle of my own making but rest assured, it’s been out of the oven long enough to allow you to pick it up, turn it over and inspect it from every angle without burning your fingers.

In essence, this principle deals with what I like to consider as developing an allergy to the use of the new keyword. Shortly, you’ll see why I consider this to be one of the key techniques for developing that apparently psychic awareness of how to avoid future bug scenarios. The principle itself is just a single, simple sentence so let’s get that one out of the way early.

For every class defined in an application’s codebase, there should be one and only one location where objects are instantiated from that class.

If you think back to the Talking Points chapter at the end of Part One, you might recall that I mentioned how I told a little white lie to one of my private students: that an object can only be created from inside another class called a Factory. That wasn’t even remotely true of course, but it did allow the student to leapfrog large numbers of other beginners in terms of achieving better quality code in a shorter time frame.

Think back again to the chapter on Encapsulation and how we went through the process of developing and refactoring a User class through many iterations. One thing that you might not have noticed especially during that process was just how many times we changed the method signature of the User class’ constructor. At one point, we had a list of scalar parameters being passed in. At another, it was an array. At yet another point, it was an object. This is the point where adhering to the principle of Instantiaphobia will pay the greatest dividends.
The reverse is just as true. Failing to acknowledge this particular principle will lead to a codebase that exacts a terrible price on that most valuable of commodities: high quality pub time.
In an agile world, with over-eager business teams possessing a seemingly endless supply of spanners that they can throw into the works, we are not always able to know in advance what the final version of a particular class should look like before we’ve built it. Heck, in an agile enterprise world, we rarely ever arrive at a finished object. Instead, we have to satisfy ourselves with interstitial moments of operational software. These moments on the development timeline are commonly known as “releases” and get tagged as such.

Of course, modifying the constructor of a particular class inside application code that is already in production is a major sign of some poor initial design. Fortunately, if you take on board all of the ideas that we discuss in this book, those poor design days should be well and truly behind you. This is why this chapter exists in the first place - it’s about avoiding those future headaches by adopting a quite specific coding technique today. At some point in the future, your teammates are quite possibly going to think of you as being psychic if you take this approach on right now. But what is this approach? Let’s start with a snippet of code:

    $user = new User();
Our beloved user class returns again! Once we’ve finished celebrating this fact, we should ask ourselves “Where is this code?”. In a naively designed system, or more likely, one that simply grows organically, it’ll be everywhere an instance of a User object is required. Quite literally, everywhere a $user instance is required, it’ll be instantiated. One project that I worked on previously had the $user instance for the currently logged-in user instantiated at the start of every request, irrespective of whether that instance would even be used to service that request or not. For a largish application with a
significant number of features, this means that this particular line (with or without constructor parameters) may be found in hundreds of different locations. If we were to go through that encapsulation chapter again, but inside an application that invoked a new User() more than a hundred times, would that not represent an awful lot of leg work in order to update every occurrence?

Granted, IDEs, terminals and command line clients all tend to provide a global search function, which in theory will allow you to locate and modify every occurrence in order to accommodate the modified method signature of the constructor. That is, as long as none of your colleagues has implemented code that creates objects using a variable that holds a class name. What do I mean by that? Well, let me show you another snippet of code, one that illustrates dynamic class name resolution:

    public function getObject($className) {
        return new $className();
    }
No matter how efficient the global search in your editor of choice is, it will not be able to find every code location where a new User is instantiated if there’s even a single line of code that calls this method with ‘User’ set as the value of $className.

Using the new keyword in multiple locations for the same class simply sets you up for a world of pain. It’s as simple as that. As easy as it is to type $user = new User() in the piece of code that you’re working on currently, I can say with a level of confidence verging on the level of smug git that you’ll rue the decision at a later point.

This is where Instantiaphobia comes in. Ok, I confess - this is a completely made up word but it’s one that I think captures the essence of this particular lesson, as it were. What I’m hoping to arrive at here is the development of an irrational fear of creating new instances. I think calling it a phobia serves us very well. If a phobia is an irrational fear, then developing a fear for using the new keyword that can’t easily (initially) be explained should serve us well in the future.
Much has been written about when an object may be considered newable and indeed, the arguments presented do hold true for software intended to run on the desktop in an offline world. Some developers may very well argue that it’s perfectly possible to create a large and complex, yet robust and secure application that still invokes new on the same classes hundreds of times throughout the code base. Such an application may be heralded as the singular proof that our new found fear is indeed irrational.

But such an approach carries with it enormous risks. Even if the codebase right now is secure and stable, any future changes to the constructor methods of any of the classes would entail a significant amount of work just to implement all of the necessary changes. Do bear in mind that our world isn’t the desktop oriented offline one. Our software isn’t delivered as a finished product on an installable CD or DVD ROM. Instead we work on delivering tagged releases through an iterative cycle that follows lean or agile methodologies, coding up individual stories to be added to the product with the next release and always with the business team leaning over our shoulders scratching out the next change request.

How much better would it be if, for every class declaration within your application code, there was only ever one, singular and unique location where the new keyword was used for each class in question?

Incidentally, I never meant to imply that we should always avoid the use of new. It is simply impossible to create object oriented code without it. I just wanted to make that bit clear at the very least.
Let me just go over that again. For each and every class that we declare in our application’s codebase, we should have one and only one location where the new keyword is used on that class. So, if we create a User class in our application, we need to have only one location where the corresponding new User() code is written. The benefits from adopting this as an in-house rule are numerous even though it’s almost certainly going to be a
case of delayed satisfaction - those benefits will only be felt at some undefined point in the future. We may not know exactly when those benefits will be felt, but we do know the circumstances that will give rise to those benefits. This is the point where I suggested that your colleagues might begin to think of you as being psychic. Those circumstances are, of course, the ones that require us to change the way that a particular object is set up. This can be something as simple as providing a new constructor parameter, or as complex as injecting the target instance with other, supporting objects. This is all well and good, I hear you say, but how should we go about achieving this? If I’m supposed to develop an allergy to using the new keyword, if I’m supposed to suffer from an irrational fear of littering my codebase with new myClassNameHere(), are you going to tell me how to achieve this? Why, yes. Yes I am. Let’s start with the simplest idea first and take it from there. For those of you out there who hold a morbid aversion to static methods, please hold fire before taking a contract out on my head.
A factory method in every class

If we want to achieve that goal of only ever having one instance of the new keyword per class defined, the easiest approach to making a start on this is to add a factory method to the class itself. Or to put it another way, every time we create a new class in our application, we can add a factory method to that class and we’ve gotten the single instance of the new keyword covered. Here’s an illustration of that idea.
    class User {
        private function __construct() {
            // Init object
        }

        /**
         * Factory method included in the class definition
         */
        public static function create() {
            // A real create() method would of course
            // do much more to set up the user instance
            // rather than just "return new"
            return new User();
        }
    }

    // Elsewhere in our application
    $user = User::create();
Now, with the code above, not only have we declared the User class, we’ve also gotten the instantiation of User objects covered by including a factory method directly within the class. On its own, this strategy will doubtless have an army of developers outraged and reaching for the pitchforks and torches. I’ll get on to the reasons why this might be so shortly, but firstly, let’s take a look at the benefits that this technique provides. We’ll be able to keep those benefits in mind as we progress through steadily improving solutions.

From the get-go, we’ve provided our entire application’s code base with the means necessary to get hold of a User instance. Regardless of whereabouts in the application you are, if you need a User instance, you can pick one up in our newly approved fashion simply by calling the corresponding create() method.

    $user = User::create();
But that’s not all. If, during the process of building out the User class, we find ourselves needing to specify certain dependencies that must be supplied to the
constructor at creation time, we also have the one and only call to new User() right there in the same file. Imagine if you will that all of your class declarations within your application automatically included this static create method. How much easier does it become to manage the object creation process? How much easier does it become to safely modify the method signature of the class constructor when the only location that objects of this class are instantiated is right there in the class?

Of course, this isn’t the final solution. This is especially the case for large and complex applications that have equally complex objects. Readers with a keen eye and a critical mind will have already deduced that, for objects with a complex set of constructor parameters, this technique saves us nothing at all if the factory method itself does nothing but proxy those parameters to the constructor. In code, this might look a little like this:

    public static function create($userId, $userRole, $userAccount) {
        return new User($userId, $userRole, $userAccount);
    }
However, this factory method technique does protect us from some significant amounts of future pain if the method does some actual work too.

    public static function create($userId) {
        // Collect the role and account instances
        $userRole = UserRole::create($userId);
        $userAccount = UserAccount::create($userId);

        // Collect the actual user details
        $dbConn = Database::getInstance();
        $userData = $dbConn->query("SQL statement here");

        $user = new User($userData, $userRole, $userAccount);

        return $user;
    }
The difference is appreciable. Out there in application code, the call for a user instance is simply $user = User::create($userId);, but the object that is returned is one that has been properly prepared with the role and account objects already provided and ready for use.

For smaller applications, this approach makes the codebase immediately much more maintainable. Where previously you may have had a hundred or more locations where you were instantiating User objects, now there’s just one. No longer do you need to perform a global search and replace when you modify the constructor - there’s just a single method that you now have to make corresponding changes to and it’s right there alongside the constructor that’s being changed.

This technique also brings greater convenience to you as a lone coder and greater consistency if you work in a team that has adopted this approach. Think on it - if all of your team members follow the same convention and create the object’s factory method in the same class definition as the target class itself, any future change to the process of instantiation for that particular object need only ever involve editing just the one file.

There’s an optional, ancillary benefit to employing this technique too. By having the factory method within the same class (or in the base class for a family of related classes), it allows you to make the constructor method itself non-public. Consequently a private or protected constructor guarantees that client code cannot bypass the factory method process in order to obtain a usable instance simply by calling new directly. In a team based environment the newer members will sometimes need to be coerced into following the house rules by constraints placed into the code directly. A constructor that has been set to private does this automatically.

This all works just fine as long as the factory method is the thing that is doing the work, rather than just acting as a proxy for a list of parameters.
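The constructor guard described above is easy enough to see in action. What follows is a minimal sketch, assuming PHP 7 or later, where calling a private constructor from outside the class raises a catchable Error. The factory body is deliberately trivial so that the guard itself stays in focus, and the getUserId() method is a hypothetical addition that exists only so that we can inspect the result.

```php
<?php

class User {
    private $userId;

    // Private: only code inside this class may call new User().
    private function __construct($userId) {
        $this->userId = $userId;
    }

    // The one and only location where User objects are instantiated.
    public static function create($userId) {
        return new User($userId);
    }

    // Hypothetical accessor, present purely for this demonstration.
    public function getUserId() {
        return $this->userId;
    }
}

// The approved route works as expected.
$user = User::create(42);

// The unapproved route is refused by the language itself.
try {
    $rogue = new User(42);
} catch (Error $e) {
    echo "Direct instantiation refused\n";
}
```

A newer team member who reaches for new User() out of habit finds out about the house rule the very first time that their code runs, rather than at code review.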
Even so, it remains little more than a halfway house solution. For sure, it’s vastly superior to the habit of littering codebases with new This() and new That() that our more inexperienced colleagues might be used to doing, but it certainly is a long way short of the ideal. Why is this? Well, let’s take a short while to explore the two key reasons why you might also want to avoid this technique. The first reason is one that we’ve already encountered briefly. If our factory method
does nothing but proxy a list of constructor parameters to the constructor itself, then our factory method is entirely redundant and the benefits are lost. The thing is supposed to be protecting our wider application code from the need to marshal that list of parameters in the first place. The thing is supposed to be saving us from having to perform a global search of the code base every time the list of constructor parameters changes.

The second key problem arises when we include considerations of automated testing, a topic that will be examined in greater detail in PHP Brilliance in Practice. For starters, any aficionado of unit testing will automatically baulk at the presence of so many static methods, and rightly so. In a world where we are building large, complex applications, we ideally want as much high quality code coverage as possible. Yet, in our most recent version of the create() method, we’ve made unit testing impossible. Or rather, I have, and quite deliberately so. The breakages come in two distinct flavours. First up in the dock to stand accused are the calls to other static create methods:

    // Collect the role and account instances
    $userRole = UserRole::create($userId);
    $userAccount = UserAccount::create($userId);
These hard coded dependencies, whilst certainly valid in this instance, mean that any “unit” test on the User::create() method would also be indirectly testing the other create() methods at the same time. Not very unitary at all, as it turns out. Second up in the dock, though, is the much greater of the two offenders: the call to collect what seems to be a singleton instance of the database connection.

    $dbConn = Database::getInstance();
Here’s a hard and fast rule that should never be broken: Unit testing should be carried out without the application’s storage system getting dragged along with it. A database connection is always a performance bottleneck in any application, even in a single server setup where the database server and the application server reside on the same physical host.
Nothing kills the appetite for automated testing quite like the test suite that takes in excess of ten minutes to run. A test suite, even one that runs to thousands of individual unit tests, should complete in a matter of seconds. Ever the creative sorts, programmers have gotten around this “dragging the database around” problem by allowing their database singleton to be set up with a mock connection instance. Whether it’s appropriate to write test oriented code rather than application oriented code is another debatable topic though.
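For illustration only, one way that such a swappable singleton is sometimes arranged is sketched below. The setInstance() hook, the FakeDatabase class and the shape of the query() method are all assumptions made for this example; the text does not show the real Database class.

```php
<?php
// Hypothetical sketch of a singleton that tolerates a substitute instance.
class Database
{
    private static $instance = null;

    public static function getInstance(): Database
    {
        if (self::$instance === null) {
            self::$instance = new Database();
        }

        return self::$instance;
    }

    // Test-only hook: swap in a substitute, or pass null to reset.
    public static function setInstance(?Database $instance): void
    {
        self::$instance = $instance;
    }

    public function query(string $sql): array
    {
        // A real connection is out of scope for this sketch.
        throw new RuntimeException('No real database is available here.');
    }
}

// A hand-written fake that returns canned rows instead of touching storage.
class FakeDatabase extends Database
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function query(string $sql): array
    {
        return $this->rows;
    }
}

Database::setInstance(new FakeDatabase([['id' => 1, 'name' => 'Fred']]));
$rows = Database::getInstance()->query('SQL statement here');
```

Note that the setInstance() method exists purely for the benefit of the tests, which is exactly the "test oriented code" trade-off raised above.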
So where do we go from here? What’s our next option for improving on this scenario? Well, now that we’ve successfully extricated our feet from the boggy mire of free and wanton new keyword usage and having dried our feet and warmed our toes on the fire of the factory method, we should now be well prepared for the next stage of our journey.
Providing dedicated factory classes

As the subheading suggests, our next step along the object creation adventure is to move those factory methods into dedicated classes of their own. Remember if you will that our goal remains to restrict the instantiation of a particular object to a single location in the codebase, no matter how large that codebase is going to become. Our next step on this journey is to ascend from the foothills of the factory method and scale the heights of the Factory pattern itself. At this stage then, the in-house rule becomes:

For every class in the application, there should be a corresponding factory class.

For our rather simplistic example, the migration process of moving from factory methods to dedicated factory classes becomes a straightforward cut and paste operation. We cut from the User class, leaving us with:
    class User
    {
        // Note that we must revert
        // to "public" access here.
        public function __construct($userData, $userRole, $userAccount)
        {
            // instance setup code
        }
    }
And paste the cut content into a new UserFactory class, which would then look like this:

    class UserFactory
    {
        public static function create($userId)
        {
            // Collect the role and account instances
            $userRole = UserRole::create($userId);
            $userAccount = UserAccount::create($userId);

            // Collect the actual user details
            $dbConn = Database::getInstance();
            $userData = $dbConn->query("SQL statement here");
            $user = new User($userData, $userRole, $userAccount);

            return $user;
        }
    }
You might recall that this is precisely the content of that little white lie that I alluded to at the end of Part One, that every object in our application has to be instantiated from within a correspondingly named factory class. Cynically, we might say that this has achieved nothing but redundant extra code and unnecessary overhead for our application, which would now have to source in the file for the factory class as well as the file for the user class. But we don’t do cynical here at Team Brilliance. Instead, we look to the positive aspects of what this achieves.
Switch your attention back to the User class and you should see the immediate benefit. Our instances of the User class no longer have to lug around a theoretically redundant static method. Any saving in memory is negligible at best, since PHP stores a class’s methods once per class rather than once per instance, but the public interface of each instance is now that much leaner. More importantly though, our User class is fully testable again. This singular piece of code allows us to instantiate user objects by providing mocks, stubs and dummy data to the constructor and subsequently test all of the object’s methods without once encountering a static method call or picking up the database singleton along the way. Unit testing aficionados rejoice!

Let’s look at that again with some slightly different wording in order to reaffirm what we’ve just considered. The User class has been slimmed down to only contain the methods that directly pertain to user object instances (granted, this is already a pretty slim object in our code example right now). Whilst I accidentally on purpose neglected to mention this downside to the factory method approach previously, it’s worth just a quick peek at it now. Whenever you instantiate an object from a class that also contains static methods, those same static methods can be invoked through the resulting object instance. Let’s just illustrate that for the benefit of understanding.

    // Pick up a user instance in the regular fashion
    $fred = User::create(1);

    // $fred now also carries the create() method
    $mary = $fred->create(2);

    // So too does $mary
    $jenny = $mary::create(3);
As you can see, our User instances are picking up that factory method and making it available for use. Or indeed, abuse, depending upon your point of view. Of course they are. That factory method is part of the class definition so it’s only to be expected that each User instance also provides access to the create() method too.
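Returning to the testability point made above, here is a minimal sketch of what "providing stubs to the constructor" can look like once the constructor is public again. The getName() and getBalance() methods, and the describe() method on User, are hypothetical additions that exist only to give the example something to exercise.

```php
<?php
// Hypothetical collaborators; their real definitions are not shown in the text.
class UserRole
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

class UserAccount
{
    private $balance;

    public function __construct(float $balance)
    {
        $this->balance = $balance;
    }

    public function getBalance(): float
    {
        return $this->balance;
    }
}

class User
{
    private $userData;
    private $userRole;
    private $userAccount;

    public function __construct(array $userData, UserRole $userRole, UserAccount $userAccount)
    {
        $this->userData = $userData;
        $this->userRole = $userRole;
        $this->userAccount = $userAccount;
    }

    // Hypothetical behaviour, added only so that there is something to test.
    public function describe(): string
    {
        return $this->userData['name'] . ' (' . $this->userRole->getName() . ')';
    }
}

// A stub role: an anonymous subclass that returns a canned answer.
$stubRole = new class('ignored') extends UserRole {
    public function getName(): string
    {
        return 'admin';
    }
};

// No static calls and no database: just dummy data and stand-in objects.
$user = new User(['name' => 'Fred'], $stubRole, new UserAccount(0.0));
$description = $user->describe();
```

In a real project a mocking library would usually generate the stub, but a hand-written anonymous class keeps the sketch free of external dependencies.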
This is one of the things that the transition to dedicated factories allows us to avoid. By moving the creational methods into their own dedicated class files, we no longer expose redundant methods on objects which, let’s face it, are going to be used extensively throughout a busy application’s day. On top of that, there’s no real reason for calling the factory method on a particular User instance, not when the code itself declares its usage intent: that of a statically invoked class method.

Which brings us back to the matter of providing dedicated factories for the objects in our system. If you’re already at a senior level within your company, there’s a very good chance that you are already familiar with the Factory Pattern and as such, you will quite likely be aware that what I’m proposing here isn’t a properly specified implementation of this pattern. In common parlance, what I’ve actually done here is to create something commonly referred to as a simple factory. In reality, all that has happened is that the factory method itself has been abstracted away into its own class.

The key point, as far as this principle of Instantiaphobia is concerned, is that we are still successfully limiting the use of the new keyword to just a single location. Let’s look at that code again.

    class UserFactory
    {
        public static function create($userId)
        {
            // Collect the role and account instances
            $userRole = UserRole::create($userId);
            $userAccount = UserAccount::create($userId);

            // Collect the actual user details
            $dbConn = Database::getInstance();
            $userData = $dbConn->query("SQL statement here");
            $user = new User($userData, $userRole, $userAccount);
            return $user;
        }
    }
From a unit testing standpoint, this is still a nightmare, but we are at least making some progress. You could elect to run the test suite that covers this and any other simple factories independently from the test suite that covers the core application code. It is worth bearing in mind that the corresponding User class no longer contains this dependency-laden static factory method and as such, can now be fully unit tested in its own right by passing mocks and stubs into the constructor as desired.

Our simple factory is still a bit of a mess though. How should we go about cleaning it up? For starters, we should turn our attention to the dependencies that are listed inside that create() method. Here are the lines of code in question.

    $userRole = UserRole::create($userId);
    $userAccount = UserAccount::create($userId);
    $dbConn = Database::getInstance();

The first two lines should clearly get the same “migrate the factory method to its own dedicated class” treatment, allowing both the UserRole and UserAccount classes to become fully unit-testable.

    $userRole = UserRoleFactory::create($userId);
    $userAccount = UserAccountFactory::create($userId);

Those are still hard coded dependencies but it could be said that those dependencies are justifiable, assuming our instances of the User class cannot live without their corresponding role and account counterparts. The real elephant in the room though is the third dependency: the call to the database singleton.

    $dbConn = Database::getInstance();
It’s this that truly knackers our automated testing strategy since we’ve already learnt that our automated tests shouldn’t have to drag the database connection along with them.
A static method that is fully self-contained is one thing. A static method that calls other static methods located elsewhere is another thing. A chain of static methods that includes a call out to another, entirely separate system such as a database or a web hosted API is the worst possible thing of all. Such an approach will sneakily introduce a form of brittleness into your application development.

What do brittle things all have in common? A tendency to break. Now it’s not just the application itself that can break, although that is likely to happen with a lot of hard coded dependencies lying around. No, the development process itself can also be broken and that’s certainly something we want to avoid. Consider this: if a test suite inadvertently relies on the presence of the database in order to run, and that test suite then takes seventeen minutes and thirty-four seconds to complete, how many times a day do you think that test suite is going to be run? Before each commit? Or just the once and just as the developer is about to head out for lunch?

So we need to fix this, and fix it right now. But how? Easily, is the fortunate answer. Instead of having those hard coded dependencies reaching out to other parts of the code base, we can inject them into the factory that needs them. What this actually means is that we will need to turn our factories into objects themselves. Like this:

    class UserFactory
    {
        private $db;
        private $roleFactory;
        private $accountFactory;

        public function __construct(
            Database $db,
            UserRoleFactory $roleFactory,
            UserAccountFactory $accountFactory
        ) {
            $this->db = $db;
            $this->roleFactory = $roleFactory;
            $this->accountFactory = $accountFactory;
        }

        public function create($userId)
        {
            $role = $this->roleFactory->create($userId);
            $account = $this->accountFactory->create($userId);
            $userData = $this->db->query("SQL statement here");
            $user = new User($userData, $role, $account);
            return $user;
        }
    }
This is, in fact, an illustration of the Inversion of Control principle. Rather than having our UserFactory reach out to the classes and objects upon which it depends, those dependencies are provided to the factory itself. The factory becomes dependent upon the abstractions declared by the type hinting present in the constructor’s method signature. A very desirable side effect of this is that our object factories have become unit testable too and crucially, the database connection object can be mocked appropriately so that we no longer have to drag the database around with us. Our full suite of automated tests can once again execute in a matter of seconds, rather than minutes.

This is fantastic news, isn’t it? We should go out and celebrate immediately! This is one particular statement that I could certainly get behind. After all, one of the key themes of this book has been to preserve our all important pub time. In this particular case though, we need to hold our horses for just a little while longer. You see, by turning our factories into objects we’ve dropped not one, but two key problems onto our own doorstep.

The first of these problems is the matter of when and where we put the new UserFactory() line of code. At what point do we create new instances of the UserFactory? In a bootstrap file that gets run at the start of every request? The fact that the UserFactory has a declared dependency on the database connection seems to suggest that this would be prudent.
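As a brief aside before the second problem, the claim that an injected factory becomes unit testable can be sketched like so. Everything other than the UserFactory itself is a hypothetical stand-in: the minimal Database, UserRoleFactory, UserAccountFactory, UserRole, UserAccount and User classes, the public properties on User, and the FakeDatabase double are assumptions made so that the example is self-contained.

```php
<?php
// Hypothetical minimal collaborators, defined only to keep the sketch self-contained.
class UserRole {}
class UserAccount {}

class Database
{
    public function query(string $sql): array
    {
        throw new RuntimeException('A real connection is out of scope for this sketch.');
    }
}

class UserRoleFactory
{
    public function create(int $userId): UserRole
    {
        return new UserRole();
    }
}

class UserAccountFactory
{
    public function create(int $userId): UserAccount
    {
        return new UserAccount();
    }
}

class User
{
    public $userData;
    public $userRole;
    public $userAccount;

    public function __construct(array $userData, UserRole $userRole, UserAccount $userAccount)
    {
        $this->userData = $userData;
        $this->userRole = $userRole;
        $this->userAccount = $userAccount;
    }
}

class UserFactory
{
    private $db;
    private $roleFactory;
    private $accountFactory;

    public function __construct(
        Database $db,
        UserRoleFactory $roleFactory,
        UserAccountFactory $accountFactory
    ) {
        $this->db = $db;
        $this->roleFactory = $roleFactory;
        $this->accountFactory = $accountFactory;
    }

    public function create(int $userId): User
    {
        $role = $this->roleFactory->create($userId);
        $account = $this->accountFactory->create($userId);
        $userData = $this->db->query("SQL statement here");

        return new User($userData, $role, $account);
    }
}

// Test double: records the queries it receives and returns a canned row.
class FakeDatabase extends Database
{
    public $queries = [];

    public function query(string $sql): array
    {
        $this->queries[] = $sql;

        return ['id' => 7, 'name' => 'Mary'];
    }
}

$fakeDb = new FakeDatabase();
$factory = new UserFactory($fakeDb, new UserRoleFactory(), new UserAccountFactory());
$user = $factory->create(7);
```

The factory neither knows nor cares that it was handed a fake; that indifference is what lets the test run without a database.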
The second problem is the more pertinent one though: How do we make these factories become available at the points where they are needed? Back at the point where they were accessed via a static method call, we could call out to the factories to make an object for us at any point in the code. Now that they are object instances, we have to solve the issue of making them available at the points where they are needed. If, by any chance, you are experiencing a little difficulty in visualising a solution to this particular problem, fear not, you are certainly not alone. This is one of the knottier problems of designing and laying out an application’s codebase.

Despite this second problem being the more pertinent one, we’ll deal with the solution in a more appropriate place. For now though, here’s a hint: It’s not “Build an implementation of the Registry Pattern”. Since we’re currently in a chapter that deals specifically with minimising the use of the new keyword, let’s tackle the issue of newing up factory instances. What we’re looking at as a solution to our problem is some sort of super factory, some sort of object that can yield up instances of our User class fully made up with all of its dependencies satisfied. What we’re really looking for is one of these:
Dependency Injection Containers

It’s a fancy name but in many regards, a dependency injection container behaves just like a regular factory, but on steroids. In essence, a dependency injection container is an object that knows how to prepare, configure, create and manage other object instances. How does it know this? You tell it, is how. Whether you use a third-party container from a library or code up your own is irrelevant at this stage. In both cases, you must seed the container with the appropriate object building knowledge, either through code directly or via configuration parameters. In either case, you will end up with a container object that provides access to the relevant factory methods in order to furnish your client code with the properly configured objects that it needs.
Migrating our UserFactory class into some sort of container-like context might look something like this:

    class Container
    {
        private $db;

        public function __construct(Database $db)
        {
            $this->db = $db;
        }

        public function createUser($userId)
        {
            $userData = $this->collectDbRow('users', $userId);
            $role = $this->createRole($userData);
            $account = $this->createAccount($userData);
            $user = new User($userData, $role, $account);
            return $user;
        }

        public function createRole(array $userData)
        {
            ...
        }

        public function createAccount(array $userData)
        {
            ...
        }
    }

Again, for brevity, I’ve left out the method bodies for the createRole() and createAccount() methods but hopefully you can still see how, instead of maintaining the object creation methods inside their own specific classes, we’ve brought them together into a unified container structure. Even with such a simplistic structure, it is possible to note that the createUser() method has been provided with all of the knowledge that it requires in order to provide a finished User instance.
    public function createUser($userId)
    {
        $userData = $this->collectDbRow('users', $userId);
        $role = $this->createRole($userData);
        $account = $this->createAccount($userData);
        $user = new User($userData, $role, $account);
        return $user;
    }
Here, I’ve isolated that method for a better view of it. You can see quite clearly that, in order to furnish us with a properly prepared User instance, the container knows to fetch a row of data from the database as well as prepare two other objects that the User instance will depend upon. Will this satisfy our principle of Instantiaphobia? Resoundingly yes.
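The earlier remark that you must "seed the container with the appropriate object building knowledge" can also be sketched with closures rather than hand-written createX() methods. What follows is a hand-rolled illustration and not the API of any particular library: the SimpleContainer class, its set() and get() methods, the string identifiers and the empty stand-in classes are all assumptions.

```php
<?php
// A hand-rolled sketch of a closure-seeded container.
class SimpleContainer
{
    private $definitions = [];
    private $shared = [];

    // Seed the container with the knowledge of how to build one kind of object.
    public function set(string $id, callable $definition): void
    {
        $this->definitions[$id] = $definition;
    }

    // Build the object on first request, then hand back the same instance thereafter.
    public function get(string $id)
    {
        if (!isset($this->definitions[$id])) {
            throw new InvalidArgumentException("Nothing registered for '$id'.");
        }

        if (!array_key_exists($id, $this->shared)) {
            $this->shared[$id] = ($this->definitions[$id])($this);
        }

        return $this->shared[$id];
    }
}

// Hypothetical, empty stand-ins for the classes discussed in the text.
class Database {}
class UserRoleFactory {}
class UserAccountFactory {}

class UserFactory
{
    public $db;
    public $roleFactory;
    public $accountFactory;

    public function __construct(Database $db, UserRoleFactory $roleFactory, UserAccountFactory $accountFactory)
    {
        $this->db = $db;
        $this->roleFactory = $roleFactory;
        $this->accountFactory = $accountFactory;
    }
}

$container = new SimpleContainer();
$container->set('db', function (SimpleContainer $c) {
    return new Database();
});
$container->set('userFactory', function (SimpleContainer $c) {
    // Each "new" appears exactly once, inside its own definition.
    return new UserFactory($c->get('db'), new UserRoleFactory(), new UserAccountFactory());
});

$userFactory = $container->get('userFactory');
```

The design choice here is that each definition receives the container itself, which lets one definition pull in another without either class knowing how its collaborator was built.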
Summary

This chapter has all been about developing an aversion to using the new keyword and thereby using it the fewest times possible. Of course, the absolute minimum number of times that you can use the new keyword will be a singularly enticing once per class, which is precisely what we aim to achieve by following the principle of Instantiaphobia. Let’s just look at the wording again:

For every class defined in an application’s codebase, there should be one and only one location where objects are instantiated from that class.

Incidentally, this doesn’t have anything to do with being finicky or fickle. Adhering to this principle will have a very positive effect on preserving your all-too-precious pub time. If there’s one thing that you can be certain of it’s this: Creating objects willy-nilly at the point where they are needed is a practice guaranteed to prove a maintenance
nightmare in the future. One thing that maintenance nightmares always seem to drag along with them is late nights of office-bound bug fixing. What it all comes down to is this: When the circumstances surrounding the creation of a particular object change, you’ll want to amend the one and only location where those objects are created, and not have to hunt through the codebase to modify them all, whilst simultaneously hoping that you have indeed found and correctly edited each and every one. I know I’d rather be down the pub with my teammates than be the one left behind to fix up the mess.
Do shoot the messenger

Now that we have had at least a very brief look at dependency injection containers, the time has come to examine another anti-pattern. Regrettable? Yes. Avoidable? No. The anti-pattern in question is one known as The Courier. The reason that The Courier crops up so frequently when a dependency injection container enters the fray has everything to do with that second problem that I hinted at in the latter half of the previous chapter. If you instantiate an instance of a “super factory” at bootstrap time, how do you make it available to the objects that need it? The answer is not to turn intermediate objects into couriers in order to ferry the container to where it is needed.

We are getting ahead of ourselves though. Just what is the anti-pattern known as The Courier? The term was coined by Tom Butler in his blog post OOP: The courier anti pattern¹⁴ and I’m paraphrasing somewhat to describe it thus:

The Courier is used as a means of making dependencies accessible to parts of a system which need them by passing those dependencies through components that do not.

If we push Lizzie to one side for a second, we can use her machine to code up a quick and dirty example to illustrate this idea. First, we need a dependency.

    // bootstrap.php
    $db = new PDO($dsn);
¹⁴https://r.je/oop-courier-anti-pattern.html
An active database connection, in this case a PDO instance, is a very popular choice for a dependency that will get passed around by couriers until it reaches its intended target. Now, we need a courier, and what better choice of courier than a page controller?

    class ProfileController
    {
        private $db;

        public function __construct(PDO $db)
        {
            $this->db = $db;
        }
        ...
    }
I have deliberately omitted the code for any action methods thus far in order to highlight the relevant parts, as far as The Courier is concerned. Our controller class here has a declared dependency on the PDO instance, as indicated by the constructor’s method signature. If we have even a modicum of good sense, we will not be accessing the database directly from within the action methods, so why is it here? We can answer this question with the provision of an action method.

    class ProfileController
    {
        private $db;

        public function __construct(PDO $db)
        {
            $this->db = $db;
        }

        public function updateAction()
        {
            $userId = $_SESSION['user_id'];
            $userProfile = ProfileModel::create($this->db, $userId);
            $userProfile->update($_POST);
        }
    }
Ok, I accept that it’s a rather spurious example but it does serve our purposes here; the only reason that our controller accepts the $db object in the first place is to pass it on to a factory method on a model class. This is The Courier anti-pattern in action. An object receives a dependency that it doesn’t actually use. Instead, it receives a dependency to pass it on to another object, method or function. Why is this such a problem? It’s one of those problems where, whilst it works just fine today it will almost certainly be broken tomorrow. Plus it’s simply plain old bad design. Let’s concentrate on the broken scenario though. Whilst we appear to be following that first fix that was presented in the previous chapter, that of implementing a factory method on the target class itself, that very factory method itself declares a dependency in its method signature. This in turn means that any code that invokes the factory method must supply that dependency. But what if our requirements change? What if we come to our senses and decide to migrate from factory methods to full blown factory objects or, better yet, a dependency injection container? In either case, a change to the way that we create Profile instances would entail a code change inside our ProfileController class. No matter how good the global search and replace function in your IDE or editor of choice is, you would be faced with the need to make changes, potentially far-reaching changes, in parts of the code base that needn’t be touched in the first place. This is precisely one of those situations where “I fixed this bit here, but now that bit over there is broken” becomes an oft-heard phrase. Yet this ever unwelcome intrusion upon our most valuable commodity, high quality pub time, is entirely unnecessary. The way to fix this is to start asking questions of the code itself. In a way, this is rubber duck debugging¹⁵ but with experience, the rubber duck becomes entirely optional. 
What are we really asking the controller to do?

¹⁵https://en.wikipedia.org/wiki/Rubber_duck_debugging
In essence, we need it to pass the array of post data to an instance of the Profile model so that the model can be updated accordingly. What this actually means then is that the controller needs a means of acquiring that Profile model instance. That’s not to say that we need to inject the controller with the specific Profile instance that we’re looking for, although that would certainly be a plausible solution. Instead, we should be looking to inject our controller with a factory instance of some description; one that can yield properly constructed Profile instances for us.

    class ProfileController
    {
        private $profileFactory;

        public function __construct(FactoryInterface $profileFactory)
        {
            $this->profileFactory = $profileFactory;
        }

        public function updateAction()
        {
            $userId = $_SESSION['user_id'];
            $profile = $this->profileFactory->create($userId);
            $profile->update($_POST);
        }
    }
There are a number of things to note from these changes. First and foremost, the courier-like nature of our controller’s previous state has now been removed completely. The only dependencies provided to it are the dependencies that it actually needs to use itself. A second, but no less important consideration in this particular instance is that our controller is now completely divorced from any notion of the application’s storage system. This is a critically important change. Controllers have no business knowing anything about an application’s persistence mechanism and the changes that we’ve made here ensure that that is the case.
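To see how far the controller has been divorced from storage, here is a sketch of exercising it with nothing but hand-written test doubles. The text names FactoryInterface without defining it, so the single create($id) method shown here is an assumption, as are the RecordingProfile and FakeProfileFactory classes and the values placed into $_SESSION and $_POST.

```php
<?php
// Hypothetical interface: the text names FactoryInterface but does not define it.
interface FactoryInterface
{
    public function create($id);
}

class ProfileController
{
    private $profileFactory;

    public function __construct(FactoryInterface $profileFactory)
    {
        $this->profileFactory = $profileFactory;
    }

    public function updateAction()
    {
        $userId = $_SESSION['user_id'];
        $profile = $this->profileFactory->create($userId);
        $profile->update($_POST);
    }
}

// Test double standing in for the Profile model: it remembers what it was given.
class RecordingProfile
{
    public $received = null;

    public function update(array $data): void
    {
        $this->received = $data;
    }
}

// Test double standing in for the real factory: no database anywhere in sight.
class FakeProfileFactory implements FactoryInterface
{
    public $requestedId = null;
    private $profile;

    public function __construct(RecordingProfile $profile)
    {
        $this->profile = $profile;
    }

    public function create($id)
    {
        $this->requestedId = $id;

        return $this->profile;
    }
}

// Simulate the request state that the action reads.
$_SESSION = ['user_id' => 42];
$_POST = ['display_name' => 'Fred'];

$profile = new RecordingProfile();
$factory = new FakeProfileFactory($profile);
$controller = new ProfileController($factory);
$controller->updateAction();
```

The controller still reads the superglobals directly in this sketch, exactly as in the listing above; that is a separate concern from the courier problem being addressed.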
We’ll get into some of the whys and wherefores in the next chapter, Don’t Talk To Strangers.
It is of course somewhat ironic that a dependency injection container might itself be considered a dependency but the actual truth of the matter is that it quite often is. At the end of the day, this is the reason both for why this chapter exists and for why it immediately follows on from the chapter on Instantiaphobia. Quite often a developer’s first inclination when they first encounter this fancily named super factory is to consider the dependency injection container to be the one and only, singular and unique source of all prepared objects within the application. As such, we’ll start seeing couriers cropping up as the container needs to be transported from one layer to the next, and on to the next. And the next.

Addressing this particular scenario is shockingly simple: look at the process in reverse and work backwards. As Lizzie might be keen to point out, I used to be the sort of developer that would think of the process in terms of a request coming in, getting processed and resulting in a response going out. That is of course how it is likely to happen within our application code. That’s also how a lot of beginners will address the matter of coding up a particular feature. But that’s not necessarily how we should think about it before building it.

In a multi-tiered application, this may involve devising a model factory that has a dependency on the persistence layer, with the model factory in turn being set on service instances by a service broker, which in turn might be set on controller instances. In other words, there’s no rule that says we cannot create multiple dependency injection containers and have them act as the bridge between each layer. The controllers, services, models and persist-able data objects can all be furnished by their own dedicated containers; discrete units with a laser sharp focus of operation. And with nary a courier to be seen.
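One possible shape for that layered arrangement is sketched below. Every name in it is hypothetical (Storage, ModelContainer, ServiceContainer, ControllerContainer, ProfileService and the showAction() method); the sketch is only meant to show the wiring being assembled in reverse, from storage outwards, with each object receiving a collaborator that it actually calls.

```php
<?php
// All class names below are hypothetical; they illustrate the layering only.
class Storage {}

class ProfileModel
{
    public $storage;
    public $userId;

    public function __construct(Storage $storage, int $userId)
    {
        $this->storage = $storage;
        $this->userId = $userId;
    }
}

// Model layer: the only container that knows about storage.
class ModelContainer
{
    private $storage;

    public function __construct(Storage $storage)
    {
        $this->storage = $storage;
    }

    public function createProfile(int $userId): ProfileModel
    {
        return new ProfileModel($this->storage, $userId);
    }
}

class ProfileService
{
    private $models;

    public function __construct(ModelContainer $models)
    {
        $this->models = $models;
    }

    public function profileFor(int $userId): ProfileModel
    {
        return $this->models->createProfile($userId);
    }
}

// Service layer: builds services and hands them the model container they use.
class ServiceContainer
{
    private $models;

    public function __construct(ModelContainer $models)
    {
        $this->models = $models;
    }

    public function createProfileService(): ProfileService
    {
        return new ProfileService($this->models);
    }
}

class ProfileController
{
    private $service;

    public function __construct(ProfileService $service)
    {
        $this->service = $service;
    }

    public function showAction(int $userId): ProfileModel
    {
        return $this->service->profileFor($userId);
    }
}

// Controller layer: knows about services, and nothing about models or storage.
class ControllerContainer
{
    private $services;

    public function __construct(ServiceContainer $services)
    {
        $this->services = $services;
    }

    public function createProfileController(): ProfileController
    {
        return new ProfileController($this->services->createProfileService());
    }
}

// Wiring assembled "in reverse": storage first, controllers last.
$models = new ModelContainer(new Storage());
$services = new ServiceContainer($models);
$controllers = new ControllerContainer($services);

$controller = $controllers->createProfileController();
$profile = $controller->showAction(5);
```

Note that the controller never sees the Storage object at all; it is handed only the service that it calls directly.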
Now we really ought to let Lizzie get back to her machine so that she can continue turning out awesome code for us.
Conclusion

The courier is an anti-pattern that frequently rears its ugly head when a dependency injection container comes into play, quite simply because, as an instantiated object, the container needs to be made available to the objects that require it. Nevertheless, when you encounter a situation that involves setting one object on another simply to have the receiving object pass it on again, rather than use it directly, you have an instance of The Courier.

Invariably, this will lead to a critical loss of pub time quite simply because the demands of the final recipient are likely to change. When that happens, and with a courier in play, you will invariably find yourself having to change the couriers as well as the code of the final recipient. That’s a critical loss of valuable pub time right there. In other words, the presence of a courier drags along with it the spectre of tighter coupling and all of the problems that such an apparition will invariably produce.

Consequently, do indeed shoot the messengers at every opportunity. In the kneecaps. With the code equivalent of at least a .38 caliber. And take no prisoners whilst doing so.
Don’t Talk To Strangers

In the previous chapter, we started to look at how even the accidental implementation of couriers will, in all likelihood, make portions of your application fragile and susceptible to bugs. The example we looked at passed a database connection through a controller in order to reach its final destination inside a model instance. The issue wasn’t that the code doesn’t work, or that it breaks occasionally (although it might if you haven’t wrapped the establishment of a database connection with the appropriate safeguards and error-handling). No, the issue is the fact that when the target code, in this case the model class, is revisited at some point down the line, you also have a string of other classes to consider if the reception of the database connection changes in any significant way. It’s very much a case of “I fixed this bit here, but now that bit over there is broken!”

This chapter focuses on a very similar problem and concerns itself with something known by many names, one of which is the Law of Demeter. Devised by Ian Holland way back in 1987, the Law of Demeter isn’t even a law, but a guideline. I suppose we can forgive the fine folks that came up with that particular moniker though, since “The Guideline of Demeter” is somehow less punchy. Emerson Macedo¹⁶ boils down the Law of Demeter into three succinct statements, which are thus:

1. Each unit should have only limited knowledge about other units: only units “closely” related to the current unit.
2. Each unit should only talk to its friends; don’t talk to strangers.
3. Only talk to your immediate friends.

In a way though, these three statements are a little too much on the succinct side. Without appropriate context or prior knowledge of this particular guideline, it isn’t particularly easy to intuit either their meaning or their intended application. But hey, that’s what this chapter is for. Let’s get going.

¹⁶http://emerleite.com/
The Principle of Least Knowledge

That sub-heading up there gives us another one of the names that this particular guideline is known by and at least this time it’s a little more indicative of what this topic is actually about. This new name for it ties in nicely with those three statements that we looked at on the first page of this chapter. The idea behind the Principle of Least Knowledge is that wherever you happen to be coding inside your application, the code that you write should only express knowledge of its immediate surroundings. When you follow this particular guideline, you’re automatically promoting the notion of loose coupling within your codebase, which in turn leads to a much higher degree of maintainability.

What on earth does this all mean though? Clearly, we need one of those examples around about now.

    class OrderController
    {
        public function addOrderAction()
        {
            // ...

            try {
                $user->getAccount()->getBalance()->deductAmount($orderTotal);
            } catch (InsufficientFundsException $e) {
                // handle exception
            }

            // ...
        }
    }
Here we are again with that woefully used and abused addOrderAction method, looking for a simple way for our customer to pay their damned bill. In this case, it looks like we are passing the value of the order through to a method called deductAmount(), catching the exception that might be thrown if the user doesn’t have enough funds in their account to pay for the order. This is fine, surely? We’ve got error handling in there, and even if you can’t see it in such a small snippet of code, there’s lazy loading going on in the background with the UserAccount and AccountBalance objects.
Lizzie’s certainly familiar with this sort of approach; they did it all the time at her old place and the old place managed to turn out some reasonably useful applications. The latter doesn’t change the fact that Lizzie’s last place was rubbish though. Don’t tell her I said that, we’re lucky to have her here with us.

Back to the code. Just what is the problem with it, given the fact that it works when all is well, and catches the exception when it’s not? The answer to that lies not with the fact that it’s perfectly operational today, but that it presents us with a potential problem in the future. Nothing impacts more negatively upon our valuable pub time than code constructs that provide fertile ground for defects to germinate within. With the preservation of pub time firmly set at the forefront of our minds, it’s time to look at yet another name that this particular principle is known by.
The one dot principle.

The first thing to note here is that this particular name was devised for those languages that separate objects from their fields with the dot (period) symbol, which in real terms, is pretty much all of them except PHP and Perl. You know the sort of thing. If you’ve ever taken even the briefest look at Java, you will have encountered the following line:

    System.out.println("Hello, World");
Of course, in PHP we use the arrow (->) to separate an object from its fields but the idea is just the same; if you’re utilising more than one dot to access a property or method, the chances are exceedingly high that you’re not following the Principle of Least Knowledge correctly.

The second thing to note is that this particular version of the name for the guideline/idiom/principle is the least accurate of the bunch so far. Determining whether we are conforming to or violating the Law of Demeter is a little bit more involved than a process of counting the dots. Still, we haven’t even gotten as far as considering why it’s a bad thing, so let’s get straight onto that now by recalling the offending “line” of code.
    $user->getAccount()->getBalance()->deductAmount($orderTotal);
In PHP terms, that’s three “dots” right there… Could we fix that code by adhering to a “single dot” approach like this?

    $account = $user->getAccount();
    $balance = $account->getBalance();
    $balance->deductAmount($orderTotal);
The answer is emphatically no. All that we have achieved here is to move each of those three troublesome “dots” onto their own lines. The key problem remains: the calling code simply knows too much. Not only is it aware that invoking the getAccount() method will return an object that presents the getBalance() method, it’s also aware that the return value of the getBalance() method offers the deductAmount() method. In the language of the Law of Demeter, this code is reaching through the intermediate objects in order to invoke the deductAmount() method at the end of the chain. The way that we’ve written that code, even after we decomposed the chain down to three individual statements, expresses intimate knowledge of how that chain is structured all the way through to the final method call.

What we have here is another instance of tight coupling, which is something that we should always be trying to avoid. Tight coupling reduces the quality of the application code that we write by making it harder to maintain. Changes made to one piece of tightly coupled code require us to review and potentially modify all the other members of the tightly coupled relationship.

To put this into perspective, imagine a situation where, six months down the line, the business team pops up with a new requirement: instead of requiring our users to pre-fund their accounts, we’ll be offering credit facilities to a select few of them.
Just to make things a little more complicated, let’s imagine that the pre-funders are going to get to enjoy a discount on their orders but the users that pay on credit are going to be subject to a small surcharge. This is a significant change for the business, but the way things are currently, it’s a massive change for the codebase, all thanks to our tightly coupled design.

When this particular story lands in Lizzie’s lap, it’s apparent that she will be working primarily within the UserAccount class to handle the credit versus pre-funded account scenarios. Unfortunately, thanks to our tightly coupled code, she’s also going to have to check through the entire codebase to determine whether other parts of the application are affected as well. For one thing, we know that the addOrderAction() method will have to be modified since it most certainly doesn’t accommodate the notion of juggling credit facilities currently. And what if the company also offers a monthly subscription service? That’s certainly not beyond the realms of possibility, and if it does, then poor Lizzie’s faced with the prospect of digging through the cron scripts as well. We’d better make sure there’s plenty of coffee in the kitchen in this case.

Had we been aware of the Law of Demeter from the outset, all of this extra work might have been avoided. Instead of having our action method reach through multiple objects in order to trigger an order payment, our controller code might have looked like this.

    class OrderController
    {
        public function addOrderAction()
        {
            // ...

            try {
                $user->payForOrder($order);
            } catch (OrderPaymentException $e) {
                // handle error
            }

            // ...
        }
    }
The critical line of code is this one: $user->payForOrder($order);
In one fell swoop we’ve eliminated all of that extra knowledge that the action method shouldn’t have had in the first place. Even though I’ve left you to imagine how this method acquires the $user and $order instances in the first place, the simple act of passing the $order instance to the user’s payForOrder() method would have served the pre-credit account era just as well as the post-credit account one. In other words, knowledge of the inner workings of the order payment process has been removed from where it doesn’t belong. Our action method no longer has any kind of idea as to what happens to that Order instance on the other side of the payForOrder() invocation. Now, the only knowledge it has of that process is that if an exception is thrown, it has to deal with it, which is exactly as it should be. If you’ve already encountered the Law of Demeter previously, you might have noticed how there are some distinct parallels between the example that we’ve been working through here and the subject of a paper by David Bock entitled The Paperboy, The Wallet and The Law of Demeter in which he describes the process of paying the paperboy not by having the customer hand over his wallet, but by having the paperboy request payment and allowing the customer to organise how the bill gets settled. In other words, the paperboy has no need of ever seeing the wallet or even to know that it exists in the first place. I heartily recommend reading this paper. It’s quite short and beautifully illustrates the point. http://www.ccs.neu.edu/research/demeter/demeter-method/LawOfDemeter/paper-boy/demeter.pdf
But does this mean that the logic that handles order payments now has to be moved into the User class? Not at all. Given the code that we’ve already seen, and the nature of the story that Lizzie has to deal with, we know that it’s the UserAccount objects that are going to be dealing with order payments. What we’re adding to the User class is a proxy method, like so:
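    class User
    {
        // ...

        public function payForOrder(Chargeable $order)
        {
            try {
                $this->getAccount()->payForOrder($order);
            } catch (InsufficientFundsException $e) {
                // translate the account-level failure into the exception
                // that the controller is expecting to catch
                throw new OrderPaymentException();
            }
        }
    }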
For the sake of the One Dot Principle , we are safe to exclude the arrow that comes after any references to $this. I’ll come to that bit shortly.
A proxy method is a pretty simple construct. In this instance it does just two things. Firstly, it makes sure that it’s receiving a parameter that conforms to our Chargeable interface, whatever that may be. Secondly, it passes that Chargeable item on to this user’s UserAccount instance. Remember that I mentioned that the UserAccount instance would be lazily loaded? If this is the case, then acquiring it via a getAccount() method is a sensible way to achieve this. The alternative would be to inject a pre-fabricated UserAccount instance via the constructor, in which case we would probably access it as $this->account instead.
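To make the constructor-injection alternative a little more concrete, here is a minimal sketch of how it might look. Treat everything beyond the names already used in this chapter as an assumption: the book never shows what Chargeable contains, so the getTotal() method, the Order class, the simplified pre-funded UserAccount and the exception class definitions are all hypothetical, and amounts are assumed to be integers in pence. It also assumes PHP 7.4 or later for the typed properties.

```php
<?php

// Hypothetical contract: the real Chargeable interface is not shown in the text.
interface Chargeable
{
    public function getTotal(): int; // amount in minor units (assumed)
}

class InsufficientFundsException extends RuntimeException {}
class OrderPaymentException extends RuntimeException {}

// Assumed, heavily simplified account that only supports pre-funding.
class UserAccount
{
    private int $balance;

    public function __construct(int $balance)
    {
        $this->balance = $balance;
    }

    public function payForOrder(Chargeable $order): void
    {
        if ($order->getTotal() > $this->balance) {
            throw new InsufficientFundsException('Balance too low');
        }
        $this->balance -= $order->getTotal();
    }
}

class User
{
    private UserAccount $account;

    // The account is injected here rather than lazily loaded via getAccount().
    public function __construct(UserAccount $account)
    {
        $this->account = $account;
    }

    // Proxy method: delegates to the account and translates its exception.
    public function payForOrder(Chargeable $order): void
    {
        try {
            $this->account->payForOrder($order);
        } catch (InsufficientFundsException $e) {
            throw new OrderPaymentException('Payment failed', 0, $e);
        }
    }
}

// Hypothetical Chargeable implementation, purely for demonstration.
class Order implements Chargeable
{
    private int $total;

    public function __construct(int $total)
    {
        $this->total = $total;
    }

    public function getTotal(): int
    {
        return $this->total;
    }
}
```

A caller would then write something like `(new User(new UserAccount(1000)))->payForOrder(new Order(400));` and only ever have to think about OrderPaymentException. Passing the original exception along as the "previous" one is a design choice of this sketch; it keeps the underlying cause available for logging without leaking it into the controller's catch block.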
This is the point where we start encountering the downside to achieving loose coupling by following the Law of Demeter: a proliferation of proxy methods. In this rather simple scenario, it isn’t really a problem. The addition of the payForOrder() method to the public interface of our User model is both short and to the point. Any other developer on our team ought to be able to look at that code and see that an order can be paid for just by sending a Chargeable-compliant instance to the payForOrder() method.

Where things do start to get a bit messy is the point where we are adding too many proxy methods. It doesn’t take too long before we end up polluting the public interface of our User model with methods that are only vaguely related to our concept of what a User actually is. With too much of this going on, our User model will end up looking like an implementation of the Facade Pattern. We’re getting ahead of ourselves though.

If we didn’t want to start going down the route of adding proxy methods, how do we solve the issue in our controller’s action method? One possibility would be to cut out the middleman and work with the UserAccount instance directly, giving us code that looks something like this:

    public function addOrderAction()
    {
        // ...

        try {
            $userAccount->payForOrder($order);
        } catch (OrderPaymentException $e) {
            // handle exception
        }

        // ...
    }
If you read that aside earlier on in this chapter you might be tempted to exclaim “Aha! But isn’t that just like handing the wallet over to the paperboy?” to which the answer is, thankfully, no. If our action method is behaving like the paperboy, it’s still successfully ignorant of the actual payment process, which is still nicely abstracted away behind the payForOrder() method. What we’re actually looking at here is the action method working directly with an immediate collaborator. Admittedly, our controller might be better served by handing off the entire order/payment process to a service and letting that service co-ordinate the appropriate interactions with the model, but that’s an architectural concern that we’ll look at in Part Five of the book.
Connecting code with its immediate collaborators is exactly what the Law of Demeter wishes for us to do. Let’s take another look at those three statements that we started this chapter with.

1. Each unit should have only limited knowledge about other units: only units “closely” related to the current unit.
2. Each unit should only talk to its friends; don’t talk to strangers.
3. Only talk to your immediate friends.

By revising our code to cut out the middleman (the User instance), we’ve brought our three “friends” closer together; the Order and UserAccount instances, and the OrderController itself. Even so, we’re still promoting loose coupling between the players within this system; the knowledge of how to handle a Chargeable item remains firmly where it should be and nowhere where it shouldn’t.

This in turn frees Lizzie up to play merry hell inside the UserAccount object itself. As long as she receives a Chargeable object and emits an exception when the value of that object can’t be charged to the user’s account, it’s not going to matter too much what the actual logic implementation looks like. No other piece of code has that knowledge, so no other piece of code will get broken due to changes in the way that that knowledge is implemented.

This is the situation that we want to find ourselves in, one that promotes loose coupling and one that helps us to properly encapsulate the knowledge of potentially complicated operations into the places where they should be. The Law of Demeter helps us to achieve both of these desirable outcomes.
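To see just how much freedom that leaves Lizzie, here is one hedged sketch of what the inside of UserAccount could become once credit facilities arrive. None of this comes from the story itself: the SettlementPolicy interface, the two policy classes, the 5% discount, the 2% surcharge, the getTotal() method on Chargeable and the use of integer pence are all assumptions made for the sake of illustration (PHP 7.4 or later assumed). The point is only that the caller's side, a single payForOrder() call, does not change.

```php
<?php

// Hypothetical contract: the real Chargeable interface is not shown in the text.
interface Chargeable
{
    public function getTotal(): int; // amount in minor units (assumed)
}

class InsufficientFundsException extends RuntimeException {}

// Hypothetical strategy describing how one kind of account settles a charge.
interface SettlementPolicy
{
    /** Returns the new balance, or throws InsufficientFundsException. */
    public function settle(int $balance, int $amount): int;
}

class PrefundedPolicy implements SettlementPolicy
{
    public function settle(int $balance, int $amount): int
    {
        $due = intdiv($amount * 95, 100); // illustrative 5% discount
        if ($due > $balance) {
            throw new InsufficientFundsException('Balance too low');
        }
        return $balance - $due;
    }
}

class CreditPolicy implements SettlementPolicy
{
    private int $creditLimit;

    public function __construct(int $creditLimit)
    {
        $this->creditLimit = $creditLimit;
    }

    public function settle(int $balance, int $amount): int
    {
        $due = intdiv($amount * 102, 100); // illustrative 2% surcharge
        if ($balance - $due < -$this->creditLimit) {
            throw new InsufficientFundsException('Credit limit exceeded');
        }
        return $balance - $due;
    }
}

class UserAccount
{
    private int $balance;
    private SettlementPolicy $policy;

    public function __construct(int $balance, SettlementPolicy $policy)
    {
        $this->balance = $balance;
        $this->policy = $policy;
    }

    // The only thing the rest of the application needs to know about.
    public function payForOrder(Chargeable $order): void
    {
        $this->balance = $this->policy->settle($this->balance, $order->getTotal());
    }

    // Exposed only so that this example can be checked.
    public function getBalance(): int
    {
        return $this->balance;
    }
}
```

Whether the discount and surcharge rules really belong in a strategy object like this, or somewhere else entirely, is Lizzie's call to make. The controller neither knows nor cares, which is rather the point.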
Summary

The Law of Demeter wants us to keep our friends close and our enemies as far away as possible, erecting barricades and boundary fences along the way. Tight coupling is one such enemy and, whilst it certainly can’t be avoided 100% of the time, erecting the proper boundaries allows us to avoid the miserable loss of pub time due to action-at-a-distance effects.

In our erroneous early example, our controller code was reaching through both the $user instance and the $userAccount instance to get at the deductAmount() method of the $balance instance. Hopefully now it should be clear how this can lead to problems. The developer that is working on the UserAccount or the Balance class isn’t necessarily going to be aware that the controller code was accessing the $balance instance’s methods directly. With the appropriate barrier in place, the payForOrder(Chargeable $order) method, the controller code becomes ignorant of any changes to the way that order payments are settled, which is precisely how we want it.

One thing that we still have to guard against though is the proliferation of proxy methods that can lead to a bloated, muddied interface. But with loose coupling and encapsulation promoted through the application of The Law of Demeter, we’re golden. Don’t talk to strangers. Unless they’re offering to pick up your bar tab.
Talking Points

Now that we have arrived at the end of Part Three, it is time to wrap up those loose ends again. In this part of the book, our journeying has lifted us clear of the grassy veldt and propelled us into the thickly forested sides of the PHP Brilliance mountain-range. We’ve taken a winding trail, always upwards, between mighty boles and sinewy saplings, pausing only to inspect our surroundings when a break in the lush, leafy canopy overhead allowed spears of golden sunlight to push back fleeting shadows and bathe our surroundings in the warming glow of inspection and understanding. For the forest through which we walk is a very particular biome of the software engineering world, where every tree represents an individual design pattern, programming paradigm or software principle.

If we were to catalogue every pattern, paradigm or principle out there, it’s entirely likely that we’d need an entire forest just to produce the index cards. There are a lot of these “P”s out there, with many a book dedicated to them, though I’d only be inclined to recommend reading (wading) through the two essential design pattern books, the Gang of Four book ¹⁷ and Martin Fowler’s Patterns of Enterprise Application Architecture ¹⁸, even though they can be a bit heavy going at times.

The forest of patterns, principles and paradigms can be so densely populated at times that it can prove difficult to pick your way through, without getting caught up and tangled on one fancy idea or another. You see, developers like shiny new ideas and there’s many a developer out there who, once stumbling across a shiny new design pattern, will endeavour to crowbar that pattern into every bit of code that he or she writes for the next week or month or so. It’s always worth remembering that design patterns and software principles are intended as solutions to problems; very real problems in the case of the former and rather nebulous, conceptual problems in the case of the latter.
¹⁷https://phpbrilliance.com/surl/1a0 ¹⁸https://phpbrilliance.com/surl/x11
To use a pattern or a principle effectively, it’s important to understand it first. To internalise the hows and the whys and the essence of the problems that these things provide solutions for. After that, it becomes necessary to be able to recognise when you have that problem.
Loose coupling and the art of testability

One of the key concerns that have been dealt with in this part of the book, even if it wasn’t mentioned by name all too often, is the idea of loose coupling versus tight coupling. Tightly coupled code is brittle code, code that is prone to breaking, code that presents a very real threat to leaving work on time and getting some quality pub time in. Here’s another example of some tightly coupled code employing the much maligned Singleton design pattern.

    class Database
    {
        private $pdo;
        private static $instance;

        private function __construct($dsn, $user, $pass)
        {
            $this->pdo = new PDO($dsn, $user, $pass);
        }

        public static function getInstance()
        {
            if (is_null(self::$instance)) {
                $config = Config::getInstance()->getParams('db');
                // assumes getParams('db') returns an array with
                // 'dsn', 'user' and 'pass' keys
                self::$instance = new self($config['dsn'], $config['user'], $config['pass']);
            }

            return self::$instance;
        }
    }
This is the old way of doing singletons, through the provision of a private constructor and a static instance accessor. Now don’t get me wrong, singletons still have their place and it remains a perfectly valid pattern. To be certain, you will never want to instantiate a new database connection for every query that you perform, and so it remains appropriate to employ the Singleton pattern for things such as this. However, the problems with the approach above are manifold, with the principal one being less about the code itself and much more about how this code encourages its usage throughout the rest of the codebase. With or without a namespace, the database singleton instance is globally accessible, leading to code that looks very much like the following.

    // elsewhere...
    class OrderModel
    {
        private $db;

        public function __construct()
        {
            $this->db = Database::getInstance();
        }

        // ...
    }
So when a developer, one who knows that he needs to avoid using the Courier antipattern and not to pass the database connection through a controller, does something like this in their code, the result is a bit of a tangle. What we end up being left with is a hard-coded dependency; an OrderModel class that is tightly coupled with the Database class, which in turn depends on the Config class. In other words, all kinds of wrongness. This is a prime example of tightly coupled code. Our OrderModel has a clear dependency upon the database class, given that it is hard-coded into the body of the __construct() method. Worse than this is the flagrant disregard for the idea of only talking to our closest friends as per the Law of Demeter . Yet, the worst is still to come; acquiring the database connection in the __construct() method presents us with a particular challenge. All of these things render our OrderModel class prone to failure. How so? Ignoring the fact that there’s no error handling in the code above that would allow it to safely
recover from a failed database connection attempt, the tightly coupled nature of this code imposes additional responsibilities on the development team to ensure that there are no adverse effects from code changes occurring both now and in the future.

The very best way for a development team to avoid adverse effects from the changes that they make to a codebase is to employ tests. Lots and lots of tests. Software testing can be done in a variety of different ways; functional testing, integration testing, regression testing and unit testing to name but four. Each type of testing involves varying amounts of labour at various stages. Unit testing requires the most labour at the setup stage; you write the code and you write the tests to go with it (or the other way around if you’re into test driven development) but there’s little, or ideally no, effort involved in running the test suites so created. At the other end of the spectrum, functional testing requires little to no effort at the code creation stage but is disproportionately labour intensive at the code delivery stage, especially if the development team is working in concert with a QA team.

Engaging in iterative build cycles, such as those encouraged by agile methodologies and lean processes, means that we want most if not all of our testing to be automated. It would serve us ill to engage in a two-week sprint only to have to devote one week of each sprint to testing and sign-off! Clearly then we want to save ourselves as much pain and anguish as possible whilst we go about delivering our code one story at a time. And that means applying the right amount of automated testing at the most appropriate levels. So which levels are the right ones? Fortunately, this problem has been solved for us already.
Mike Cohn, in his excellent book Succeeding with Agile ¹⁹, proposed the concept of the Testing Pyramid, which illustrates perfectly how we concentrate most of our efforts at the unit-testing level, providing the pyramid’s broad base with the widest level of test coverage and proceeding to narrow the scope of each testing level until we reach the functional testing tip.

Unit testing. Lots of developers hate it, though mostly for unjustifiable reasons, the most common one of which is that it takes too much effort to do it properly. The answer to that though is that bug-fixing takes too much effort and you have no choice but to do it properly, mostly at the expense of spending time working on the new and exciting stuff or worse, sacrificing pub time for it. The horror!

¹⁹https://phpbrilliance.com/surl/e3w
Adopting a regimen of effective unit testing, of building up an effective set of test suites with a high degree of code coverage, will save you and your team countless hours of future bug fixing. This is a very good thing. In order to build up those test suites though, you have to create code that is testable in the first place. Just to be clear, let’s bring that OrderModel class back into view.

    class OrderModel
    {
        private $db;

        public function __construct()
        {
            $this->db = Database::getInstance();
        }

        // ...
    }
As it stands, this poor model class is impossible to unit test. Why? Firstly, we have a hard coded dependency on the Database class, right there in the code, making it virtually impossible to test this class without having the Database class also present, due to the call on the Database class’ static getInstance() method.

I say virtually impossible because there’s a rather clever piece of kit called Patchwork which allows you to rewrite class definitions on the fly, and that includes statically defined class methods too. You should consider it as an escape plan for testing legacy code though, rather than a get-out clause for writing untestable code now and in the future! https://github.com/antecedent/patchwork
Secondly, that hard coded dependency is written into the constructor, which in turn means that you can’t test the method directly, only the after-effects that become evident after it’s called (i.e. a new object instance is created).

Clearly then, we need a different approach to development since “going live” with an untested and untestable OrderModel doesn’t bear thinking about. Which is what this part of the book has been largely about; promoting code that exhibits high structural quality by promoting loose coupling as a means of achieving testability.

Tell, Don’t Ask wants us to encapsulate business logic in the right places; places where that logic is readily testable.

The Law of Demeter wants us to decouple our code by only talking (collaborating) with our closest friends. Friends that can be readily replaced with mocks and stubs in a testing environment.

Instantiaphobia wants us to concentrate the process of object creation into very distinct locations, either dedicated factories or a dependency injection container. Why? Because newing up objects willy-nilly automatically entails tight coupling to the classes so referenced.

For a more thorough treatment of writing highly testable code, I have one more book recommendation for you and as book recommendations go, this one’s a strong one. In The Grumpy Programmer’s Guide to Building Testable PHP Applications ²⁰, Chris Hartjes delivers some excellent advice on how to focus on writing testable code from the outset. To quote the fellow directly:

The best applications are ones that consist of small, loosely-coupled modules that can be combined to solve problems. - Chris Hartjes
This has been a key focus of Part Three ; raising the structural code quality of the applications that we write by promoting the concept of loose coupling. In doing so, we are able to provide a broad base of unit test coverage and thereby go a long way towards guaranteeing that quality. Which brings us back to our poor, old OrderModel, which, if we’re honest, is a likely candidate for needing 100% test coverage. Before we even get onto the matter of testing its methods, we need to fix that constructor.
²⁰https://leanpub.com/grumpy-testing
    class OrderModel
    {
        private $db;

        public function __construct(Database $db)
        {
            $this->db = $db;
        }

        // ...
    }
There. Done. The OrderModel has been fixed in precisely the same way that we saw way back in Part One where our Car class was injected with an EngineAbstract instance. Now our test suites can inject a fake database instance when it comes to testing this class, thereby avoiding having to drag an actual, live database connection around with us. What we haven’t covered though is precisely how our OrderModel is going to be given that database connection in the actual application itself. That sort of thing requires the application of “backwards thinking”.
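Before we get there, here is a rough sketch of what injecting a fake database into a test might look like. It leans on a few assumptions that are not in the code above: the constructor in the text type-hints the concrete Database class, whereas this sketch introduces a hypothetical DatabaseConnection interface so that a stand-in can be swapped in easily; the fetchAll() method, the findTotalsForUser() method and the SQL are likewise invented for illustration. Plain PHP is used instead of a test framework to keep the example self-contained (PHP 7.4 or later assumed).

```php
<?php

// Hypothetical abstraction; the book's constructor type-hints Database itself.
interface DatabaseConnection
{
    public function fetchAll(string $sql, array $params = []): array;
}

class OrderModel
{
    private DatabaseConnection $db;

    public function __construct(DatabaseConnection $db)
    {
        $this->db = $db;
    }

    // Hypothetical query method, for illustration only.
    public function findTotalsForUser(int $userId): array
    {
        $rows = $this->db->fetchAll(
            'SELECT total FROM orders WHERE user_id = :id',
            ['id' => $userId]
        );

        return array_map('intval', array_column($rows, 'total'));
    }
}

// Test double: records each call and returns canned rows, no live connection.
class FakeDatabase implements DatabaseConnection
{
    public array $calls = [];
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function fetchAll(string $sql, array $params = []): array
    {
        $this->calls[] = [$sql, $params];

        return $this->rows;
    }
}

// Exercise the model against the fake.
$fake = new FakeDatabase([['total' => '1200'], ['total' => '350']]);
$model = new OrderModel($fake);
$totals = $model->findTotalsForUser(42);
```

With the fake in place, a test can check both what the model returned and what it asked the database for, all without a database server in sight. In a real project you would more likely reach for a mocking tool than hand-write the fake, but the principle is the same.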
Backwards thinking It’s not uncommon, when faced with a new story to code up, for a developer to approach the task from the start. Beginning at the beginning is generally sound advice. But for a web application developer, it’s often more fruitful to flip our coding approach on its head and start thinking things through from the end first. What do I mean by that? For web development, we’re used to seeing the request-response cycle occurring in a certain order; the request comes in and triggers the initialisation stages of our application or framework. Once initialisation is complete, the next stage is commonly handled by some sort of routing process which will identify the controller or service to invoke. The controller or service will then need to talk to a model, which then needs to talk to the database before passing the result back, which in turn gets delivered to a view of some description in order to populate the response to be sent back to the browser.
It’s only natural to think of coding up a story based on the order in which things need to happen inside our application, from initial request through to final response. After all, this is quite often the approach taken in the online tutorials scattered around the web. If you Google for “How to make a blogging site in (insert framework of choice here)”, the chances are exceedingly high that the walkthrough of the process will have you setting up the router, then the controller, then the model and lastly the view template.

However, this is the approach that most often leads to the appearance of poltergeists and couriers, both of whom are inclined to drag along the spectre of tight coupling with them. If instead, we take a backwards thinking approach to the problem, we’re more likely to achieve code that is cleaner, more loosely coupled and therefore more resilient to bugs. Let’s walk through the process of creating a blog post backwards just to illustrate this point.

First, we need a view that displays a confirmation message and the content of our newly minted blog post. Great, that means we know what data the view requires. Let’s code up the view.

Knowing what data our view requires means our model is going to need to return a BlogPost instance, oh and that’s going to need to be persisted to the database. Great, that means we need to set up our BlogPost model with the required database connection. Let’s code up the model, injecting the database connection.

Then, hmm, something’s going to need to set that data on the model based on what the user submitted so let’s code up a BlogPostService object that gets given the model object that it collaborates with. Ok, done.

How do we get the service instance? Quite likely this will be the way that we normally do at the boundary between our application specific code and the code of whatever framework that we choose to build upon; through the routing component.
Building backwards like this allows us to get our dependency injection done right the first time around. It also makes it easier to avoid courier-like situations where we may be tempted to pass things through controllers or services in order to get them to the model or persistence layers.
By all means, write the process down in start-to-finish order in the customer story but when it comes to coding it all up, try doing it backwards. And if you’re so inclined, write the unit tests first.
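The backwards walk-through of the blog post story might be sketched like this, written in the same order that we thought it through: view first, wiring last. Only the BlogPost and BlogPostService names come from the text. The renderConfirmation() function, the BlogPostModel class, the PostStorage interface (standing in for the database connection) and the InMemoryStorage class are all assumptions made so that the example can run on its own (PHP 7.4 or later assumed).

```php
<?php

// Step 1: the view. Writing it first tells us what data we need.
function renderConfirmation(BlogPost $post): string
{
    return 'Saved: ' . htmlspecialchars($post->title) . "\n"
        . htmlspecialchars($post->body);
}

// Step 2: the data the view requires, and a model that can produce it.
final class BlogPost
{
    public string $title;
    public string $body;

    public function __construct(string $title, string $body)
    {
        $this->title = $title;
        $this->body = $body;
    }
}

// Hypothetical stand-in for the injected database connection.
interface PostStorage
{
    public function insert(array $row): void;
}

class BlogPostModel
{
    private PostStorage $storage;

    public function __construct(PostStorage $storage)
    {
        $this->storage = $storage;
    }

    public function create(string $title, string $body): BlogPost
    {
        $this->storage->insert(['title' => $title, 'body' => $body]);

        return new BlogPost($title, $body);
    }
}

// Step 3: the service, which is given the model it collaborates with.
class BlogPostService
{
    private BlogPostModel $model;

    public function __construct(BlogPostModel $model)
    {
        $this->model = $model;
    }

    public function publish(array $input): BlogPost
    {
        return $this->model->create(
            trim($input['title'] ?? ''),
            trim($input['body'] ?? '')
        );
    }
}

// Step 4: the wiring, done last, at the boundary with the framework.
class InMemoryStorage implements PostStorage
{
    public array $rows = [];

    public function insert(array $row): void
    {
        $this->rows[] = $row;
    }
}

$storage = new InMemoryStorage();
$service = new BlogPostService(new BlogPostModel($storage));
$html = renderConfirmation($service->publish(['title' => ' Hello ', 'body' => 'First post']));
```

Notice that nothing in steps one to three had to pass a dependency along for somebody else's benefit. Each object receives exactly the collaborator that it uses itself, and the only place that knows about all of them is the wiring at the very end.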
Is your code “iffy”?

I touched on this earlier but we’re going to call it out into its own sub-section now. Way back in Tell, Don’t Ask, we hit upon the point of not pulling elements of an object’s state out of the object itself in order to make logical decisions. We need to take a moment to expand on that idea.

The keyword IF is an exceedingly dangerous one. You might not believe me, of course. After all, it’s one of the first keywords that we all encounter when learning a programming language. That doesn’t make the initial statement any less true though. In fact, the danger is present when we employ any kind of conditional logic in our code. Statements such as if (...), or while (...) and even foreach (...) all add complexity to the process flow that our application code executes. Anything that adds complexity to our application also provides a fertile breeding ground for bugs. And what do bugs mean? Critical, devastating blows to the preservation of our pub time! Consider this as a slightly extreme example.

    $uri = $_SERVER['REQUEST_URI'];

    if (preg_match('#^/$#', $uri, $matches)) {
        // load the home page
    } else if (preg_match('#^/blog/([0-9]+)$#', $uri, $matches)) {
        // load the blog with post id $matches[1]
    } else if (preg_match('#^/contact$#', $uri, $matches)) {
        // load the contact page
    }
Is this maintainable? No, it’s rather a long way off from being maintainable in the slightest. Yet it illustrates the point. When you employ conditional logic in order
to determine which code path to execute next, you’re also setting yourself up for bug-ridden scenarios down the line. As you proceed through the routing process in a manner such as this, you’ll end up with a preponderance of IF statements too unwieldy to manage. Something will go wrong at some point in the future, even if it appears to work just fine right now. The modern way to tackle this particular problem is to build your own router, or much better yet, use a routing package built by someone else and used by thousands of other developers. Two of the more popular routing packages available are Nikita Popov’s FastRoute ²¹ and The League of Extraordinary Packages’ route²², both of which will allow you to set up routing in a highly maintainable fashion. Avoiding complexity is key to achieving maintainability and improving the testability of your code. Save the IF keyword for validation wherever possible and employ patterns like Strategy , Command , Decorator or even Chain of Responsibility to provide branching logic within your codebase. ²¹https://github.com/nikic/FastRoute ²²https://github.com/thephpleague/route
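As a rough illustration of the idea, the chain of IF statements can be replaced with a table of patterns and handlers. This is a hand-rolled sketch and not the API of FastRoute or League\Route; the $routes array, the dispatch() function and the strings that the handlers return are all hypothetical.

```php
<?php
// Hypothetical route table: each pattern maps to a handler (a very small Strategy).
$routes = [
    '#^/$#'              => function (array $m) { return 'home page'; },
    '#^/blog/([0-9]+)$#' => function (array $m) { return 'blog post ' . $m[1]; },
    '#^/contact$#'       => function (array $m) { return 'contact page'; },
];

// One loop and one condition replace the whole if / else if chain.
function dispatch(array $routes, string $uri): string
{
    foreach ($routes as $pattern => $handler) {
        if (preg_match($pattern, $uri, $matches)) {
            return $handler($matches);
        }
    }

    return 'not found';
}

echo dispatch($routes, '/blog/42'), "\n";
```

Adding a page now means adding a row to the table rather than another branch to the logic, which is broadly the maintainability benefit that the dedicated routing packages offer in a far more complete form.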
Brain Check
Welcome to the very last chapter of Part Three. This is the part where we test ourselves on the material that we’ve covered in this part of the book by asking ourselves, “Could I give a lightning talk on each one of these?” It’s a checklist and a guide for highlighting areas of further reading.
Poltergeists
The Poltergeist is an anti-pattern used to describe a short-lived object that bears no state of its own and is used to make things happen to other objects or other parts of the system. An object whose sole purpose is to configure another object is a prime example of a poltergeist.
Facade Pattern
Facades provide a simplified interface to a more complex subsystem. A distinguishing feature that separates facades from poltergeists is that a facade should be provided with access to the components that it is intended to interact with, whereas a poltergeist reaches out and helps itself.
Favouring interfaces over implementations
A mechanism for achieving loose-coupling by working with the notion of what a collaborator does, rather than with what a collaborator is. In this way, a consumer may be provided with a collaborator that supports a particular interface and therefore a guarantee of what the collaborator is capable of without the consumer ever knowing the actual concrete type of the collaborator. This allows our code to switch different implementations in and out without breaking the application.
Favouring composition over inheritance
Another mechanism for achieving loose-coupling by allowing code reuse through encapsulating the reusable code into its own discrete object. Whereas inheritance hardwires a particular dependency to the parent class, composition allows a dependency to be injected (and therefore switched at run time).
The State Pattern
Is a particular implementation of composition that allows an object’s state (and therefore behaviour) to be modified at run time by allowing objects representative of the desired behaviour to be injected into an object whose behaviour needs to be changed.
The Decorator Pattern
Is another way to achieve code reuse through composition, allowing the target object to be wrapped in such a way that one or more methods of the target object may be intercepted by the decorator to provide a different but desirable outcome. The decorator should present the same interface as the decorated object so that interactions with collaborators conclude as expected.
The Strategy Pattern
Akin to the State pattern, the Strategy pattern allows an object to receive custom logic or algorithms based on environmental conditions. Where the State pattern represents changeable, but internally focussed behaviour, the Strategy pattern allows for changeable external behaviour when dealing with collaborators.
Tell, Don’t Ask
An object should not be queried for elements of its internal state in order to make logical decisions based on the values found there. Instead, an object should be told to perform a particular operation itself, providing any necessary parameters in order to achieve this goal. Extracting elements of state from an object in order to inspect and potentially make decisions based on them leads to a form of tight-coupling as the code doing the inspecting will rely on the internal structure of another object; knowledge that it should not have.
Instantiaphobia
By applying a rule of “one new per class”, developers know that there will be only one other location to inspect when coding changes affect the way that an object instance is created.
This principle also supports loose-coupling by eliminating the hard coded dependencies created through the use of the new keyword at unspecified locations in the codebase.
The Courier anti-pattern
The Courier anti-pattern crops up whenever an object receives a dependency that it doesn’t use itself. Instead, that dependency was passed to it in order to subsequently pass it on to another of the object’s collaborators. This presents a form of tight coupling in that the intermediary (or courier) has implied knowledge of what the final recipient needs in order to function. If those needs change, then it follows that code changes will be required in the courier also, adding fragility to the codebase.
The Law of Demeter
This law that isn’t a law is perhaps the most effective principle when it comes to promoting loose-coupling. Also known as the principle of least knowledge, it becomes violated whenever code exhibits awareness of another object’s internal structure. Each object should only be aware of the public interfaces of the other objects that it directly collaborates with. In this way breakages may be avoided when the internal implementation of other objects is changed but the public interface remains the same.
The Testing Pyramid
An illustrative concept that encourages us to have a broad base of unit testing rising to a narrow pinnacle of functional and manual testing, with layers of other types of testing in between; e.g., integration testing. Anything with a broad base is generally more stable than its narrow based brethren. However, many software projects take the inverse approach and focus on testing the user interface first, leading to something known as the testing ice-cream cone. Ice-cream cones are not generally stable, even when placed on a level surface.
Moving on to SOLIDs “I know you gentlemen have been through a lot, but when you find the time, I’d rather not spend the rest of this winter TIED TO THIS FUCKING COUCH! ” - Garry
Introducing the SOLID principles Placeholder chapter - content coming soon!
Placeholders Is this the end? Not at all, but some readers have suggested that I include placeholder chapters (essentially empty chapters) in order to provide some way of illustrating the content that is still to come. Experimentally then, I’ve done just that. The chapters that follow are empty placeholders that will progressively be filled in with content as I work through each release. Is this a good idea? Do you find it preferable to see what more there is to come? Or do the empty chapters get annoying if you keep flicking from the table of contents only to find an empty page? Please do let me know your thoughts: Thunder [email protected]²³ In the meantime, you can jump straight to the content at the end of the book by clicking / tapping here ²³mailto:[email protected]
The Single Responsibility Principle This is an empty placeholder chapter. Content coming soon!
The Open / Closed Principle This is an empty placeholder chapter. Content coming soon!
The Liskov Substitution Principle This is an empty placeholder chapter. Content coming soon!
The Interface Segregation Principle This is an empty placeholder chapter. Content coming soon!
The Dependency Inversion Principle This is an empty placeholder chapter. Content coming soon!
Talking Points This is an empty placeholder chapter. Content coming soon!
Brain Check Welcome to the very last chapter of Part Four . Being able to deliver a lightning talk on each of the SOLID principles is especially challenging, but not beyond the realms of possibility for sure.
Applying Software Architectures “I was wonderin’ when El Capitan was gonna get a chance to use his popgun. ” Palmer
Introducing Architectures This is an empty placeholder chapter. Content coming soon!
To MVC or not MVC This is an empty placeholder chapter. Content coming soon!
Service Oriented Architecture This is an empty placeholder chapter. Content coming soon!
API Oriented Architecture This is an empty placeholder chapter. Content coming soon!
The Architectural Fortress This is an empty placeholder chapter. Content coming soon!
Talking Points This is an empty placeholder chapter. Content coming soon!
Brain Check This is an empty placeholder chapter. Content will be coming soon.
DON’T PANIC! We’re not finished yet. Even though we’re about to move into “appendices” territory, there is still plenty of the book’s main content still to drop over the coming weeks. I’ve established the beginning of the appendix on PHP7 due to its highly topical nature. At this stage we now have an agreed upon list of features that are expected to make it into the final release in October 2015. Not only that, with the nightly builds already available, we have the opportunity to try out and familiarise ourselves with some of the new features before the language changes impact upon our daily coding.
Appendices
PHP7 It should be borne in mind that, at the time of writing, PHP7 has only just gone GA, meaning that we are likely to see a flurry of bug reports and bug fix releases in the very near future. That being said, most of the content included here is based on the RFCs meshed with my own experimentations of using the PHP7 nightly build. This appendix is likely to be in a state of great flux. With a new major version landing on our desktops, we may have a little time to wait before we see some sort of stability.
With due appreciation of that warning up above, let’s now start to take a look at how some of the changes coming in PHP7 will impact upon the material that we’ve already covered in the first three parts of the book.
Scalar Type Hinting This is perhaps one of the most impactful changes to the language that PHP7 brings with it, something that is readily apparent simply by observing the amount of heated debate that the topic has generated whilst the RFCs were being proposed, debated, rejected, redrafted and resubmitted. Why such a heated debate? The polarisation stems from the consideration of whether PHP should be a weakly typed language or not, and if not how is the stronger typing support to be implemented in a way that causes the least disruption to the PHP developer community.
In the blue corner
The proponents for retaining weak typing argue that PHP should continue to honour its historical roots and behave like it always has in the past. This means that it should continue to support such well established expectations as being able to send a string that looks like a number to a function that requires an integer or a float. Like this:
    class Calculator
    {
        function addThem($a, $b)
        {
            return $a + $b;
        }
    }

    $calc = new Calculator();
    $first = "12";  // A string
    $second = 4.5;  // A float

    echo $calc->addThem($first, $second);
    // Output: 16.5
Even though the first parameter provided is a string variable, internally PHP can handle the type conversion to a float and perform the addition quite readily, leading to the result that we would expect. Retaining the weakly typed nature of PHP allows the language to follow the principle of least astonishment: It’s always worked like this, there’s no reason why we need to surprise everybody by breaking their applications when they pass number-like strings to functions requiring strictly numeric values. This argument is particularly relevant in the context of the application that employs Composer as a dependency manager. What would happen if one of the packages that we’re importing into our application started declaring method parameter types that we are unprepared for? Bearing in mind that the values in the $_GET and $_POST arrays are always strings, one vendor releasing a strongly typed update to their package could mean that we end up with a lot of refactoring to do just to get our application working again. How so? Imagine that the calculator class from the first example was in a package that I provided and that your application depended upon. If I elected to go for a strongly typed implementation, it might end up looking like this.
    class Calculator
    {
        function addThem(float $a, float $b)
        {
            return $a + $b;
        }
    }
That’s only a very small, but very significant change. The signature of the addThem() method is now declaring that it needs to be given two floats as the type hinting suggests. Despite the fact that the actual code change is such a small one, the impact is enormous. To use the new version of this package would now mean that we have to search through our entire codebase to find every location where this method is called, making the relevant changes as we go along to ensure that that method is only ever given two float values.
    // Where previously we used
    $calc->addThem($first, $second);

    // We would now need something like this
    $calc->addThem((float) $first, (float) $second);
Naturally, that places quite a significant burden squarely upon our shoulders. If the package provider was also bundling a major security fix with that update, we would have no choice but to follow the refactoring route - we either meet the new demands expressed through the type hinting in order to take advantage of the security fix or we switch to an alternative package that provides much the same functionality. Given that it’s already perfectly possible for us to write high quality applications, do we really need to upset the apple cart so much by introducing such a significant change to the language? The tools are already there to accommodate the foibles of userland implementations by casting the inputs to the specific types that we expressly want to work with. If we really want to get pernickety about the variable types coming in, we can validate for those types and throw up errors and exceptions if client code is getting it wrong.
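To illustrate the blue corner’s point, here is a minimal sketch of the userland approach: validate the inputs, complain loudly if they’re wrong and cast explicitly if they’re right. The choice of is_numeric() as the test and the wording of the exception message are my own assumptions for the purposes of the example.

```php
<?php
class Calculator
{
    public function addThem($a, $b)
    {
        // Validate first: anything that is not number-like is rejected loudly.
        foreach ([$a, $b] as $value) {
            if (!is_numeric($value)) {
                throw new InvalidArgumentException('addThem() expects numeric values');
            }
        }

        // Then cast explicitly so that the intent is visible in the code.
        return (float) $a + (float) $b;
    }
}

$calc = new Calculator();
echo $calc->addThem("12", 4.5), "\n"; // 16.5

try {
    $calc->addThem("twelve", 4.5);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```

It works, but every method that cares about its inputs has to repeat this ceremony by hand.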
Clearly then, the opponents have a very strong case. How do the arguments stack up on the other side of the ring?
In the red corner The benefits for being able to type hint for scalar values are immediately obvious, but before we even get on to considering those there’s a particularly relevant point to make first. We can type hint for everything else. If you loop your mind back to the beginning chapters of the book, you’ll remember the effort we went through to consider abstract classes as a means of laying down definitions for new data types. Why did we do this? We did this in recognition of the fact that the language’s native data types don’t fully represent our needs. They’re not intended to of course. Through the provision of our own custom data types we are making it possible for our application’s code, and more importantly the developers who write it, to enjoy the knowledge of how these things are intended to work. A key part of this is that the abstract class documents as best as it can the intended behaviour of the instances derived from this data type. Part of the abstract’s documentation includes informing the client code, the collaborators, how many parameters each method expects and of what type they should be. Unless those parameters are intended to be scalar values, that is. Further along, we re-examined the role played by interfaces and how they allowed us as developers to express guaranteed method availability in the objects that implement them. In much the same way as with abstract classes, we are able to express the requirements of the methods in those interfaces quite succinctly. Again, with the exception of parameters expected to be scalar values. The provision of scalar type hinting completes the circle. This alone provides enormous value to us as developers keen to exhibit PHP Brilliance. It’s another weapon in our armoury in the fight against the army of bugs that threaten constantly to creep into our applications and eat away at our precious pub time. Going forward then, scalar type hinting would make it possible for us to “type hint all the things”. 
As far as the input parameters are concerned, every method we come to write would be able to declare for everything it needs and expects, rather than just for most of them.
In this regard, scalar type hinting would eliminate one more element of fragility in our code. Reducing fragility naturally means increasing the robustness and security of the resulting applications. If we can type hint for strings, integers, floats and booleans like we can for everything else, we will no longer have to expressly validate for them. Nor will we need to recast input parameters to the types that we’re expecting to work with, a sub-optimal approach that can lead to the subtle and silent corruption of our data.
    class Calculator
    {
        public function addThem($a, $b)
        {
            $a = (int) $a;
            $b = (int) $b;

            return $a + $b;
        }
    }

    $calc = new Calculator();

    echo $calc->addThem("3.14", 2);
    // Output: 5 instead of the expected 5.14
We made that bug. Well, okay, I’m quite happy to accept all of the blame for the above. As the author of this particular version of the Calculator class, I was only expecting to work with integer values, which is why I cast the inputs to integers before adding them. Nevertheless, the client code beneath the class definition is expecting the calculator instance to just, you know, add the numbers together. It’s a perfectly reasonable assumption since that is what is implied in the method signature. Of course, the code in my class would have been better made if it didn’t silently mangle the inputs that the addThem() method was taking in by casting them to integers. Silent input mangling happens all over the place in PHP world and leads to ever-so-subtle bugs cropping up simply because our code is trying to coerce unexpected values into the types that the internal logic needs to work with. Scalar type hinting allows us to avoid this kind of mangling, leading to much more expressive method signatures and much more explicit instances of where casting from one type to another is required.
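As a small side-by-side sketch of that difference, compare a method that casts its inputs by hand with one that states its needs in the signature. The class names CastingCalculator and HintedCalculator are hypothetical, and the values have been chosen so that the floating point arithmetic comes out exactly.

```php
<?php
class CastingCalculator
{
    public function addThem($a, $b)
    {
        return (int) $a + (int) $b; // silently drops any fractional part
    }
}

class HintedCalculator
{
    public function addThem(float $a, float $b): float
    {
        return $a + $b; // the signature states what the method works with
    }
}

var_dump((new CastingCalculator())->addThem("2.5", 2)); // int(4)
var_dump((new HintedCalculator())->addThem("2.5", 2));  // float(4.5)
```

The second version doesn’t need to mangle anything; the signature tells both PHP and the reader that floats are what this method deals in.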
It’s a knockout!
After rather a lot of to-ing and fro-ing on the RFC front, those smashing ladies and gentlemen on the PHP Internals team finally arrived at a version of the document²⁴, which went on to find favour with an overwhelming majority. This means that we can start to consider how we might take advantage of this quite significant change. Mercifully, distilling that rather voluble document down into actual implementation details leaves us with something terribly easy to assimilate into the box of knowledge sat atop of our shoulders. Once a technique is in our heads it becomes correspondingly easy to implement in the code that we produce. There’s a caveat though. There’s always a caveat. First of all comes a quick run down of the scalar types and the keywords that we use for hinting them.
    Scalar data type    Type hint “keyword”
    Strings             string
    Integers            int
    Floats              float
    Booleans            bool
It’s worth noting at this stage that the RFC specifically excludes the provision of aliases, so for example, in the case of integer values, we have to use int as the type hint and not integer. Equally, we’ll need to adhere to bool as a type hint rather than boolean. A little sample code should make their usage abundantly clear, so let’s take a look at that.
²⁴https://wiki.php.net/rfc/scalar_type_hints_v5
    class ScalarTypeHints
    {
        public function setInt(int $int) {}
        public function setFloat(float $float) {}
        public function setString(string $string) {}
        public function setBool(bool $bool) {}
    }
There isn’t really a lot to add there in terms of the implementation details, but there are a few curios to consider in terms of their operation still. Simply writing the scalar type hints into the method signature isn’t sufficient to guarantee that the method will only accept the data types that are being hinted for. If you’re presently scratching your head over that particular statement, you wouldn’t be alone in doing so. Clearly, we need to look into this then.
    class Calculator
    {
        public function addThem(float $a, int $b)
        {
            return $a + $b;
        }
    }
Notice here that I’ve subtly changed the second parameter to type hinting for an integer instead of a float. You might reasonably expect that this would mean PHP will error out if we tried to give it anything other than an integer for parameter number two. Even though this is the case with any other data type, it’s sadly not true for the scalar type hinting implementation as it stands today. What this means in real terms is this: When you type hint for a particular class or interface, failing to provide a parameter that supports that class or interface definition results in a fatal error. This is exactly how it should be of course.
With scalar type hinting it’s different. Since PHP is historically a weakly typed language, it’s always supported the idea that the underlying type of a particular variable can be changed to suit the circumstances described in code. Like this:
    $myVar = "42"; // $myVar defined as a string
    var_dump($myVar * 2);
    // $myVar type internally changed to integer to perform multiplication
    // Output: int(84)
As a result of this, it looks like PHP will continue to support the internal casting of a variable’s type from one to another. So when we look again at the addThem() method, what will actually happen is that PHP will attempt to cast the first parameter to a float if it isn’t already one. Likewise, the second parameter will be cast to an integer, again if it isn’t already an integer. Where these casts are possible, it does at least mean that the logic inside the method will get the data types that are being type hinted for. On the other hand though, it allows for a very subtle form of data corruption, which we should be particularly interested in avoiding.
    $calc = new Calculator();
    var_dump($calc->addThem(3.5, "2.75"));
    // Output: float(5.5)
Quite reasonably, we would have expected the result to be 6.25. That isn’t what we’ve gotten back though, we’ve been given the answer of 5.5 instead. It would appear that our calculator class is broken. The reason why comes from the fact that our second parameter, “2.75”, was silently cast to an integer on the way in. As a result, our method’s internal logic was dealing with an integer value of 2, rather than the intended value of 2.75. If the type cast is possible, PHP won’t complain. Even if it means that converting floats and float-like strings into integers results in data corruption occurring from our perspective as the developers and bug-fixers of the code.
This represents a significant challenge. If we type hint for a parameter that supports the ArrayAccess interface, PHP will complain when we fail to give it one. If we type hint for a scalar value, PHP7 will only complain if it can’t perform the type cast internally. Given that we’re perfectly capable of explicitly casting from one scalar type to another if we truly needed to, i.e. $int = (int) "31.25";, I think it’s preferable to have PHP complain when we have a method expecting one type but getting a different type passed in. Can this be achieved? The answer is yes, but the solution is rather a clunky one in my opinion. We can tell PHP7 to turn on strict type checking by adding a statement to the top of every file that needs it. The format of that statement is this.
declare(strict_types=1);
It should be noted at this stage that that declare() statement must be the first directive in each file that requires strict type checking. When that declaration is in place at the top of the file, then our scalar type hinting will work as we expect it to. That is to say, PHP will raise an error if we tried to pass a float as a parameter that was hinted as being an integer. Equally, we’ll also get an error when we try to pass a numeric string when the hint requires a float or an integer. Just to confuse matters, when strict type checking is turned on, you can still present an integer to a method expecting a float. This is a “feature” since integers can be readily cast to floats without any loss or corruption. Integer 42 quite simply becomes float 42.0.
This is what is needed then. Rather than the type casting happening silently in the background, if what we have is a string and it needs to be either of the numeric types, we will be required to explicitly perform the cast in our code instead. This significantly improves the visibility of any errors that we make when performing the type casts manually and as such, will give us an easier time of it when identifying and fixing the bugs as a result.
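Here is a minimal sketch of the behaviour just described, using the same Calculator as before with the declaration in place. The messages that are echoed are my own; the errors are raised as TypeError, which can be caught like any other throwable.

```php
<?php
declare(strict_types=1);

class Calculator
{
    public function addThem(float $a, int $b): float
    {
        return $a + $b;
    }
}

$calc = new Calculator();

// An integer is accepted where a float is hinted: 42 simply becomes 42.0.
var_dump($calc->addThem(42, 2)); // float(44)

// A float-like string is now rejected for the int parameter.
$rejected = false;
try {
    $calc->addThem(3.5, "2.75");
} catch (TypeError $e) {
    $rejected = true;
    echo "Rejected: the second parameter must be an integer\n";
}

// With an explicit cast, the truncation is at least visible in the calling code.
var_dump($calc->addThem(3.5, (int) "2.75")); // float(5.5)
```

The answer of 5.5 hasn’t gone away, but it now only appears where somebody has written the cast that causes it.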
One thing more must be noted: the declaration only governs the function and method calls that are written in the same file. Any code that is pulled in via the require or include directives won’t be covered by the strict types declaration unless those files also have the same declaration at the top too. Traits are little beasties that are worthy of further consideration here. You might be tempted to think that adding the declaration at the top of a trait’s file should be sufficient to turn on strict type checking for the contents of that trait. What do I mean by this?
    declare(strict_types=1);

    trait AddingTrait
    {
        public function addThem(float $a, int $b)
        {
            return $a + $b;
        }
    }
Despite having told you that the declaration needs to be at the very top for any code that requires strict type checking to be turned on, it doesn’t work for traits. Whether this is intentional or it has been overlooked, it’s certainly something to be wary of. We might be tempted to think that the following code would now work.
    class Calculator
    {
        use AddingTrait;
    }
Unfortunately, that’s not the case. In short, a trait cannot turn on strict type checking for its own methods, and neither can any other file that merely declares them. Strict checking of parameters is governed by the file in which the method call is written. With our example, this means that the file that calls addThem() on the Calculator, which here is the same file that contains the Calculator class, is the one that must make the strict types declaration. Whether the trait does it or not is irrelevant.
    declare(strict_types=1); // calls made from this file are now strictly checked

    class Calculator
    {
        use AddingTrait;
    }
So, in order to use strict type checking for scalar type hints, you have to remember to add the declare(strict_types=1); as the first line of PHP in every file that needs it. Should you miss one, PHP will try to silently cast mismatched parameters to the types being hinted for. As I mentioned before, this feels rather clunky to me. I would much rather have seen the strict type checking being the de facto standard, with the option to turn on implicit casting instead. Even if this goes against the grain of PHP’s historical background, the resulting improvement in code quality would be worth it. From the perspective of achieving PHP Brilliance, we need to remember to turn on strict type checking in every file that will have code type hinting for scalars. Any scalar values that need to be cast from one type to another should be done explicitly in the code so that our intent remains clear both to ourselves and anyone else reading it. Even with the additional burden of remembering that declaration at the top of the file and then having to perform any type casts manually, the provision of scalar type hinting is a major step forward for the language.
Return Type Hinting I left half of the tale untold in the previous section on scalar type hinting since I wanted to bring that forward into this section. Return type hinting is exactly what you might imagine it to be: the ability to declare for the data type being returned from a method or function call. You might assume that I’m about to go on again about interfaces, contracts and guarantees, but actually I’m not going to do that. We’ll get there shortly, but first of all, let’s take a look at the proposed implementation.
    declare(strict_types=1);

    class Calculator
    {
        public function addThem(float $a, float $b): float
        {
            return $a + $b;
        }
    }
Hopefully, you’ll recognise this class from the previous section of this appendix. If you don’t, where is your brain at? We were only just discussing this bit a few pages ago. I suppose you could be forgiven if you happen to be reading this in a coffee shop crewed by remarkably attractive baristas. Alright, focus. Here we are again with a very simple object that allows us to add two numbers together. As you can see, we’re declaring for the float parameter types of both $a and $b. In addition, we’re now declaring for the float return type by appending to the method signature a colon followed by the return type. As a result, the public interface of objects of this type has now become much more expressive. Not only can we see what types of parameters the addThem method requires, but we can also be confident that the method invocation will give us back a float value as the return type. Not a string, nor an array, but a float.
This is a great boon for us. Since we should be striving for code that is as bug-free as it possibly can be, this finally gives us the ability to express a fully documented interface in code. The whole gamut of input parameter types is now catered for with the addition of scalar type hinting, complemented by the ability to type hint the whole gamut of data types that a method will return. When our colleagues and co-developers read the code that we’ve written, the ability to readily discern the type of thing that will come back from a particular method invocation has enormous value. Should such a method fail to return the correct data type, the failure will be isolated and localised to the method itself, which in turn can lead to some very targeted bugfixing attention. The benefit of this becomes even more apparent when you consider
the situation that we frequently face in PHP code today. When a method returns a value of an unexpected type, the errors that crop up from this can appear pretty much anywhere within the code base.
What does this mean for our guarantees, our contracts, our abstract class interfaces? Quite a lot as it happens. The key factor in what we are about to consider is the fact that the initial implementation of return type hinting in PHP7 is of the invariant variety. This will have a direct impact on our use of return type hinting in child classes that derive from an abstract base class that employs return type hinting itself. We’ll consider this in more detail shortly but in brief it means that the methods in specialised child class implementations are unable to apply a return type hint that is different to the same method in the parent. That statement deserves a little code to help clarify the point. If we return once more to a much earlier example, that of the SportsCarFactory, we can see how invariant return type hinting impacts upon our coding approach.
    abstract class CarFactory
    {
        /**
         * @return Car
         */
        public function getCar(): Car
        {
            // ... make a car

            return new Car();
        }
    }
Here the abstract parent CarFactory class is hinting for the return type on the getCar() method. Specifically, it’s declaring that the return type is Car . Now, having created a specialised child class of type SportsCarFactory, we might reasonably think that since our actual return value will be an instance of a SportsCar, we should be type hinting as such in the method declaration.

    class SportsCarFactory extends CarFactory {

        public function getCar(): SportsCar {
            // ... make a sports car

            return new SportsCar();
        }
    }

No matter how much we might want to instruct our client code that calls to this method are going to return a SportsCar instance, to be able to do so would require PHP7 to support covariant return type hinting. Since PHP7 is getting the invariant flavour, the code above will not work and the interpreter will quite happily tell us why with a fatal error. So let’s honour the invariance and fix our code:

    class SportsCarFactory extends CarFactory {

        public function getCar(): Car {
            // ... make a sports car

            return new SportsCar();
        }
    }

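To see the corrected factory behave as described, here is a minimal, self-contained sketch. The empty Car and SportsCar classes are stand-ins invented purely so that the snippet runs; they are not the versions from the earlier chapters.

```php
<?php
declare(strict_types=1);

// Hypothetical placeholder classes, only here to make the sketch runnable.
class Car {}
class SportsCar extends Car {}

abstract class CarFactory
{
    public function getCar(): Car
    {
        return new Car();
    }
}

class SportsCarFactory extends CarFactory
{
    // Same declared return type as the parent; the value is still a SportsCar.
    public function getCar(): Car
    {
        return new SportsCar();
    }
}

$car = (new SportsCarFactory())->getCar();

var_dump($car instanceof Car);       // bool(true)
var_dump($car instanceof SportsCar); // bool(true)
```

The declared type stays Car, yet the object handed back is still a fully fledged SportsCar. Worth noting: later PHP releases (7.4 onwards) relaxed this rule to permit covariant return types, so the narrower declaration is only rejected on the versions discussed here.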
The problem is now solved by changing the return type to be exactly the same as declared in the parent class, i.e. Car. Having done so, we are now satisfying the requirements of invariant return type hinting.

We might also start to feel a little itchy about the declaration of this class’s interface, given that it declares for a Car but actually returns a SportsCar instance. This itchiness only comes about though when we consider the SportsCarFactory class in isolation. As long as we remember that it is actually the parent class that is laying down the rules, we can see that the invariant flavour works massively in our favour.

For the developers accepting the stance that an abstract class provides the blueprint for a new data type, the application of invariant return type hinting puts them at a distinct advantage. It does this by inadvertently but beneficially locking down the
characteristics of that data type. If every child class behaves at the interface level as the abstract parent demands, which is precisely what this type of hinting supports, then our application code will naturally remain type safe. The end result of achieving this for us is a lot less bug fixing in the future, at the expense of more careful coding now. Remember though that fewer bugs equals more pub time!

For a lot of the developers that engage in inheritance abuse, this feature will most likely prove problematic. As such, I can’t see a great deal of widespread adoption going on for this new language feature.

For those that do adopt return type hinting, there are a few ancillary considerations which need to be borne in mind. The declare(strict_types=1) declaration at the top of the file does not govern scalar input parameters and scalar return values in the same way. For input parameters, strict checking is decided by the file that makes the call. For return values, it is decided by the file in which the function is defined. Without the declaration, PHP falls back to its coercive mode in both cases and quietly converts compatible scalar values. This apparent inconsistency between scalar type hinting for input parameters and scalar type hinting for return values is worrisome.

In any case, I would propose that the developer performing any kind of type hinting for scalar values makes it an unswerving habit to always include the declare() statement at the top of the file. Why? Because it’s the safer assumption that a fellow developer will infer from the presence of scalar return type hinting that strict checking has already been “switched on”. Should we place the onus on our colleagues to verify that the declaration for strict types has indeed been included in the file, given that our return type hints will run quite happily without it? Or are we better to mitigate against any future unhappy accidents by providing the declaration anyway? The difference in behaviour between scalar type hinting for input parameters and scalar type hinting for return values is already confusing enough.

Moving on, we should look at the implications for concrete interface implementations.
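Before we leave the subject of strict types, here is a small sketch of what the declaration does to a scalar return value. The parsePort() function is a hypothetical example invented for this illustration; it declares an int return type but hands back the string it was given.

```php
<?php
declare(strict_types=1);

// Hypothetical function: declares an int return but returns a string.
function parsePort(string $raw): int
{
    return $raw;
}

$message = 'no error';

try {
    parsePort('8080');
} catch (TypeError $e) {
    // With strict_types=1 in this file, the mismatched return is rejected.
    $message = 'TypeError: ' . $e->getMessage();
}

echo $message, PHP_EOL;
```

With the declaration in place, the call fails inside parsePort() itself, which is exactly the localised failure we want. Remove the declare() line and the same string would instead be coerced to the integer 8080 without complaint.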
If the application of return type hinting on abstract classes provides us with the ability to extend the rules applied to child classes, the good news is that applying return type hinting inside interface definitions helps us to achieve the same sort of rigorous guarantees inside the otherwise unrelated objects that implement such an interface. A contrived example should illustrate this perfectly.

    declare(strict_types=1);

    interface ReturnTypesInterface {

        public function getFloat(): float;
        public function getString(): string;
        public function getArray(): array;
        public function getDatabaseAdapter(): PDO;
    }

As we can see quite clearly, applying return type hinting at the interface level gives us the power to extend the coverage of the guarantee that we provide quite significantly. Any objects that implement such an interface, even though they may be perfectly unrelated to each other in every other sense, would now be required to honour the return type declarations of such a guarantee. This is fantastic news for the maintenance of type safety within our application code. To illustrate this effect, let’s consider the following code:

    declare(strict_types=1);

    interface FooGuarantee {

        public function getFoo(): Foo;
    }

Here we have the interface declaration that requires implementers to provide a getFoo() method. Those implementers must also ensure that their version of this method includes the Foo return type exactly as is written here. That is, the implemented method must also specify “: Foo” in its method declaration. Since we are now required to provide the return type hint, our implementations of this method will now also be required to actually return a value that is either a specific instance of a Foo object or an instance of a child class that extends the Foo parent.

    class FooMaker implements FooGuarantee {

        public function getFoo(): Foo {

            return new Bar();
        }
    }

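To actually run this arrangement, Foo and Bar need definitions. What follows is a minimal sketch that assumes Bar extends Foo. The Baz class and the BrokenFooMaker are hypothetical additions, invented here only to show what happens when the returned object has no Foo in its ancestry.

```php
<?php
declare(strict_types=1);

// Hypothetical classes for the sketch: Bar extends Foo, Baz does not.
class Foo {}
class Bar extends Foo {}
class Baz {}

interface FooGuarantee
{
    public function getFoo(): Foo;
}

class FooMaker implements FooGuarantee
{
    public function getFoo(): Foo
    {
        return new Bar(); // fine: a Bar is a Foo
    }
}

class BrokenFooMaker implements FooGuarantee
{
    public function getFoo(): Foo
    {
        return new Baz(); // Baz is unrelated to Foo
    }
}

$ok = (new FooMaker())->getFoo() instanceof Foo;

$failed = false;
try {
    (new BrokenFooMaker())->getFoo();
} catch (TypeError $e) {
    // The return type check rejects the unrelated object.
    $failed = true;
}

var_dump($ok, $failed); // bool(true) bool(true)
```

Note that BrokenFooMaker declares its method perfectly correctly. The complaint only arrives at runtime, at the moment the offending value is returned.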
Our FooMaker class honours the terms of the FooGuarantee by providing an identically specified getFoo() method, including the return type hint. However, you might note that the actual method itself returns a Bar instance. This can only work when Bar is a child class of Foo. If Bar doesn’t include the required Foo class as one of the parents in its inheritance hierarchy, PHP7 will happily choke and die with a TypeError. Avoiding fatal errors, even catchable ones, is something that we wish to strive for.

Whilst we’re on the topic of interfaces, we need to consider whether an interface itself can be used in the return type declaration. In other words, are we able to guarantee that the value returned from a method call will itself honour its own guarantee? A quick peek at an example should clear this right up.

    interface Foo {

        public function getArray(): array;
    }

Here’s our basic starting point. An interface called Foo that guarantees an array will be returned from the method getArray().

    interface FooAble {

        public function doFoo(): Foo;
    }

And here we’ve specified that implementers of the FooAble interface must provide a doFoo() method. So far, this is as expected.
However, the doFoo() method includes the return type hint of “: Foo”. What this means in practice is that the concrete implementations of the doFoo() method are required to return objects that themselves implement the Foo interface. This in itself is hugely significant for PHP code.

Conclusion? Return type hinting represents a very positive step towards achieving very robust, type safe application code. Thumbs up.
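For the curious, the doFoo() arrangement can be sketched end to end as follows. The two interfaces are those shown above. The ListFoo and FooService classes are hypothetical implementers, invented here so that there is something concrete to call.

```php
<?php
declare(strict_types=1);

interface Foo
{
    public function getArray(): array;
}

interface FooAble
{
    public function doFoo(): Foo;
}

// Hypothetical implementer of Foo.
class ListFoo implements Foo
{
    public function getArray(): array
    {
        return [1, 2, 3];
    }
}

// Hypothetical implementer of FooAble; it must return something that is a Foo.
class FooService implements FooAble
{
    public function doFoo(): Foo
    {
        return new ListFoo();
    }
}

$foo = (new FooService())->doFoo();

var_dump($foo instanceof Foo); // bool(true)
print_r($foo->getArray());
```

Client code can now call getArray() on whatever doFoo() hands back, safe in the knowledge that both guarantees are being enforced.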
Up next.

In your next thrilling instalment…

This section of the book will change with each successive release and as such, it’s my touchstone with you the reader. I’ll keep this page up-to-date as a form of communicating what’s happened since the last release and what’s coming up in the next one.
Happy New Year!

I hope 2016 brings you happiness and success in all of your endeavours. This is the second content drop of 2016 and as you will have seen, we’ve reached the end of Part Three. There’s a little polishing to do, for sure, but the exciting news is that we’ll be making inroads on Part Four in the very near future. I’m very much looking forward to sharing my ideas on the SOLID principles with you. It’s my belief that a mastery of this canon, this collection of the five key principles, underpins what makes a developer truly shine. As such, they’re key to achieving PHP Brilliance and I can hardly wait to start sharing that content with you.

You might have noticed that I’ve finally also added The More Pub Time principle right at the start of the book. If you missed it, please do check it out now.

Your help and assistance in the realisation of this project is, as always, hugely appreciated. There are a number of ways that you can help. Primarily, feedback. Whilst I’ve had some very good suggestions from a few of you, for which I am grateful, I could use some more. How are you finding the content? Too wordy? Not wordy enough? Am I missing topics that you might have expected to be in here? If you can take a few moments (or more!) to drop me a line with your thoughts, impressions and most importantly, criticisms and complaints, please do so. The