Demystifying Serverless Computing
A Glimpse of Microservices with Kubernetes and Docker
ISSN 2456-4885 | Volume: 05 | Issue: 12 | Pages: 112 | September 2017 | ₹120
Splinter Simplifies Web App Testing
How to Secure and Test Web Applications
A Few Tips for Scaling Up Web Performance
Getting Started with PHP, the Popular Programming Language
Interview: Karanbir Singh, Project Leader, CentOS
Case Study: Open Source Enables PushEngage to Serve 20 Million Push Notifications Each Day!
CONTENTS

Admin
- A Primer on Software Defined Networking (SDN) and the OpenFlow Standard
- Taming the Cloud: Provisioning with Terraform
- Visualising the Response Time of a Web Server Using Wireshark
- DevOps Series: Creating a Virtual Machine for Erlang/OTP Using Ansible
- An Introduction to govcsim (a vCenter Server Simulator)
- Serverless Architectures: Demystifying Serverless Computing
- A Glimpse of Microservices with Kubernetes and Docker

Developers
- Selenium: A Cost-Effective Test Automation Tool for Web Applications
- Splinter: An Easy Way to Test Web Applications
- Crawling the Web with Scrapy
- Using the Spring Boot Admin UI for Spring Boot Applications
- Five Friendly Open Source Tools for Testing Web Applications
- Developing Research-Based Web Applications Using Red Hat OpenShift
- A Few Tips for Scaling Up Web Performance
- Regular Expressions in Programming Languages: The Story of C++

For U & Me
- Open Source Enables PushEngage to Serve 20 Million Push Notifications Each Day!
- Eight Top-of-the-Line Open Source Game Development Tools

Regular Features
- FOSSBytes
- New Products
- Tips & Tricks
FOSSBYTES
Compiled by: Jagmeet Singh

Google launches 'Made in India' programme to showcase local developers
Google has launched its own 'Made in India' initiative. The new development is designed to promote Indian developers, giving them a chance to feature their work on the Play Store in a special section. "At Google Play, we are committed to helping Indian developers of all levels seize this opportunity and build successful, locally relevant businesses," said Purnima Kochikar, director of business development for games and applications, Google Play. Google highlights that more than 70 per cent of Internet users in India enter the Web primarily using smartphones. This growth in smartphone usage has prompted the company to encourage domestic developers to build more apps and games. More content could pave the way for the Android maker to strengthen its presence in India and around the globe. Revealing some numbers, Google underlines that Indian users on Android install more than a billion apps every month from Google Play, and this number is growing by 150 per cent each year. The 'Made in India' initiative was launched as a part of the App Excellence Summit that was recently hosted in Bengaluru. Google also showcased success stories from developers including Dailyhunt, Healthifyme, RailYatri and UrbanClap, which are all natively building apps and services for the Android platform. Skill-building consultation sessions and demonstration booths were available at the venue for developers. Indian developers who want to participate in the 'Made in India' programme by Google need to fill in a self-nomination form. The apps need to be based on Google's 'Build for Billions' guidelines that were launched last year.
PiCluster v2.0 brings better container management for Docker deployments
Linux Toys has announced PiCluster 2.0. The new version of the open source container management tool is written in Node.js and is designed to deliver an upgraded experience through cleaner CSS and jQuery dialogue windows. PiCluster 2.0 brings automatic container failover to different hosts. It fixes reported errors in the npm build dependency, and utilises enhancements on the CSS front to deliver a fresh look to the tool's Web console. Additionally, users can deploy container management without Internet access by using the Web server to deliver the required libraries. On booting up PiCluster 2.0, you'll be welcomed with a new screen. The open source community has also contributed a lot of features to the latest PiCluster
version. One of the initial contributors worked on the fix for npm dependency errors and pm2 support. Another notable contribution improved the Web console by adding personalisation options. Previous versions of PiCluster displayed a server icon for specific operations; this new build shows the operating system's or distribution's logo for each server. There is also an automatic container failover that helps you to automatically migrate a container to another host after three failed attempts. Many developers have started contributing to the PiCluster project. You can access the PiCluster 2.0 code through its GitHub repository. It also includes a detailed readme to help you deploy the tool effectively.

Angular 5 is out, with a focus on progressive Web apps
Google has released the next major version of its JavaScript framework, Angular. This latest version, Angular 5, is the second major update in 2017. While the initial release is a beta build of Angular 5, the search giant is clearly aiming to introduce major support for Google-driven progressive Web apps with the latest development. The new version includes a build optimiser that helps to reduce the code of progressive apps designed through the framework. Google is working hard at simplifying the effort that goes into building progressive Web apps. The purpose of this new innovation is to improve the experience for users accessing services through their mobile devices. In addition to its progressive Web app focus, Google is integrating Material Design components into Angular 5. The design components in Angular 5 are now compatible with server-side rendering. Google is not the only company enhancing browser-based apps. Mozilla is also set to offer a native-like experience on its Firefox browser by bringing progressive Web apps to the front. The team behind PWAs (progressive Web apps) is working towards making these the technology that everyone can use. The release cycle of Angular has been quite aggressive. Google plans to release the next major version, slated as Angular 6, sometime in March or April next year. Meanwhile, the theme for Angular 5 is 'Easier, smaller, faster'.
ActiveRuby debuts with over 40 gems and frameworks
ActiveState, the open source languages company, has graduated its Ruby release to its first beta version. The commercially supported Ruby distribution is supposedly far better than other available options. Ruby is actively used by a diverse set of developers around the world. The language is preferred for its complete, simple, extensible and portable nature. ActiveRuby is based on Ruby v2.3.4 and includes over 40 popular gems and frameworks, including Rails and Sinatra. There is also seamless installation and management of Ruby on Windows, to reduce configuration time as well as increase developer and IT productivity. Enterprise developers can adopt the latest Ruby distribution release internally to host Web applications. The Canadian company claims that ActiveRuby is far more secure and scalable for enterprise needs. The beta release of the language has fixed some issues of gem management to enhance security. The new ActiveRuby version also includes non-GPL licensed gems. All major libraries for database connectors, such as MongoDB, Cassandra, Redis, PostgreSQL and MySQL, are also included. Additionally, the ActiveRuby beta introduces cloud deployment capabilities with Amazon Web Services (AWS), along with all the necessary integration features for AWS. "For enterprises looking to accelerate innovation without compromising on security, ActiveRuby gives developers the much-needed commercial-grade distribution," said Jeff Rouse, director of product management, ActiveState, in a statement. ActiveRuby is currently available only for Windows. The release for Mac and Linux is supposed to roll out later in 2017. You can download the beta through the official ActiveState website.
GNOME's disk utility to get large file support in v3.26
GNOME's disk utility will be updated to version 3.26 in September, and is expected to receive features such as disk resize and repair functions. The new version will also get large file support to handle giant files. The new disk utility will be launched as part of the GNOME 3.26 release. Kai Lüke, the developer of GNOME Disk Utility, has published a blog post that highlights the new features in the upcoming release. The latest version is touted to offer a file system resize. Generally, it is not possible to estimate the exact space occupied by a specific file system, so the new disk utility package will resize file systems that are in partitions. Future releases will also receive improved support for both NTFS and FAT file system resizing. The updated GNOME disk utility will also have the ability to update the window for power state changes. Additionally, the new version will prompt users when it stops any running jobs while closing an app. It will debut with better support for probing and unmounting of volumes. GNOME developers will enable an app menu entry in the new disk utility. This will help you create an empty disk image. Likewise, you will get the option to check the displayed UUIDs for selected volumes. GNOME 3.26 is scheduled to go live on September 13. You can download Disks 3.25.4, which has been released for testing. Its source tarball is available for download, and you can use it with your GNU/Linux distribution.
Arduino founder plans 'sustainable' growth
Massimo Banzi, the developer of the Arduino board, has agreed to acquire 100 per cent ownership of Arduino AG, the company that owns all Arduino trademarks. The latest development is supposed to help the company generate sustainable growth through its open source hardware and software developments. "This is the beginning of a new era for Arduino in which we will strengthen and renew our commitment to open source hardware and software, while in parallel setting the company on a sound financial course of sustainable growth," said Banzi, in an official statement. As a result of the acquisition, Banzi, 49, has become the new chairman and CTO of Arduino. The CEO, Federico Musto, has also been replaced by Dr Fabio Violante. "In the past two years, we have worked very hard to get to this point. We envision a future in which Arduino will apply its winning recipe to democratise the Internet of Things for individuals, educators, professionals and businesses," said Dr Violante. Developed as an open source project back in 2003, Arduino is aimed at providing affordable solutions to individuals to build new devices. The boards under the Arduino range are available as open hardware and are compatible with a range of sensors and actuators. Last month, the Banzi-led company even partnered with the LoRa Alliance to start building hardware with the LoRaWAN standard. In May this year, the Arduino Foundation began to build an open source ecosystem for sectors like education, IoT markets, makers and receivers. "Our vision remains to continue enabling anybody to innovate with electronics for a long time to come," said Banzi.
PayPal launches Technology Innovation Labs in India
PayPal has launched two of its Technology Innovation Labs in India to support developments specific to new age technology. Located at PayPal's Chennai and Bengaluru centres, the labs are the first in India, opened after the Palo Alto firm launched its US and Singapore labs. "India is a hotbed for technology innovation given its evolving startup ecosystem, diverse merchant profiles and enormous talent pool," said Mike Todasco, director of innovation, PayPal. "To cater to their needs in the most effective manner, we are delighted to announce the launch of our newest Technology Innovation Labs in India, where the focus will be on fuelling new age technology and giving rise to unconventional ideas with the potential to transform the ecosystem we operate in," he added. PayPal's Technology Innovation Labs will support diverse fields including machine learning, artificial intelligence, data science, the Internet of Things, penetration testing, software-defined radios and wireless communication, virtual and augmented reality, computer vision and basic robotics. The company will provide equipment like Raspberry Pi boards with sensor kits, AlphaBot kits, Amazon Echo, LeapMotion and 3D printers, among others. "Enabling innovation and creating amazing experiences for our customers is at the heart of PayPal's global success, and the Innovation Lab is another step to foster this spirit in our development centres in India," said Guru Bhat, general manager — technology and head of engineering, PayPal. In addition to providing relevant hardware to kickstart the innovative developments, PayPal is set to integrate its native incubation centre. Called the PayPal Incubator, the centre was launched back in 2013 with an aim to support India-origin startups.
Google starts discriminating against poor quality Android apps
Google is all set to improve the user experience on Android by enhancing its search and discovery algorithms on the Play Store. This will have a direct impact on apps that have quality issues. A new Android vitals dashboard in the Google Play Console was revealed at I/O 2017 earlier this year. The technology is designed to understand and analyse inferior app behaviour such as excessive battery consumption, slow render times and crashes – and hence set benchmarks for what passes as a quality app. "Developers focusing on performance can use the Play Console to help find and fix a number of quality issues," wrote Andrew Ahn, product manager of Google Play, in a blog post. Google reports that the change in its algorithms has shown that more users have downloaded quality apps. The Android maker also recommends that developers examine the ratings and reviews they receive on their apps to get additional insights about their quality. If you are about to launch your app and test its functionality in the alpha or beta stage, you can use the pre-launch report to fix issues ahead of mass downloads. Likewise, Android vitals can be applied to identify performance issues reported by opt-in devices.
Developers ask Adobe to open source Flash Player
Many developers have not welcomed Adobe's decision to end support for the Flash Player plugin in 2020. Thus, a petition seeking the open source availability of Flash Player has been released on GitHub. While Adobe may have plenty of reasons to kill Flash, there are a bunch of developers who want to save it. GitHub user Juha Lindstedt, the developer who has filed the petition, believes that Flash is an important part of the Internet's history. Killing support for Flash means that future generations would not be able to access old games, websites and experiments, Lindstedt has said. "Open sourcing Flash specs would be a good solution to keep Flash projects alive safely for archival reasons," the developer wrote in the petition. Interestingly, over 3,400 people have so far signed the petition, which compares saving Adobe Flash to the saving and restoring of old manuscripts. The developers who've signed the petition also want the interactive artwork created with Flash to be saved. The petition clearly states that it is not requesting Adobe to release the licensed components. Instead, the petitioners are ready to volunteer either to bypass the licensed components or to replace them with open source alternatives.

World's first software-defined data centre launched in India
Pi Datacenters, India's native enterprise-class data centre and cloud services provider, has launched Asia's largest Tier IV-certified data centre in Amaravati, Vijayawada. The company claims the new offering, called Pi Amaravati, is the world's first software-defined data centre. "Pi Amaravati is a major milestone for the entire team," said Kalyan Muppaneni, founder and CEO, Pi Datacenters. The new data centre uses the OpenStack virtualisation framework to deliver an advanced computing, storage and networking experience. It is capable of offering modular colocation and hosting services with a capacity of up to 5,000 racks. Also, the company's enterprise cloud platform Habour1 is powered by open source provider SUSE. Vijayawada-based Pi Datacenters has recently been awarded the Uptime Institute Tier IV design certification, known as the highest standard for infrastructure, functionality and capacity. "With the launch of Pi Amaravati, we will be offering highly innovative and tailored solutions with Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), Disaster-Recovery-as-a-Service (DRaaS) and a host of other cloud-enabled products and services to our esteemed partners," Muppaneni said. Along with launching the Pi Amaravati data centre, Pi Datacenters has entered into a Memorandum of Understanding (MoU) with companies like PowerGrid, IRCTC, Mahindra and Mahindra Finance, Deutsche Bank and Unibic. These partnerships will expand open source developments in the data centre space.

LibreOffice 5.4 has 'incremental' compatibility with Microsoft Office files
The Document Foundation has released an update to the LibreOffice 5 series, version 5.4, which has new features for Writer, Calc and Impress. In the list of major tweaks over the previous version, the Document Foundation states that there are a large number of 'incremental' improvements to Microsoft Office file compatibility. "Inspired by Leonardo da Vinci's belief that 'simplicity is the ultimate sophistication', LibreOffice developers have focused on file simplicity as the ultimate document interoperability sophistication," said the non-profit organisation in a blog post. The Writer element of LibreOffice 5.4 brings improved compatibility for Microsoft Word files. The ODF and OOXML files written by the LibreOffice suite are also more robust and easier to share than before. The simplicity concept translates into an XML description of a new document that makes ODF/ODT files 50 per cent smaller and OOXML/DOCX files 90 per cent smaller, as compared to Microsoft Office. The other highlight of the latest LibreOffice update is the new standard colour palette based on the RYB colour model. The Document Foundation has integrated better support for embedded videos and OpenPGP keys. Also, the rendering of imported PDF documents is much better in this version. The new version of Writer can help you import AutoText from MS Word DOTM templates. Users can preserve the file structure of exported or pasted lists
as plain text. This allows them to create custom watermarks for their documents. Additionally, a new context menu is available to help users with footnotes, endnotes, styles and sections. The new version of Calc has support for pivot charts. Users can customise pivot tables and comment via menu commands. Impress helps users in specifying fractional angles while duplicating objects. There is also an auto-save feature for settings to help in duplicating an operation; this is a part of Calc as well as Impress. LibreOffice 5.4 is available for download for Mac OS, Linux and Windows through its official website. The organisation has also improved the LibreOffice online package with better performance and a more responsive layout. You can access the latest LibreOffice source code as Docker images.

Linux gets a preview of Microsoft's Azure Container Instances
Microsoft is adding a new service to its cloud portfolio, dubbed Azure Container Instances. While the development is yet to receive Windows support, a public preview for Linux containers is out to help developers create and deploy containers without the hassle of managing virtual machines. Microsoft claims that Azure Container Instances (ACI) takes only a few seconds to start. The configuration window is highly customisable; users simply need to select the exact memory and count of CPUs that they need. Designed to work with Docker and Kubernetes, the new service allows developers to utilise container instances and virtual machines simultaneously in the same cluster. Microsoft is also releasing an ACI connector for Kubernetes to help the deployment of clusters to ACIs. "While Azure Container Instances are not orchestrators and are not intended to replace them, they will fuel orchestrators and other services as a container building block," said Corey Sanders, director of compute, Azure, in a statement. The company executives are hoping that ACIs will be used for fast bursting and scaling. Virtual machines can be deployed alongside the cloud to deliver predictable scaling, so that workloads can migrate back and forth between the two infrastructure models. Windows support for ACI is likely to be released in the coming weeks. In the meantime, you can test it on your Linux container system.
OpenSUSE Leap 42.3 is out with new KDE Plasma and GNOME versions
OpenSUSE has released the new version of its Leap distribution. Debuting as OpenSUSE Leap 42.3, the new release is based on SUSE Linux Enterprise (SLE) 12 Service Pack 3. The new update includes hundreds of updated packages, and the new SUSE version is powered by Linux kernel 4.4. The development team has spent a good eight months in producing this rock-solid Leap build. The most notable addition in OpenSUSE Leap 42.3 is the KDE Plasma 5.8 LTS desktop environment. Users have the option to either pick the latest KDE version or go with GNOME 3.20. There is also a provision to install other supported environments. Apart from the new desktop environment options, the OpenSUSE Leap update comes with a server installation profile and includes a full-featured text mode installer. The platform also officially supports Open-Channel solid-state drives through the LightNVM full-stack initiative. Likewise, there are numerous architectural improvements for 64-bit ARM systems. The OpenSUSE team has provided PHP5 and PHP7 support in the latest Leap distro. There is also an updated graphics stack based on Mesa 17, and GCC 4.8.5 as the default compiler. Considering the list of new changes, OpenSUSE Leap 42.3 appears to be an advanced Linux version. It also comes preloaded with packages for streaming media, editing graphics, creating animation, playing games and building 3D printing projects. The new OpenSUSE Leap version is available for download for both 32-bit and 64-bit systems. Existing OpenSUSE Leap users can upgrade their systems using the built-in update system.
Google blocks Android spyware family Lipizzan
Google's Android Security and Threat Analysis teams have jointly discovered a new spyware family that gets distributed through various channels, including the Play Store. Called Lipizzan, the software has been detected in 20 apps that have been downloaded on fewer than 100 devices. Unlike some of the earlier spyware, Lipizzan is a multi-stage spyware that can be used to monitor and exfiltrate email, text messages, location, voice calls and media. It is typically available as an innocuous-sounding app such as 'Backup' or 'Cleaner'. Once installed, the spyware downloads and loads a second 'licence verification' stage that validates some abort criteria on the hardware. "If given the all-clear, the second stage would then root the device with known exploits and begin to exfiltrate device data to a command and control server," the team, comprising Android Security's Megan Ruthven and the Threat Analysis Group's Ken Bodzak and Neel Mehta, has written in a blog post. The second stage of Lipizzan is capable of performing and exfiltrating the results of tasks such as call recording, VoIP recording, voice recording, location monitoring, screenshot capturing and taking photos. Additionally, it is capable of helping attackers retrieve data from apps like Gmail, Hangouts, KakaoTalk, LinkedIn, Messenger, Skype, Snapchat, Viber and WhatsApp, among others. Google researchers found the presence of Lipizzan while investigating Chrysaor, a recently emerged spyware that was believed to be written by the NSO Group. Once it was clearly spotted, Google's Play Protect service released a notification on all affected devices and removed the apps with Lipizzan from the online store. Moreover, Google has enhanced Play Protect's capabilities to continuously detect and block targeted spyware on the Android platform. Developers need to use official resources only when building their apps to ensure a secure and safe experience.
GitHub adds new features to grow community engagements
Supporting open source efforts by developers, GitHub has brought out a list of new features to enhance community engagements around your projects. "Thanks to some subtle (and not so subtle) improvements in the past few months, it's now easier to make your first contribution, launch a new project or grow your community on GitHub," the GitHub team wrote in a blog post. First on the list of new features is contributor badges. Being a maintainer, you can now see a 'first-time contributor' badge that helps you review pull requests from users who have contributed to your projects for the first time. The 'first-time contributor' badge becomes a 'contributor' badge in the comments section once the pull request is merged. Furthermore, you can expose the information in the additional flag via the GraphQL API. Apart from providing badges to your contributors, you have been given the option to add a licence file to your project using a new licence picker. This new section helps you pick an appropriate licence by providing the full text. It also allows you to customise any applicable fields prior to committing the file or opening a pull request. As privacy is one of the major factors preventing you from contributing to a new project, GitHub has added the ability to let you keep your email address private. GitHub also provides a warning that lets you make an informed decision about contributing to a project you were blocked from previously. Moreover, blocked users on the platform will not be able to comment on issues or pull requests in third-party repositories.
"We hope these improvements will help you make your first contribution, start a new project, or grow your community," GitHub concluded in its blog. First launched in October 2007, GitHub is so far used by more than 23 million people around the globe. The platform hosts over 63 million projects, with a worldwide employee base of 668 people.

Microsoft is now a part of the Cloud Native Computing Foundation
Continuing its developments around open source, Microsoft has now joined the Cloud Native Computing Foundation (CNCF). The latest announcement comes days after the Redmond company entered the board of the Cloud Foundry Foundation. "Joining the Cloud Native Computing Foundation is another natural step on our open source journey, and we look forward to learning and engaging with the community on a deeper level as a CNCF member," said Corey Sanders, partner director, Microsoft, in a joint statement. Microsoft has chosen the Platinum membership of the CNCF. Gabe Monroy, a lead product manager for containers on Microsoft Azure and former Deis CTO, is joining CNCF's governing board. Led by the core team members of the Linux Foundation, CNCF has welcomed the move by Microsoft. The non-profit organisation considers it a "testament to the importance and growth" of cloud technologies, and believes the Windows maker's commitment to open source infrastructure is a 'significant asset' to its board. "We are honoured to have Microsoft, widely recognised as one of the most important enterprise technology and cloud providers in the world, join CNCF as a platinum member. Its membership, along with other global cloud providers that also belong to CNCF, is a testament to the importance and growth of cloud native technologies," stated Dan Kohn, executive director of the Cloud Native Computing Foundation.
Mozilla aims to enhance AI developments with open source human voices
While elite digital assistants like Alexa, Cortana, Google Assistant and Siri have so far been receiving inputs from users via the spoken word, Mozilla is planning to enhance all such existing artificial intelligence (AI) developments by open sourcing human voices on a mass level. The Web giant has already launched a project called Common Voice to build a large-scale repository of voice recordings for future use. Mozilla has been capturing human voices since June to build its open source database. The database will go live later this year to "let anyone quickly and easily train voice-enabled apps" that go beyond Alexa, Google Assistant and Siri. "Experts think voice recognition applications represent the 'next big thing'. The problem is the current ecosystem favours Big Tech and leaves out the next wave of innovators," said Daniel Kessler, senior brand manager, Mozilla, in a recent blog post. Tech companies are presently using different voices to teach computers to understand the variety of languages for their solutions. But the data sets with the voice collections are mostly proprietary as of now. Therefore, a large number of developers have no access to voice recording samples to test their own voice recognition projects. This ultimately leads to a limited number of apps understanding our speech. Things appear to be changing with Common Voice. "The time has come for an open source data set that can change the game. The time is right for Project Common Voice," Kessler stated. Mozilla is asking individuals to donate their voice recordings either on the Common Voice Web page or by downloading a dedicated iOS app. Once you are ready with your recording, you need to read a set of sentences that will be saved into the system. The recorded voices, which would come in a variety of languages with various accents and demographics, will be provided to third-party developers. In addition to simply receiving voice donations, Mozilla has built a model by which users will validate the recordings that are stored in the system. This process will help train an app's speech-to-text conversion capabilities. All this will enable not just one or two but 10,000 hours of validated audio that will power tons of AI models in the near future. Notably, recordings received through the Common Voice initiative will be integrated into the Firefox browser as well. But the main purpose of this exercise is to provide a public resource.
For more news, visit www.opensourceforu.com
CODESPORT
Sandya Mannarswamy
In this month’s column, we discuss a few interview questions related to machine learning and data science.
As we have been doing over the last couple of months, we will continue to discuss a few more computer science interview questions in this column, particularly focusing on topics related to data science, machine learning and natural language processing. It is important to note that many of the questions are typically oriented towards practical implementation or deployment issues, rather than just concepts or theory. So it is important for interview candidates to make sure that they get adequate implementation experience with machine learning/NLP projects before their interviews. Data science platforms such as Kaggle (www.kaggle.com) host a number of competitions that candidates can attempt to practise their skills on. Also, many of the data science or machine learning related academic computer conferences host data challenge competitions, such as the KDD Cup (http://www.kdd.org/kdd-cup). Data science enthusiasts can sign on for these challenges and hone their skills in solving real-life problems. Let us now discuss a few interview questions.

1. You are given 100,000 movie reviews that are labelled as positive or negative. You have been told to perform sentiment analysis on the new incoming reviews by classifying each review as positive or negative, which is a simple binary classification problem. Can you explain what features you would use for this classification problem? Once you decide on your set of features, how would you go about selecting which classifier to use?

2. Let us assume that you decided to use the 'bag of words' approach in the above problem, with each vocabulary term becoming a feature for your classifier. Essentially, you can construct a feature set where the dimensions of this set are the same as the size of your vocabulary, and each feature corresponds to a specific term in the vocabulary. The feature value can either be the count of the term or merely the presence or absence of the term in the document; or you can employ the Tf-Idf count for each term-review combination, etc. You had used a random forests classifier for sentiment classification. Now you are told that your vocabulary size is 100,000. Would this change your decision about which classifier to use?

3. For problem (1), you had decided to use a support vector machine classifier. However, now you are told that instead of just doing binary classification of the reviews, you need to classify them as one of five categories, namely: (a) strongly positive, (b) weakly positive, (c) neutral, (d) weakly negative, and (e) strongly negative. You are given labelled data with these five categories now. Would you still continue to use the support vector machine (SVM) classifier? If so, can you explain how SVM handles multi-class classification? If you decide to switch from SVM to a different classifier, explain the rationale behind your switch.

4. For the sentiment classification problem, other than the review text itself, you are now given additional data about the movies. This additional data includes the reviewers' names, address, age, country of residence, date of review and the specific movie genre they are interested in. This additional data contains both numeric and string data, with some of the features being categorical. A country's name is string data, and the movie genre is string data which is actually categorical. What kind of data preprocessing would you do on this additional data to use it with your classifier?

5. Generally, interviewers expect you to be familiar with some of the popular libraries that can be used for data science, so some of the questions can be library-specific as well. In question (4), you may be asked to mention how you would convert categorical data to numeric form. Can you write a piece of Python code to do this conversion?
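To get you started on question 5, here is one possible sketch (not the column's intended solution; it assumes pandas is available and uses a made-up 'country' column) showing the two most common conversions, label encoding and one-hot encoding:

import pandas as pd

# hypothetical review metadata with a categorical string column
df = pd.DataFrame({'country': ['India', 'USA', 'UK', 'India']})

# label encoding: each category becomes a small integer code
df['country_code'] = df['country'].astype('category').cat.codes

# one-hot encoding: one binary indicator column per category
one_hot = pd.get_dummies(df['country'], prefix='country')
print(df.join(one_hot))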
6. Let us assume that you decided to use a SVM classifier for the sentiment classification problem. You find that your classifier takes a long time to fit the training data. How would you reduce the training time? List all the possible approaches.

7. One of our readers suggested feature scaling/data normalisation as a preprocessing step before you train your model, always. Is she correct? Is feature scaling or normalisation always needed in all types of classifiers? Why do you think feature scaling can help achieve faster convergence of your learning procedure? One of the well-known methods of feature scaling is Min-Max scaling. By feature scaling, you are actually throwing away your knowledge of the maximum and minimum values that the feature can take. Wouldn't the loss of this information affect the accuracy of your classifier on unseen data? If you are using a decision tree classifier or random forests, should you still do feature scaling? If yes, explain why.

8. Scikit-learn is a popular machine learning library available in Python, which provides ready-made implementations of several classifiers such as decision tree, support vector machine, random forests, logistic regression, multilayer perceptron, etc. These classifiers provide a 'predict' function, which predicts the output for a given data instance. They also provide a 'predict_proba' function, which returns the probability for each sample (data instance) belonging to a specific output class. For instance, in the case of the movie review sentiment prediction task, with two classes, positive and negative, the 'predict_proba' function would return the probability of the sample belonging to the positive sentiment category and the negative sentiment category. When would you use the 'predict_proba' function in your sentiment classification task?

9. In the sentiment classification problem on the movie reviews data, you found that some of the reviews did not have the date, country of reviewer and the movie genre. How would you handle this missing data? Note that these features were not numeric; so what kind of data imputation would make sense in this case?

10. In the movie reviews training labelled data set, you are now given certain additional data features that include: (a) the star rating reviewers give to the movie, (b) whether they would like to watch it again, and (c) whether they liked the movie. Would you use these additional features in your training data to train your model? If not, explain why you wouldn't.

11. What is the data leakage problem in machine learning and how do you avoid it? Does the scenario mentioned in question (10) fall under the data leakage category? Detailed information on data leakage and its avoidance can be found in the well-written and must-read paper 'Leakage in data mining: formulation, detection, and avoidance', which was presented at the KDD 2011 conference and is available at http://dl.acm.org/citation.cfm?id=2020496.
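Jumping back to question 8 for a moment, the difference between the two functions is easiest to see in code. A minimal sketch (again, just one illustrative possibility), using scikit-learn's logistic regression on toy data:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.5]]))        # a hard class label
print(clf.predict_proba([[1.5]]))  # the probability for each class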
12. You are using k-fold cross-validation for selecting the hyperparameters of your model. Given that your training data has features which are on widely varying scales, you have decided to do feature scaling. Should you do data scaling once for the entire training data set and then perform the k-fold cross-validation? Or should you do the feature scaling within each fold of cross-validation? Explain the reason behind your choice.

13. You are given a data set which has a large number of features. You are told that only a handful of these features are relevant in predicting the output variable. Will you use Lasso regression or ridge regression in this case? Explain the rationale behind your choice. As a follow-up question, when would you prefer ridge regression over Lasso?

14. Decision tree classifiers are very popular in supervised machine learning problems. Two well-known tree classifiers are random forests and gradient boosted decision trees. Can you explain the difference between the two of them? As a follow-up question, can you explain ensemble learning methods in general? When would you opt for an ensemble classifier over a non-ensemble classifier?

15. You are given a data set in which many of the variables are categorical string variables. You decided to encode the categorical variables with One Hot encoding. Consider that you have a variable called 'country', which can take any of 20 values. With One Hot encoding, you end up creating 20 new feature variables in place of the single 'country' variable. On the other hand, if you use label encoding, you convert the categorical string variable to a categorical numerical variable. Which of the two methods leads to the 'curse of dimensionality' problem? When would you prefer to go for One Hot encoding vs label encoding?
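Returning to question 12, one widely used way to keep the scaling inside each fold is to wrap the scaler and the classifier in a single pipeline and cross-validate that. A minimal sketch using scikit-learn, shown here on the bundled iris data rather than the movie reviews:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# the scaler is re-fitted on each training fold, so no information
# from the held-out fold leaks into the scaling parameters
pipe = make_pipeline(MinMaxScaler(), SVC())
print(cross_val_score(pipe, X, y, cv=5))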
Please do send me your answers to the above questions. I will discuss the solutions to these questions in next month's column. I also wanted to alert readers about a new deep learning specialisation course by Prof. Andrew Ng coming up soon on the Coursera platform (https://www.coursera.org/specializations/deep-learning). If you are interested in becoming familiar with deep learning, there is no better teacher than Prof. Ng, whose machine learning course on Coursera is being taken by more than a million students. If you have any favourite programming questions/software topics that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, wishing all our readers wonderful and productive days ahead.

By: Sandya Mannarswamy
The author is an expert in systems software and is currently working as a research scientist at Conduent Labs India (formerly Xerox India Research Centre). Her interests include compilers, programming languages, file systems and natural language processing. If you are preparing for systems software interviews, you may find it useful to visit Sandya's LinkedIn group 'Computer Science Interview Training India' at http://www.linkedin.com/groups?home=&gid=2339182
NEW PRODUCTS

Pocket-friendly Bluetooth speaker from Kodak
Super Plastronics Pvt Ltd, a brand licensee of Kodak, has launched its first Bluetooth portable speaker in India, the Kodak 68M. The device offers a complete sound experience at an affordable price, company sources claim. Apart from Bluetooth connectivity, the speaker supports an auxiliary wire and a micro USB jack. It is equipped with a low sound output with a reach of up to 10 metres. Powered by a 3.7W battery, the speaker is capable of lasting for over five hours. For enhancing the sound experience, the device can also be connected to an additional speaker. The Kodak speaker can also be connected to any TV, with or without Bluetooth, making it a complete package for entertainment lovers. The Kodak 68M speaker is available online and at retail stores.
Price: ₹3,290
Address: Super Plastronics Pvt Ltd, 1st Floor, Dani Corporate Park, 158 Dani Compound, Vidya Nagari Road, Kalina, Santacruz East, Mumbai – 400098; Ph: 022-66416300
Earbuds with heart-rate monitor from Jabra
Audio and connectivity devices manufacturer Jabra has introduced superior quality earbuds for music and voice calls, called Jabra Elite Sport. The device sports advanced wireless connectivity, which filters out background noise, ensuring distraction-free usage. The buds come with the ease of portable charging and deliver 4.5 hours of playtime. They have customisable fitting options, enabling users to stay connected comfortably during outdoor and sports activities. A special button on the earbuds, 'Hear Through', filters out surrounding noise. The device has four microphones and offers personalised fitness analysis using an in-ear heart rate monitor.
With IP67 certification, the earbuds have a three-year warranty for damage by sweat and water, enabling hassle-free usage. The wireless Jabra Elite Sport is available in lime green, grey and black, online and at retail stores.
Price: ₹18,999
Address: Jabra India Pvt Ltd, Redington India Limited, New No. 27, NRS Building, Velachery Road, Saidapet, Chennai – 600015
Mechanical keyboard for gamers from Galax
Galax, the manufacturer of gaming products, has unveiled its latest HOF Black edition mechanical keyboard, specially designed for gamers. The keyboard uses a genuine Cherry MX mechanical key switch rated at 50 million keystrokes for a long-lasting and quick response, giving users a stable and long-term option. The stylish-looking keyboard is built with an anodised (black)/baking paint (white) aluminium plate. It offers up to 112 lighting effects with software and 88 lighting effects without software. Its macro keys make each key of the device programmable. The keyboard comes with media control buttons, along with a die-cast volume and lighting roller. With n-key rollover, the company claims the keyboard is 100 per cent anti-ghosting. The HOF keyboard can enter all the signals accurately, even when played faster or when pressing multiple keys. It has a USB 2.0 hub with audio-out
and a mic-in jack. The hub allows users to connect USB devices of all types quickly. It also has a magnetic, detachable, soft-touch wrist rest to make prolonged use comfortable. The Xtreme Tuner Plus system enables users to customise the keyboard by controlling macros. It also has per-key programming, backlight settings and lighting patterns. The HOF Black edition carries a three-year warranty period and is available at Amazon.
Price: ₹7,000
Address: Amazon India, Brigade Gateway, 8th Floor, 26/1, Dr Rajkumar Road, Malleshwaram West, Bengaluru, Karnataka – 560055; Ph: 1800-30009009
Budget-friendly smartphone with front LED flash from Lenovo
Lenovo has launched an affordable dual camera smartphone, the Lenovo K8 Note. The device sports a 13.9cm (5.5 inch) full HD (1080 x 1920 pixels) display with Corning Gorilla Glass protection. It is powered by a deca-core MediaTek MT6797 SoC with four Cortex-A53 cores clocked at 1.4GHz and four at 1.85GHz, as well as two Cortex-A72 cores clocked at 2.3GHz. Built with 5000 series aluminium and polycarbonate, the device is supposed to be splash-resistant. The dual SIM (Nano) device runs on Android 7.1.1 Nougat and is backed by a huge 4000mAh battery with turbo charging. On the camera front, the smartphone sports a rear 13 megapixel primary sensor, accompanied by a 5
megapixel depth sensor with a dual LED-CCT flash module, and a 13 megapixel front camera with an LED flash module for selfies. The connectivity options of the device include 4G VoLTE, dual band (2.4GHz and 5GHz) Wi-Fi 802.11ac, Bluetooth v4.1, GPS, micro USB and a 3.5mm audio jack. The Lenovo K8 Note comes in two variants – 3GB RAM/32GB storage and 4GB RAM/64GB storage – both available in 'Fine Gold' and 'Venom Black' colours, online and at retail stores.
Price: ₹12,999 for the 3GB RAM/32GB storage option and ₹13,999 for the 4GB RAM/64GB storage variant
Address: Lenovo India Pvt Ltd, Vatika Business Park, 1st Floor, Badshah Pur Road, Sector-49, Sohna Road, Gurugram – 122001
Water-resistant Bluetooth headphones from Motorola
Motorola has introduced its Bluetooth in-ear headphones in India – the Verve Loop. Aimed at sports and fitness enthusiasts, the headphones offer a hassle-free, comfortable fit during outdoor and workout sessions. The device comes with an IP54 rating for water and splash resistance, enabling damage-free use. It is designed to deliver a balanced, high-quality audio experience along with easy Bluetooth pairing with voice prompts. Powered by a lithium-ion battery, the device delivers up to six hours of playback on a single charge. Company sources also claim that it provides one hour of play time on just a 20-minute charge. The headphones offer balanced sound at any volume and superb noise isolation. Features include A2DP (Advanced Audio Distribution Profile), HFP (Hands-Free Profile) and AVRCP (Audio/Video Remote Control Profile), enabling hands-free calling and voice assistance. The headphones come with inline control buttons for volume, play/pause, etc, three sets of extra ear gels and three sets of ear hooks for stable support. The Motorola Verve Loop is compatible with all Android and Apple smartphones and tablets, apart from supporting the Siri and Google Now voice assistants. Available in combinations of charcoal grey/black and orange/black, the headphones can be purchased online and at retail stores.
Price: ₹2,499
Address: Motorola Solutions India, 415/2, Mehrauli-Gurugram Road, Sector 14, Near Maharana Pratap Chowk, Gurugram, Haryana – 122001; Ph: 0124-4192000; Website: www.motorola.in
The prices, features and specifications are based on information provided to us, or as available on various websites and portals. OSFY cannot vouch for their accuracy.
Compiled by: Aashima Sharma
Exploring Software
Guest Column
Anil Seth
Importing GNUCash Accounts in GNUKhata
gkcore is the REST API core engine of GNUKhata. The GNUKhata app comprises two applications — gkcore and gkwebapp. The objective of this tutorial is to get to know the API.
GNUKhata is an application developed using the Pyramid Web framework. It comprises two Web applications: a core application called gkcore, and a Web application called gkwebapp. You may easily get started with the installation and development by referring to https://gitlab.com/gnukhata/gkwebapp/wikis/home. As a way of learning how to extend GNUKhata, you may consider importing data from GNUCash into GNUKhata. Since the core and the user interface are two separate applications, a good way to learn the core application interface is to create a utility program which will add the GNUCash data. The utility program will first need to log into the core server, and then issue the commands to add the needed data. Make sure that you are able to run the core and the Web applications; use the latter to create an organisation and an admin user for the organisation. It is important to keep in mind that the gkcore application needs to be run as the gkadmin user, assuming that you are following the steps from the wiki article; otherwise, it will not be able to access the database.

The login process
You may examine gkwebapp/views/startup.py to understand the logic of the steps needed for logging in. The process involves selecting an organisation first, and then supplying the credentials of a user for that organisation. In order to keep the code as simple as possible, as the objective is to learn the API, select the first organisation. The login credentials are hard-coded. In case of any errors, the utility will just crash and not attempt any error handling. Once the login is successful, a token is issued. This token will authorise all subsequent calls to the core server. You will notice that the calls to the core server are simple GET or POST requests. The data objects transferred between the two are JSON objects.

import requests, json

gkhost = 'http://127.0.0.1:6543/'

def getJsonResponse(route, hdrs=None):
    return requests.get(gkhost + route, headers=hdrs).json()

def postJsonResponse(route, jsondata, hdrs=None):
    return requests.post(gkhost + route, data=jsondata, headers=hdrs).json()

def getOrg():
    # select the first organisation and return the orgcode of its first year
    gkdata = getJsonResponse('organisations')['gkdata']
    first_org = gkdata[0]
    route = '/'.join(['orgyears', first_org['orgname'], first_org['orgtype']])
    gkdata = getJsonResponse(route)['gkdata']
    return gkdata[0]['orgcode']

def orgLogin(orgcode):
    # hard-coded credentials; a successful login returns a token
    gkdata = {'username': 'anil', 'userpassword': 'pswd', 'orgcode': orgcode}
    return postJsonResponse('login', json.dumps(gkdata))['token']

orgcode = getOrg()
gktoken = orgLogin(orgcode)
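Every call from here on must carry this token in a gktoken header; that is all gkcore needs to authorise a request. As a quick sanity check (a minimal sketch reusing getJsonResponse and the same route the utility uses later), you can list the account groups of the organisation you just logged into:

login_hdr = {'gktoken': gktoken}
print(getJsonResponse('groupsubgroups?groupatlist', login_hdr)['gkresult'])

The utility below constructs the same header again before creating the accounts.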
Adding accounts
GNUCash can export the accounts and transactions into CSV files. For the current article, you may extract the accounts into a file, accounts.csv. The Python csv module makes it very easy to handle a CSV file. The first row contains the column labels and should be ignored. You may use the DictReader for more complex processing of the file; for this application, in which only a few columns are needed, the plain csv reader is adequate. There are a few differences in the top-level account/group names of GNUCash and GNUKhata, so you need to create a dictionary to map the names from GNUCash to the ones used in GNUKhata. Some groups in the level below 'Assets' in GNUCash appear as top-level groups in GNUKhata, e.g., 'Current Assets' and 'Fixed Assets'. You may ignore 'Assets' from the account hierarchy when transferring the data. As before, the code below ignores error handling and assumes 'all is well':
import csv

def addSubGroup(name, parent, header):
    data = json.dumps({'groupname': name, 'subgroupof': parent})
    return postJsonResponse('groupsubgroups', data, header)['gkresult']

def addAccount(name, parent, header):
    data = json.dumps({'accountname': name, 'groupcode': parent, 'openingbal': 0.00})
    res = postJsonResponse('accounts', data, header)

def createAccounts(fn, toplevel, header):
    # get existing groups and their codes
    groups = getJsonResponse('groupsubgroups?groupatlist', header)['gkresult']
    f = open(fn)
    rows = csv.reader(f)
    skip_first = next(rows)  # ignore the header row
    for row in rows:
        fullname = row[1].split(':')
        name = row[2]
        is_group = row[-1] == 'T'
        # Map the top level to the GNUKhata top level.
        # Ignore 'Assets' and use 'Current Assets' and 'Fixed Assets' as top level.
        if fullname[0] == 'Assets':
            fullname = fullname[1:]
        if len(fullname) > 1 and fullname[0] in toplevel:
            fullname[0] = toplevel[fullname[0]]
        parent = groups[fullname[-2]]
        if is_group:
            if not (name in groups):
                # add this to the list of groups
                groups[name] = addSubGroup(name, parent, header)
        else:
            addAccount(name, parent, header)

login_hdr = {"gktoken": gktoken}
toplevel_mapping = {'Capital': 'Capital',
    'Current Assets': 'Current Assets',
    'Fixed Assets': 'Fixed Assets',
    'Liabilities': 'Current Liabilities',
    'Expenses': 'Direct Expense',
    'Income': 'Direct Income'}
createAccounts('accounts.csv', toplevel_mapping, login_hdr)

There are some potential issues in transferring accounts from GNUCash to GNUKhata. For example, in GNUKhata, you either have an account or a sub-group. However, in GNUCash, a sub-group can function as a normal account as well. An account in GNUCash is consistent with a sub-group of GNUKhata if the placeholder flag is true. However, the objective of this article is to become familiar with the communication between a client application and the core server, and no attempt is made to handle corner cases. GNUCash transaction data may also be exported as CSV files. The above utility may be similarly extended to handle that data as well. The Web-based frontend, gkwebapp, makes it easy to view and enter the data. The communication with the server happens as in the utility above, and it is done from the code residing in the views directory of gkwebapp. You may, as an exercise, extend the Web application to import GNUCash accounts into GNUKhata and learn that as well!
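As a starting point for extending the utility to transaction data, here is a minimal sketch of just the CSV side, reusing the csv module imported above. It assumes nothing about the export beyond a header row (the column names vary across GNUCash versions, so print a row first to discover them); posting the parsed rows to gkcore as vouchers is left as part of the exercise:

def readTransactions(fn):
    # each row comes back as a dict keyed by the header line of the export
    with open(fn) as f:
        for row in csv.DictReader(f):
            print(row)

readTransactions('transactions.csv')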
By: Dr Anil Seth
The author has earned the right to do what interests him. You can find him online at http://sethanil.com, http://sethanil.blogspot.com, and reach him via email at [email protected].
OSFY Magazine Attractions During 2017-18
March 2017: Open Source Firewall, Network Security and Monitoring
April 2017: Database Management and Optimisation
May 2017: Open Source Programming (Languages and Tools)
June 2017: Open Source and IoT
July 2017: Mobile App Development and Optimisation
August 2017: Docker and Containers
September 2017: Web and Desktop App Development
October 2017: Artificial Intelligence, Deep Learning and Machine Learning
November 2017: Open Source on Windows
December 2017: Big Data, Hadoop, PaaS, SaaS, IaaS and Cloud
January 2018: Data Security, Storage and Backup
February 2018: Best in the World of Open Source (Tools and Services)
FOR U & ME: Interview

'…is built on a lot of past experience'

Managing a Linux distribution for a long time requires immense community effort. But what is the key to success in a market that includes hundreds of competitive options? Also, what are the challenges in building a brand around an open source offering? Karanbir Singh, project leader, CentOS, answers these questions and outlines the future of the platform that has been leading Web developments, in an exclusive conversation with Jagmeet Singh of OSFY.

Q: How did you start your journey with the CentOS project?
It was in late 2004. I was not one of the founders of CentOS but showed up on the scene in its early days. At that time, we had a small team and a lot of machines running Red Hat Linux 7.3 and Red Hat Linux 9. With Red Hat moving down the path towards Red Hat Enterprise Linux, a model that didn't work well for us, I started looking at options. We explored Debian and SUSE initially, but found management and lifecycle on each of them hard to map our workflow into. It was during this time that I came across the Whitebox Linux effort and then the CentOS Project. Both had the same goal, but the CentOS team was more inclusive and seemed more focused on its goals. So, in late September 2004, I joined the CentOS IRC channel and then, in November, I joined the CentOS mailing list as a contributor. And I am still contributing 13 years down the road!
Q
What were the biggest roadblocks that emerged initially while designing CentOS for the community, and how did its core development team overcome them? A lot of our problems were about not getting off the ground. Initially, there was no clear aim. And then, we faced the challenge that the build systems and code audit tools in 2003/2004 were either primitive, absent entirely or the contributors were unaware of them. A Linux distribution is a large collection
24
Managing a Linux distribution for a long time requires immense community effort. But what is the key to success in a market that includes hundreds of competitive options? Also, what are the challenges in building a brand around an open source offering? Karanbir Singh, project leader, CentOS, answers these questions and outlines the future of the platform that has been leading Web developments, in an exclusive conversation with Jagmeet Singh of OSFY.
of a lot of code, written in many languages — each with its own licence, build process and management. Three main strategies saw us past that painful process. The rst was consistency. Whatever we did, we had to be consistent and uniform across the entire distribution, and make sure all developers had a uniform understanding of the process and ow. The second was a self-use focus. Regardless of what the other people were targeting, all developers were encouraged to focus on their own use cases and their personal goals. The third was the hardest, to try and disconnect commercial interests from developer and contributor work.
Q: Why was there a need for CentOS Linux when Fedora and Red Hat Enterprise Linux already existed in the market?
The Fedora project was still getting sorted out around then. Its team had a clear mandate to try and build an upstream-friendly Linux distribution that was going to move fast and help the overall Linux ecosystem mature. Red Hat Enterprise Linux, on the other hand, was built for commercial medium to large organisations looking for value above the code. This left a clear gap in the ecosystem for a community-centric, manageable, predictable enough Linux distribution that the community itself, small vendors, and niche users around the mainstream could consume. Initially, the work we did was quite focused on the specific use cases that the developers and contributors had. All of us were doing specific things, in specific ways, and CentOS Linux fitted in well. But as we started to mature, we saw great success in specific verticals, starting from academia and education institutions to Web hosting, VoIP (Voice over Internet Protocol) and HPC (high performance computing).
Q: What were the major outcomes of the Red Hat tie-up?
Red Hat came on as a major sponsor of the CentOS Project in January 2014. From the CentOS Project's perspective, this meant we were then able to start looking beyond just the platform and the Linux distribution. It allowed us to build the infrastructure and resources needed to support other projects above the platform, develop a great CI (continuous integration) service, as well as a much better outreach effort than we were able to earlier. The real wins have been from the user perspective. If today you are looking for a user-side OpenStack install, the RDO stack hosted on the CentOS Project is the best packaged, tested and maintained option. In a nutshell, the Red Hat relationship has allowed the CentOS Project to dramatically expand the scope of its operations beyond just the Linux distribution and enable many more user scenarios.
Q: What makes CentOS Linux a perfect choice even 13 years after its first release in May 2004?
When users install a Linux distribution, they are almost always doing so in order to achieve a goal: either to run a website or to run a mail server. Helping users achieve their end goals easily has been our constant focus. It is a key metric we still track in order to reach our goals. This means that as the user base adapts to the new world of cloud-native, container-based and dynamically orchestrated workloads, the CentOS Project continues to deliver the same level of user focus that we have had over the years. Protecting the user's investment in the platform across the base, without giving up on existing processes, is something we deliver till date. For instance, people can choose when and how they jump on the container process, or just entirely opt out. It is not something that will be influenced by a CentOS Linux release. It is this duality, which maintains an existing base while allowing the user to seamlessly move into emerging tech, that creates a great value proposition for CentOS Linux.

Q: How do you manage the diversification of different CentOS Linux versions and releases?
The way the contribution process and ownership works makes it relatively easy to manage the diversification. Primarily, the aim is to ensure that if we are doing something specific, the people doing the work are directly invested in the result of the work itself. This helps ensure quality, as there are eyes scrutinising incoming patches and changes, since the developers' own requirements could be impacted by shipping a sub-optimal release.
Participation from the target audience for the specific media or release is a very critical requirement. And because this typically comes through the common project resources, it also means that the people doing this work are well engaged in the core project scope and Linux distribution areas, allowing them to bridge the two sides nicely. At the moment, there are dozens of different kinds of CentOS releases, including atomic hosts, minimal installs, DVD ISOs, cloud images, vagrant images and containers. Each of these comes through a group that is well invested in the specific space.

Q: What are the various efforts that lead to a consistent developer experience on CentOS Linux?
Application developers working to consume CentOS Linux as their base can trust the API/ABI efforts, where content deployed today will still work three or five years down the road. The interfaces into that don't change (they can evolve, but won't break existing content/scripts/code). Therefore, working with these interfaces also means that they work within the same process that the end user is already aware of and already manages for simple things like security patching, an area often overlooked by the casual developer.

Q: How does the small team of developers manage to offer a long period of support for every single release?
We invest heavily in automation to enable long-term support. And that means that a very small group of people can actually look after a very large codebase. It changes the scope of what the contributors need to do. Rather than working on the code process, we work on the automation around it, and aggressively test and monitor the process and code. The other thing is that we get community support. Developers and contributors don't always have the time to work on each request, but if you look at the CentOS Forums, almost every question gets a great result. There is also a lot of diversity in the groups. The CentOS IRC channel idles at over 600 users during the day, but a large number of those users never visit the forums. Similarly, the CentOS mailing lists include over 50,000 people, but a large number of them never reach the IRC channel or the forums.

Q: What are the major differences you've observed being a part of a corporate entity like Red Hat and a community member? Which is tougher of the two?
I have been involved with open source communities for over 15 years. During such a long period, open source work has never been my primary job. It's always been something that I do in my free time or in addition to what I was already doing, similar to my move with the CentOS Project. But what makes Red Hat unique in a way is that this isn't an odd role. A large number of people at Red Hat participate and execute their day job via open source communities. And that makes it a lot easier, being a long-term contributor. There is only one key challenge that one needs to keep in mind when working on an open source project as a part of the day job, though. It is to set realistic expectations around community participation, and recognise that the community is there because its members often care about something far more than the people paid to work on a project. However, this typically isn't a concern when a community comes together around smaller pieces of the code. The CentOS Project has quite a widespread and extensive footprint. It involves talking to and participating in a wider community where a large majority is unknown personally. Managing expectations and ensuring there is respect both ways has been a balancing act that's never easy. It's something I work quite hard on, and hope we can keep getting better at.

Q: Is it solely the community feedback that helps you build new CentOS updates, or do you also consider feature requests from the Red Hat team?
CentOS Linux is built as a downstream from Red Hat's sources. They are delivered via git.centos.org, and then the additional layers and packages are built by various community members. We also encourage people to come and join the development process to build, test and deliver features to the audience that we target through the open source project. All this is entirely community focused. So if someone at Red Hat wants to execute something, they would need to join the relevant community and work that route for engagement on CentOS. Having said that, we have a concept of Special Interest Groups that can be started by anyone, with a specific target in mind. Of course, this is only above the Linux distribution itself.

Q: Apart from your engagements related to being a CentOS Project member, what are your major tasks at Red Hat?
These days, I spend around 30 per cent of my time working on the CentOS Project. Rather than in the project itself, most of my focus is around enablement, and making sure that contributors and developers have the resources they need to succeed. The other 70 per cent of my time is spent as a consulting engineer at Red Hat, working with service teams, helping build best practices in operations roles and modern system patterns for online services. Additionally, I have been involved in some of the work going on in the containers world and its user stories, which includes DevOps, SRE patterns, CI and CD, among others.
Q: What is your advice to those who are planning to begin with CentOS?
The most important thing to remember for those planning to begin with CentOS is that you are not alone. There is lots of documentation, as well as tremendous support and help from a vast community. So as a new user, make sure you engage with the existing user base, ask questions, and spend a bit of time understanding what and how things are. In the long term, understanding the details will pay off. CentOS Linux is built on a lot of past experience. Anyone starting down the path of adoption should keep this in mind. We've tried to build a culture of helping those that need the most help, but we also encourage new users to learn and grow with the community.
Q: What are the biggest features that we can see in the next CentOS release?
The Special Interest Groups (SIGs) are constantly releasing new content. As an example, the Cloud SIG released OpenStack Ocata within a few hours of the upstream project release. There is also a lot of work being done in the Atomic SIG around containers, and in the Virt SIG on existing and upcoming virtualisation technologies.
Q: Lastly, where do you see CentOS in the future of open source?
CentOS Linux, due to its characteristics, is a great fit for areas like Web hosting, Web services, cloud workloads and container delivery. Also, as a platform for long-term community-centric workloads, it is a good option in areas like HPC and IoT. The Linux distribution also specifically suits the needs of the education sector, starting and supporting not only IT education but education as a whole. Moreover, I would like to see CentOS Linux extend its footprint as a vehicle for enablement. And this is where the CentOS Project provides a great space for other open source projects: building services for CI/CD, interacting with the users, finding the niches that people care about and solving real-world problems. If you are involved today with an open source project, I strongly encourage you to get in touch with me and discuss its development areas. We measure our success on the basis of how successful CentOS has been for the people, the communities, the open source projects and the users who have invested their time, resources and support in the CentOS Project. And we look forward to solving more problems, building better solutions and bridging more gaps together.

You can reach Karanbir directly at
[email protected] or meet him on Twitter at @kbsingh
A Primer on Software Defined Networking (SDN) and the OpenFlow Standard

Continuous innovation and the need to overcome the constraints of conventional networking have made software defined networking (SDN) quite popular. It is an approach that decouples the network's control plane from its data plane, allowing network administrators to program the network directly without having to worry about hardware specifics.
OpenFlow, the first SDN standard, is a communication protocol used in software defined networking (SDN). It is managed by the Open Networking Foundation (ONF). The SDN controller, the 'brain' of the network, interacts with the forwarding (data) plane of networking devices like routers and switches via OpenFlow APIs. It empowers network controllers to decide the path of network packets across a network of switches. The OpenFlow protocol makes it possible to move network control out of proprietary network switches and into control software that is open source and locally managed. Software defined networking uses southbound and northbound APIs. The former hand over information to the switches and routers; OpenFlow was the first southbound API. Applications use the northbound APIs to interact with the controller.

Porting an OpenFlow switch in ns-3

The OpenFlow 1.3 module for ns-3, widely known as the OFSwitch13 module, was intended to boost the ns-3 network simulator with SDN technology. An OpenFlow switch is a package that routes packets in the SDN environment. The data plane is referred to as the switch, and the control plane is referred to as the controller. The OpenFlow switch interacts with the controller, and the switch is managed by the controller via the OpenFlow protocol. The fundamental components of the OpenFlow switch (as shown in Figure 2) incorporate at least one flow table, a meter table, a group table and an OpenFlow channel to an exterior controller. The flow tables and group table perform the packet scanning and forwarding functions based on the flow entries configured by the controller. The routing decisions made by the controller are deployed in the switch's flow table. The meter table is used for the measurement and control of the rate of packets.

Configuring the SDN OFSwitch

In this article, we have incorporated OFSwitch13 (version 1.3) with ns-3. To benefit from the features of OFSwitch13, an
Figure 1: Traditional network architecture vs SDN architecture
ofsoftswitch13 library is used. All the commands given below have been tested on Ubuntu 16.04, and may differ for other versions or distributions.
Figure 2: OpenFlow switch components
First, install the prerequisite packages:

$ sudo apt-get install build-essential gcc g++ python git mercurial unzip cmake
$ sudo apt-get install libpcap-dev libxerces-c-dev libpcre3-dev flex bison
$ sudo apt-get install pkg-config autoconf libtool libboost-dev

In order to utilise ofsoftswitch13 as a static library, you need to install the NetBee library, as the ofsoftswitch13 library code relies upon it:

$ wget https://bitbucket.org/ljerezchaves/ofswitch13-module/downloads/nbeesrc.zip
$ unzip nbeesrc.zip
$ cd netbee/src/
$ cmake .
$ make
$ sudo cp ../bin/libn*.so /usr/local/lib
$ sudo ldconfig
$ sudo cp -R ../include/* /usr/include/

Now, clone the repository of the ofsoftswitch13 library, as follows:

$ git clone https://github.com/ljerezchaves/ofsoftswitch13
$ cd ofsoftswitch13
$ ./boot.sh
$ ./configure --enable-ns3-lib
$ make

Integrating OFSwitch with ns-3

To install ns-3.26, use the following command:

$ hg clone http://code.nsnam.org/ns-3.26

In the ns-3.26 directory, download the repository of OFSwitch13, as follows:

$ hg clone https://bitbucket.org/ljerezchaves/ofswitch13-module src/ofswitch13
$ cd src/ofswitch13
$ hg update 3.1.0
$ cd ../..
$ patch -p1 < src/ofswitch13/utils/ofswitch13-src-3_26.patch
$ patch -p1 < src/ofswitch13/utils/ofswitch13-doc-3_26.patch

The file ofswitch13-src-3_26.patch allows OFSwitch to get raw packets from nodes (devices); to do this, it creates a new OpenFlow receive callback at CsmaNetDevice and VirtualNetDevice. The file ofswitch13-doc-3_26.patch is optional but preferable. After successful installation, configure the module, as follows:

$ ./waf configure --with-ofswitch13=path/to/ofsoftswitch13 --enable-examples --enable-tests

Now, we're all set. Just build the simulator using the following command:

$ ./waf

Enjoy the ns-3.26 simulator with the power of SDN, i.e., OFSwitch 1.3.
Simulating the basic network topology with an SDN based OFSwitch

In this section of the article, we'll simulate a basic network topology with three hosts, a switch and a controller. Figure 3 demonstrates the topology of the network that we want to create.
Here, host2 pings the other two hosts, host1 and host3. Whenever either of the hosts makes a ping request, it is forwarded to the switch. This is indicated by the arrows shown in blue. As this is the first request, the switch's flow table will not contain any entry for it; this is known as a table miss. Thus, the request will be forwarded to the controller. The controller will instruct the switch with respect to the routing decision to be made, and will also modify the flow table in the switch. This is shown by the arrows in green. The request will then be forwarded by the switch to the appropriate destination, i.e., to host1 and host3. The next time the same request reaches the switch, the switch's flow table will contain an entry for it and, thus, the switch itself will make the routing decision based on that entry, without the controller in action.

Figure 3: Network topology

The explanation for the code to simulate the above topology is given below; only the required extracts of the code are shown. The following lines demonstrate the extra header files required for simulating wired networks (the exact module headers depend on your ns-3 set-up):

#include <ns3/csma-module.h>
#include <ns3/internet-module.h>

The following lines of code create an object called 'hosts' of class NodeContainer, common to all the other nodes. Here, three hosts are created.

NodeContainer hosts;
hosts.Create (3);

Ptr<Node> switchNode = CreateObject<Node> (); //to create node for switch

CsmaHelper csmaHelper;
NetDeviceContainer hostDevices;
NetDeviceContainer switchPorts;

for (size_t i = 0; i < hosts.GetN (); i++)
{
    NodeContainer pair (hosts.Get (i), switchNode);
    NetDeviceContainer link = csmaHelper.Install (pair);
    hostDevices.Add (link.Get (0)); //two way linking
    switchPorts.Add (link.Get (1));
}

Ptr<Node> controllerNode = CreateObject<Node> (); //to create node for controller

Here, the ofswitch13 domain comes into action:

Ptr<OFSwitch13InternalHelper> of13Helper = CreateObject<OFSwitch13InternalHelper> ();
of13Helper->InstallController (controllerNode); //to install controller on node
of13Helper->InstallSwitch (switchNode, switchPorts); //to install OFSwitch
of13Helper->CreateOpenFlowChannels (); //for creating channels between switch and controller

Ipv4AddressHelper ipv4helpr; //set IPv4 addresses
Ipv4InterfaceContainer hostIpIfaces;
ipv4helpr.SetBase ("10.97.7.0", "255.255.255.0"); //IPv4 range starts from 10.97.7.0
hostIpIfaces = ipv4helpr.Assign (hostDevices);

The lines below configure ping applications between the hosts:

V4PingHelper pingHelper = V4PingHelper (hostIpIfaces.GetAddress (1));
pingHelper.SetAttribute ("Verbose", BooleanValue (true));
ApplicationContainer pingApps1 = pingHelper.Install (hosts.Get (0));
pingApps1.Start (Seconds (1));
ApplicationContainer pingApps2 = pingHelper.Install (hosts.Get (2));
pingApps2.Start (Seconds (1));

Here, the ping application is installed on two hosts, and each install returns an ApplicationContainer. Now, the code for the simulator to work:

Simulator::Stop (Seconds (10)); //simulation time is 10 seconds
Simulator::Run ();
Simulator::Destroy ();

It is recommended that you save ofswitch13-modify.cc at this path (ns-dev/scratch/). To run the program, use the following command:

$ ./waf --run ofswitch13-modify

The output is given in Figure 4. As shown in the figure, the host with the IP address 10.97.7.2 pings the other two hosts, and the ping is
Figure 4: Output of ofswitch13-modify.cc
Figure 5: Output of the log file
successful with nine packets being transmitted. The time statistics are also shown. In the program's code, to view the log file, set trace=true. This will generate switch-stats-1.log in the ns-dev folder.
Visualising the basic working of the controller switch using NetAnim

Network Animator, also known as NetAnim, is used to graphically portray projects in ns-3. It is an offline animator, which animates the XML file generated during a simulation run in ns-3. ns-2 ships with several default animators, but ns-3 is furnished with none, so we have to integrate NetAnim with ns-3. NetAnim version 3.107 is used for the visualisation here. To visualise the above program code (ofswitch13-modify.cc) in NetAnim, add the following few lines of code:
#include <ns3/netanim-module.h> // extra header file for Network Animator in ns-3

AnimationInterface::SetConstantPosition (hosts.Get (0), 50, 50);
AnimationInterface::SetConstantPosition (hosts.Get (1), 10, 60);
AnimationInterface::SetConstantPosition (hosts.Get (2), 40, 25);
AnimationInterface anim ("ofs13-modify.xml");

The above lines of code will set the positions of the hosts (nodes) at the given coordinates on the X-Y plane (refer to the screenshot in Figure 6), and then generate the ofs13-modify.xml file for ofswitch13-modify.cc. In Figure 6, the node with the IP address 10.100.1 (in the upper left corner) represents the OFSwitch with the SDN controller. The other three nodes are the created hosts:

Node 0: 10.97.7.1
Node 1: 10.97.7.2
Node 2: 10.97.7.3

Figure 7 is a screenshot of the generated XML file with the graphical simulation in NetAnim.

Figure 7: Packet flow in NetAnim

The source repository can be downloaded from: https://bitbucket.org/yashsquare/ns3_support

By: Radha Govani, Yash Modi and Jitendra Bhatia
Radha Govani and Yash Modi are open source enthusiasts. You can contact them at [email protected] and [email protected]. Jitendra Bhatia works as assistant professor at Vishwakarma Government Engineering College. You can contact him at [email protected].
Taming the Cloud: Provisioning with Terraform

Terraform is open source software that enables sysadmins and developers to write, plan and create infrastructure as code. It is a no-frills software package that is very simple to set up, and it uses a simple configuration language (or JSON, if you wish).
Terraform is a tool to create and manage infrastructure that works with various IaaS, PaaS and SaaS service providers. It is very simple to set up and use, as there aren't multiple packages, agents and servers involved. You just declare your infrastructure in a single file (or multiple files) using a simple configuration language (or JSON), and that's it. Terraform takes your configurations, evaluates the various building blocks in them to create a dependency graph, and presents you a plan to create the infrastructure. When you are satisfied with the creation plan, you apply the configurations, and Terraform creates independent resources in parallel. Once some infrastructure is created using Terraform, it compares the current state of the infrastructure with the declared configurations on subsequent runs, and only acts upon the changed part of the infrastructure. Essentially, it is a CRUD (Create Read Update Destroy) tool and acts on the infrastructure in an idempotent manner.

Installation and set-up

Terraform is created in Golang, and is provided as a static binary without any install dependencies. You just pick the correct binary (for GNU/Linux, Mac OS X, Windows, FreeBSD, OpenBSD and Solaris) from its download site, unzip it anywhere in your executable's search path, and all is ready to run. The following script could be used to download, unzip and verify the set-up on your GNU/Linux or Mac OS X nodes:

HCTLSLOC='/usr/local/bin'
HCTLSURL='https://releases.hashicorp.com'
# use the latest version shown on https://www.terraform.io/downloads.html
TRFRMVER='x.y.z'

if uname -v | grep -i darwin > /dev/null 2>&1
then
    OS='darwin'
else
    OS='linux'
fi

wget -P /tmp --tries=5 -q -L "${HCTLSURL}/terraform/${TRFRMVER}/terraform_${TRFRMVER}_${OS}_amd64.zip"
sudo unzip -o "/tmp/terraform_${TRFRMVER}_${OS}_amd64.zip" -d "${HCTLSLOC}"
rm -fv "/tmp/terraform_${TRFRMVER}_${OS}_amd64.zip"
terraform version
Concepts that you need to know

You only need to know a few concepts to start using Terraform quickly to create the infrastructure you desire.

'Providers' are the building blocks in Terraform which abstract different cloud services and back-ends to actually CRUD various resources. Terraform gives you different providers to target different service providers and back-ends, e.g., AWS, Google Cloud, Digital Ocean, Docker and a lot of others. You need to provide the attributes applicable to the targeted service/back-end, like the access/secret keys, regions, endpoints, etc, to enable Terraform to create and manage various cloud/back-end resources. Different providers offer various resources which correspond to different building blocks, e.g., VMs, storage, networking, managed services, etc. So only a single provider is required to make use of all the resources implemented in Terraform, to create and manage infrastructure for a service or back-end.

There are 'provisioners' that correspond to different resources, to initialise and configure those resources after their creation. The provisioners mainly do tasks like uploading files, executing remote/local commands/scripts, running configuration management clients, etc.

You need to describe your infrastructure using a simple configuration language, in single or multiple files, all with the .tf extension. The configuration model of Terraform is declarative, and it mainly merges all the .tf files in its working directory at runtime. It resolves the dependencies between various resources by itself, to create the correct final dependency graph and to bring up independent resources in parallel. Terraform could use JSON as well for its configuration language, but that works better when Terraform configurations are generated by automated tools. The Terraform format is more human-readable and supports comments, so you could mix and match .tf and .json configuration files in case some things are human coded and others are tool generated. Terraform also provides the concepts of variables, and functions working on those variables, to store, assign and transform various things at runtime.

The general workflow of Terraform consists of two stages: plan and apply. The plan stage evaluates the merged (or overridden) configs, and presents a plan to the operator about which resources are going to get created, modified and deleted. So the changes required to create your desired infrastructure are pretty clear at the plan stage itself, and there are no surprises at runtime. Once you are satisfied with the plan generated, the apply stage initiates the sequence to create the resources required to build your declared infrastructure.

Terraform keeps a record of the created infra in a state file (by default, terraform.tfstate) and, on every further plan-and-apply cycle, it compares the current state of the infra at runtime with the cached state. After
the comparison of states, it only shows or applies the difference required to bring the infrastructure to the desired state as per its configuration. In this way, it creates and maintains the whole infra in an idempotent manner at every apply stage. You could mark various resources manually to get updated in the next apply phase using the taint operation. You could also clean up the infra created, partially or fully, with the destroy operation.
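Before walking through the full examples below, a minimal sketch may help tie these concepts together. This is not the article's example1.tf; the provider region, AMI id and names here are placeholders:

variable "region" { default = "us-east-1" }

provider "aws" {
  region = "${var.region}"
}

resource "aws_instance" "web" {
  count         = 2
  ami           = "ami-00000000"    # placeholder AMI id
  instance_type = "t2.micro"

  tags {
    Name = "web-${count.index}"
  }
}

output "web_ips" {
  value = ["${aws_instance.web.*.public_ip}"]
}

Running terraform plan against such a config would propose two aws_instance resources; terraform apply would create them, and terraform destroy would clean them up again.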
Working examples and usage

Our first example is to clarify the syntax for various sections in Terraform configuration files. Download the code example1.tf from http://opensourceforu.com/article_source_code/sept17/terraform.zip. The code is a template to bring up multiple instances of AWS EC2 VMs with Ubuntu 14.04 LTS and an encrypted EBS data volume, in a specified VPC subnet, etc. The template also does remote provisioning on the instance(s) brought up, by transferring a provisioning script and doing some remote execution. Now, let's dissect this example, line by line, in order to practically explore the Terraform concepts.

The lines starting with the keyword variable begin the blocks of input variables that store values. The variable blocks allow the assigning of some initial values used as defaults, or no values at all. In case of no default values, Terraform will prompt for the values at runtime, if these values are not set using the option -var 'name=value'. So, in our example, sensitive data like the AWS access/secret keys are not put in the template, as it is advisable to supply these at runtime, either manually, through the command options, or through environment variables. The environment variables should be in the form TF_VAR_name for Terraform to read them. The variables can hold string, list and map types of values, e.g., storing a map of different AMIs and subnets for different AWS regions, as demonstrated in our example. A string value is contained in double quotes, lists in square brackets and maps in curly braces. The variables are referenced, and their values extracted through interpolation at different places, using the syntax ${var.<variable name>}. You could explore everything about Terraform variables on the official variables help page.

It's easy to guess that the block starting with the keyword provider declares and supplies the arguments for the service/back-end. The different providers take different arguments based upon the service/back-end being used, and you can explore those in detail on the official providers page. The resource keyword contains the main meat in any Terraform configuration. We are using two AWS building blocks in our example: aws_instance to bring up instances
and aws_route53_record to create CNAME records for the instances created. Every resource block takes some arguments to customise the resource(s) it creates, and exposes some attributes of the resource(s) created. Each resource block starts with the resource keyword followed by a type and a name, and the important thing is that the type and name combination should be unique in the same Terraform configuration scope. The prefix of each resource is linked to its provider, e.g., all the AWS-prefixed resources require an AWS provider. The simple form of accessing the attribute of a resource is <resource type>.<resource name>.<attribute>. Our example shows that the public_ip and public_dns attributes of the created instances are being accessed in the route53 and output blocks.

Some of the resources require a few post-creation actions, like connecting and running local and/or remote commands, scripts, etc, on the AWS instance(s). The connection block is declared to connect to that resource, e.g., by creating a ssh connection to the created instances in our example. The provisioner blocks are the mechanisms that use the connection to upload file(s) and directories to the created resource(s). The provisioners also run local or remote commands and scripts, or a configuration management client like Chef. You can explore those aspects in detail on the official provisioners help page. Our example uploads a provisioning script and kicks it off remotely over ssh to provision the created instances out-of-the-box.

Terraform provides some meta-parameters available to all the resources, like the count argument in our example. The count.index keeps track of the current resource being created, to reference it now or later; e.g., we are creating a unique name tag for each instance created in our example. Terraform deduces the proper dependencies as we are referencing the attribute of aws_instance in aws_route53_record, so it creates the instances before creating their CNAME records. You could use the meta-variable depends_on in cases where there is no implicit dependency between resources and you want to ensure one explicitly. The above-mentioned variables help page provides detailed information about the meta-variables too.

The last block declared in our example configuration is the output block. As is evident from the name itself, the output block can dump the raw or transformed attributes of the resources created, on demand, at any time. You can also see the usage of various functions, like format and element, in the example configuration. These functions transform the variables into other useful forms; e.g., the element function retrieves the correct public_ip based upon the current index of the instances created. The official interpolation help page provides detailed information about the various functions provided by Terraform.

Now let's look at how to decipher the output dumped when we invoke the different phases of the Terraform workflow. We'll observe the following kind of output if
we execute the command terraform plan -var 'num_nds="3"' after exporting TF_VAR_aws_access_key and TF_VAR_aws_secret_key, in the working directory where the first example config was created:

+ aws_instance.test.0
...
+ aws_instance.test.1
...
+ aws_instance.test.2
...
+ aws_route53_record.test.0
...
+ aws_route53_record.test.1
...
+ aws_route53_record.test.2

Plan: 6 to add, 0 to change, 0 to destroy.
If there is some error in the configuration, it will come up in the plan phase itself, and Terraform dumps the parsing errors. You can explicitly verify the configuration for any issue using the terraform validate command. If all is good, then the plan phase dumps the resources it's going to create (indicated by the + sign before the resources' names, in green) to converge to the declared model of the infrastructure. Similarly, the Terraform plan output represents the resources it's going to delete in red (indicated by the - sign) and the resources it will update in yellow (indicated by the ~ sign). Once you are satisfied with the plan of resource creation, you can run terraform apply to apply the plan and actually start creating the infrastructure.

Our second example is to get you more comfortable with Terraform, and to use its advanced features to create and orchestrate some non-trivial scenarios. The code example2.tf can be downloaded from http://opensourceforu.com/article_source_code/sept17/terraform.zip. It automates the task of bringing up a working cluster out-of-the-box. It brings up a configurable number of multi-disk instances from the cluster payload AMI, and then initiates a specific order of remote provisioners using null_resource: some provisioners on all the nodes, and some only on a specific one. In the example2.tf template, multiple null_resource blocks are triggered in response to the various resources created, on which they depend. In this way, you can see how easily we can orchestrate some not-so-trivial scenarios. You can also see the usage of the depends_on meta-variable to ensure a dependency sequence between various resources. Similarly, you can mark those resources created by Terraform that you want to destroy,
or those resources that you wish to create afresh, using the commands terraform destroy and terraform taint, respectively. The easy way to get quick information about the Terraform commands and their options/arguments is by typing terraform and terraform -h.

The recent versions of Terraform have started to provide data sources, which are resources that gather dynamic information from the various providers. The dynamic information gathered through the data sources is used in the Terraform configurations, most commonly using interpolation. A simple example of a data source is to gather the AMI id for the latest version of an AMI, and use that in the instance provisioning configurations, as shown below:
data "aws_ami" "myami" {
  most_recent = true

  filter {
    name   = "name"
    values = ["MyBaseImage"]
  }
}

resource "aws_instance" "myvm" {
  ami = "${data.aws_ami.myami.id}"
  ...
}

Code organisation and reusability

Although our examples show the entire declarative configuration in a single file, we should break it into more than one file. You could break your whole config into various separate configs based upon the respective functionality they provide. So our first example could be broken into variables.tf that keeps all the variables blocks, aws.tf that declares our provider, instances.tf that declares the layout of the AWS VMs, route53.tf that declares the AWS Route 53 functionality, and output.tf for our outputs. To keep things simple to use and maintain, keep everything related to a whole task being solved by Terraform in a single directory, along with sub-directories named files, scripts, keys, etc. Terraform doesn't enforce any hierarchy of code organisation, but keeping each high-level functionality in its dedicated directory will save you from unexpected Terraform actions in spite of unrelated configuration changes. Remember, in the software world, "A little copying is better than a little dependency," as things get fragile and complicated easily with each added functionality.

Terraform provides the functionality of creating modules to reuse the configs created. The cluster creation template shown above is actually put in a module to use the same code to provision test and/or production clusters. The usage of the module is simply supplying the required variables to it in the manner shown below (after running terraform get to create the necessary link for the module code):

module "myvms" {
  source    = "../modules/awsvms"

  ami_id    = "${var.ami_id}"
  inst_type = "${var.inst_type}"
  key_name  = "${var.key_name}"
  subnet_id = "${var.subnet_id}"
  sg_id     = "${var.sg_id}"
  num_nds   = "${var.num_nds}"
  hst_env   = "${var.hst_env}"
  apps_pckd = "${var.apps_pckd}"
  hst_rle   = "${var.hst_rle}"
  root_size = "${var.root_size}"
  swap_size = "${var.swap_size}"
  vol_size  = "${var.vol_size}"
  zone_id   = "${var.zone_id}"
  prov_scrpt = "${var.prov_scrpt}"
  sub_dmn   = "${var.sub_dmn}"
}

You also need to create a variables.tf in the location of your module source, declaring the same variables you fill in the module call. Here is the module variables.tf to receive the variables supplied by the caller of the module:

variable "ami_id" {}
variable "inst_type" {}
variable "key_name" {}
variable "subnet_id" {}
variable "sg_id" {}
variable "num_nds" {}
variable "hst_env" {}
variable "apps_pckd" {}
variable "hst_rle" {}
variable "root_size" {}
variable "swap_size" {}
variable "vol_size" {}
variable "zone_id" {}
variable "prov_scrpt" {}
variable "sub_dmn" {}
The official Terraform documentation consists of a few detailed sections on module usage and creation, which should provide you more information on everything related to modules.
Importing existing resources

As we have seen earlier, Terraform caches the properties of the resources it creates into a state file, and by default doesn't know about the resources not created through it.
Let’s Try Admin But recent versions of Terraform have introduced a feature to import existing resources not created through Terraform into its state file. Currently, the import feature only updates the state file, but the user needs to create the configuration for the imported resources. Otherwise, Terraform will show the imported resources with no configuration and mark those for destruction. Let’s make this clear by importing an AWS instance, which wasn’t brought up through Terraform, into some Terraform-created infrastructure. You need to run the command terraform import aws_instances. in the directory where a Terraform state le is located. After the successful import, Terraform gathers information about the instance and adds a corresponding section in the state le. If you see the Terraform plan now, it’ll show something like what follows: - aws_instance .
Resource Name>
So it means that now you need to create a corresponding conguration in an existing or new .tf le. In our example, the following Terraform section should be enough to not let Terraform destroy the imported resource. resource “aws_instance” “” { ami = “” instance_type = “” tags { ... } }
Please note that you only need to mention the Terraform resource attributes that are required as per the Terraform documentation. Now, if you see the Terraform plan, the earlier destruction plan goes away for the imported resource. You could use the following command to extract the attributes of the imported resource to create its configuration:
sed -n '/aws_instance./,/}/p' terraform.tfstate | \
grep -E 'ami|instance_type|tags' | grep -v '%' | \
sed 's/^ *//' | sed 's/:/ =/'

Please pay attention when you import a resource into your current Terraform state and then decide not to use it going forward. In that case, don't forget to rename your terraform.state.backup to terraform.state to roll back to the previous state. As an alternative, you could also delete that resource block from your state file, but that is not a recommended approach. Otherwise, Terraform will try to delete the imported but undesired resource, and that could be catastrophic in some cases.

The official Terraform documentation provides clear examples of importing the various resources into an existing Terraform infrastructure. But if you are looking to include existing AWS resources in the AWS infra created by Terraform in a more automated way, then take a look at the Terraforming tool link in the References section.

Note: Terraform providers are no longer distributed as part of the main Terraform distribution. Instead, they are installed automatically as part of running terraform init. The import command requires that imported resources be specified in the configuration file. Please see the Terraform changelog at https://github.com/hashicorp/terraform/blob/v0.10.0/CHANGELOG.md for these changes.

Missing bytes

You should now be feeling comfortable about starting to automate the provisioning of your cloud infrastructure. To be frank, Terraform is now so feature-rich that it can't be fully covered in a single article, or even several, and deserves a dedicated book (which has already shaped up in the form of the ebook, 'Terraform: Up & Running'). So you could further take a look at the examples provided in its official Git repo. Also, the References section offers a few pointers to some excellent reads to make you more comfortable and confident with this excellent cloud provisioning tool.

Creating on-demand and scalable infrastructure in the cloud is not very difficult if some very simple basic principles are adopted and implemented using feature-rich but no-fuss, easy-to-use standalone tools. Terraform is an indispensable tool for creating and managing cloud infrastructure in an idempotent way across a number of cloud providers. It can further be glued together with some other management pieces to create an immutable infrastructure workflow that can tame any kind of modern cloud infrastructure. The 'Terraform: Up and Running' ebook is now also out as a print book.

References

[1] Terraform examples: https://github.com/hashicorp/terraform/tree/master/examples
[2] Terraforming tool: https://github.com/dtan4/terraforming
[3] A Comprehensive Guide to Terraform: https://blog.gruntwork.io/a-comprehensive-guide-to-terraform-b3d32832baca#.ldiays7wk
[4] Terraform: Up & Running: http://www.terraformupandrunning.com/?ref=gruntwork-blog-comprehensive-terraform
By: Ankur Kumar
The author is a systems and infrastructure developer/architect and FOSS researcher, currently based in the US. You can find some of his other writings on FOSS at: https://github.com/richnusgeeks.
Visualising the Response Time of a Web Server Using Wireshark

The versatile Wireshark tool can be put to several uses. This article presents a tutorial on using Wireshark to discover and visualise the response time of a Web server.
Wireshark is a cross-platform network analysis tool used to capture packets in real time. Wireshark includes filters, flow statistics, colour coding, and other features that allow you to get a deep insight into network traffic and to inspect individual packets. Discovering delayed HTTP responses for a particular HTTP request from a particular PC is a tedious task for most admins. This tutorial will teach readers how to discover and visualise the response time of a Web server using Wireshark. OSFY has published many articles on Wireshark, which you can refer to for a better understanding of the topic.

Step 1: Start capturing the packets using Wireshark on a specified interface to which you are connected. Refer to the
bounding box in Figure 1 for the available interfaces. In this tutorial, we are going to capture Wi-Fi packets, so the option 'Wi-Fi' has been selected (if you wish to capture the packets using Ethernet or any other interface, select the corresponding option).

Step 2: Here, we make a request to http://www.wikipedia.org and, as a result, Wikipedia sends an HTTP response of '200 OK', which indicates the requested action was successful. '200 OK' implies that the response contains a payload, which represents the status of the requested resource. Now filter all the HTTP packets as shown in Figure 2, as follows:

Syntax: http
Figure 1: Interface selection
Figure 2: Filtering HTTP
Step 3: We now filter the requests and responses sent from the local PC to Wikipedia, and vice versa. Start filtering on the IP of www.wikipedia.org (a simple traceroute or pathping can reveal the IP address of any Web server) and your local PC's IP (a simple ipconfig on Windows or ifconfig on Linux can reveal your local PC's IP).

Syntax: ip.addr == 91.198.174.192 && ip.addr == 192.168.155.59
Figure 3: Allow sub-dissector to reassemble TCP streams
Step 4: In order to view the response time of HTTP, right-click on any response packet (HTTP/1.1), go to 'Protocol preferences', and then uncheck 'Allow sub-dissector to reassemble TCP streams' (marked and shown in Figure 3). If the TCP preference 'Allow sub-dissector to reassemble TCP streams' is off, http.time will be the time between the GET request and the first packet of the response, the one containing 'OK'. If 'Allow sub-dissector to reassemble TCP streams' is on and the HTTP reassembly preferences have been left at their defaults (on), http.time will be the time between the GET request and the last packet of the response.

Procedure: Right-click on any HTTP response packet -> Protocol preferences -> uncheck 'Reassemble HTTP headers spanning multiple TCP segments' and 'Reassemble HTTP bodies spanning multiple TCP segments'.

Step 5: Create a filter based on the response time, as shown in Figure 4, and visualise the HTTP responses using an I/O graph, as shown in Figure 5.
Figure 4: Response time
Syntax: http.time >= 0.050000
Figure 5: Statistics --> I/O graph
Step 6: To calculate the delta (delay) time between a request and its response, use Time Reference (Ctrl-T in the GUI) for easy delta time calculation.

Step 7: In order to display only the delayed HTTP responses, add the filter http.time >= 0.0500 in the display filter. The graph shown in Figure 6 depicts the result of the HTTP responses (delta time).
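If you prefer the command line, the same hunt for slow responses can be scripted with tshark, the terminal front-end that ships with Wireshark. The following sketch assumes a capture saved as capture.pcap (a placeholder name); it prints the responding server's address and the HTTP response time for every response slower than 50ms:

$ tshark -r capture.pcap -Y "http.time >= 0.05" -T fields -e ip.src -e http.time

Each output line pairs the server IP with the delta time in seconds, which is handy for feeding into further scripts or spreadsheets.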
Figure 6: Visualisation of HTTP responses

By: M. Kannan, Poomanam and Prema Latha
M. Kannan is an associate professor and head of the department of electronics engineering, Madras Institute of Technology. His research interests include computer networks, VLSI, embedded systems and wireless security. Poomanam and Prema Latha are specialists in VLSI at the Madras Institute of Technology, Anna University. Their research interests include computer networks, VLSI and embedded design.
DevOps Series
Creating a Virtual Machine for Erlang/OTP Using Ansible
This seventh article in the DevOps series is a tutorial on how to create a test virtual machine (VM) to compile, build, and test Erlang/OTP from its source code. You can then adapt the method to create different VMs for various Erlang releases.
Erlang is a programming language designed by Ericsson primarily for soft real-time systems. The Open Telecom Platform (OTP) consists of libraries, applications and tools to be used with Erlang to implement services that require high availability. In this article, we will create a test virtual machine (VM) to compile, build, and test Erlang/OTP from its source code. This allows you to create VMs with different Erlang release versions for testing. The Erlang programming language was developed by Joe Armstrong, Robert Virding and Mike Williams in 1986, and released as free and open source software in 1998. It was initially designed to work with telecom switches, but is widely used today in large scale, distributed systems. Erlang is a concurrent and functional programming language, and is released under the Apache License 2.0.
Setting it up

A CentOS 6.8 virtual machine (VM) running on KVM is used for the installation. Internet access should be available from the guest machine. The VM should have at least 2GB of RAM allotted to build the Erlang/OTP documentation. The Ansible version used on the host (Parabola GNU/Linux-libre x86_64) is 2.3.0.0. The ansible/ folder contains the following files:

ansible/inventory/kvm/inventory
ansible/playbooks/configuration/erlang.yml

The IP address of the guest CentOS 6.8 VM is added to the inventory file as shown below:

erlang ansible_host=192.168.122.150 ansible_connection=ssh ansible_user=bravo ansible_password=password

An entry for the erlang host is also added to the /etc/hosts file as indicated below:

192.168.122.150 erlang

A 'bravo' user account is created on the test VM, and is added to the 'wheel' group. The /etc/sudoers file also has the following line uncommented, so that the 'bravo' user will be able to execute sudo commands:

## Allows people in group wheel to run all commands
%wheel ALL=(ALL) ALL

We can obtain the Erlang/OTP sources from a stable tarball, or clone the Git repository. The steps involved in both these cases are discussed below.

Building from the source tarball
The Erlang/OTP stable releases are available at http://www.erlang.org/downloads. The build process is divided into many
steps, and we shall go through each one of them. The version of Erlang/OTP can be passed as an argument to the playbook. Its default value is the release 19.0, and it is defined in the variables section of the playbook as shown below:

vars:
  ERL_VERSION: "otp_src_{{ version | default('19.0') }}"
  ERL_DIR: "{{ ansible_env.HOME }}/installs/erlang"
  ERL_TOP: "{{ ERL_DIR }}/{{ ERL_VERSION }}"
  TEST_SERVER_DIR: "{{ ERL_TOP }}/release/tests/test_server"

The ERL_DIR variable represents the directory where the tarball will be downloaded, and the ERL_TOP variable refers to the top-level directory location containing the source code. The path to the test directory from where the tests will be invoked is given by the TEST_SERVER_DIR variable.

Erlang/OTP has mandatory and optional package dependencies. Let's first update the software package repository, and then install the required dependencies, as indicated below:

tasks:
  - name: Update the software package repository
    become: true
    yum:
      name: '*'
      update_cache: yes

  - name: Install dependencies
    become: true
    package:
      name: "{{ item }}"
      state: latest
    with_items:
      - wget
      - make
      - gcc
      - perl
      - m4
      - ncurses-devel
      - sed
      - libxslt
      - fop

The Erlang/OTP sources are written in the C programming language. The GNU C Compiler (GCC) and GNU Make are used to compile the source code. The libxslt and fop packages are required to generate the documentation. The build directory is then created, the source tarball is downloaded, and it is extracted to the directory mentioned in ERL_DIR.

  - name: Create destination directory
    file: path="{{ ERL_DIR }}" state=directory

  - name: Download and extract Erlang source tarball
    unarchive:
      src: "http://erlang.org/download/{{ ERL_VERSION }}.tar.gz"
      dest: "{{ ERL_DIR }}"
      remote_src: yes

The configure script is available in the sources, and it is used to generate the Makefile based on the installed software. The make command builds the binaries from the source code.

  - name: Build the project
    command: "{{ item }} chdir={{ ERL_TOP }}"
    with_items:
      - ./configure
      - make
    environment:
      ERL_TOP: "{{ ERL_TOP }}"

After the make command finishes, the bin folder in the top-level sources directory will contain the Erlang erl interpreter. The Makefile also has targets to run tests to verify the built binaries. We are remotely invoking the test execution from Ansible, and hence -noshell -noinput are passed as arguments to the Erlang interpreter, as shown below:

  - name: Prepare tests
    command: "{{ item }} chdir={{ ERL_TOP }}"
    with_items:
      - make release_tests
    environment:
      ERL_TOP: "{{ ERL_TOP }}"

  - name: Execute tests
    shell: "cd {{ TEST_SERVER_DIR }} && {{ ERL_TOP }}/bin/erl -noshell -noinput -s ts install -s ts smoke_test batch -s init stop"

You need to verify that the tests have passed successfully by checking the $ERL_TOP/release/tests/test_server/index.html page in a browser. A screenshot of the test results is shown in Figure 1. The built executables and libraries can then be installed on the system using the make install command. By default, the install directory is /usr/local.

  - name: Install
    command: "{{ item }} chdir={{ ERL_TOP }}"
    with_items:
      - make install
    become: true
    environment:
      ERL_TOP: "{{ ERL_TOP }}"
Figure 1: Test results

The documentation can also be generated and installed as shown below:

  - name: Make docs
    shell: "cd {{ ERL_TOP }} && make docs"
    environment:
      ERL_TOP: "{{ ERL_TOP }}"
      FOP_HOME: "{{ ERL_TOP }}/fop"
      FOP_OPTS: "-Xmx2048m"

  - name: Install docs
    become: true
    shell: "cd {{ ERL_TOP }} && make install-docs"
    environment:
      ERL_TOP: "{{ ERL_TOP }}"

The total available RAM (2GB) is specified in the FOP_OPTS environment variable. The complete playbook to download, compile, execute the tests, and also generate the documentation is given below:
with_items: - ./congure - make
environment: ERL_TOP: “{{ ERL_TOP }}”
--- name: Setup Erlang build
- name: Prepare tests
hosts: erlang gather_facts: true
command: “{{ item }} chdir={{ ERL_TOP }}”
tags: [release]
- make release_tests
vars:
with_items: environment: ERL_TOP: “{{ ERL_TOP }}”
ERL_VERSION: “otp_src_{{ version | default(‘19.0’) }}” ERL_DIR: “{{ ansible_env.HOME }}/installs/erlang” ERL_TOP: “{{ ERL_DIR }}/{{ ERL_VERSION }}” TEST_SERVER_DIR: “{{ ERL_TOP }}/release/tests/test_
- name: Execute tests shell: “cd {{ TEST_SERVER_DIR }} && {{ ERL_TOP }}/bin/ erl -noshell -noinput -s ts install -s ts smoke_test batch -s
    - name: Install
      command: "{{ item }} chdir={{ ERL_TOP }}"
      with_items:
        - make install
      become: true
      environment:
        ERL_TOP: "{{ ERL_TOP }}"

    - name: Make docs
      shell: "cd {{ ERL_TOP }} && make docs"
      environment:
        ERL_TOP: "{{ ERL_TOP }}"
        FOP_HOME: "{{ ERL_TOP }}/fop"
        FOP_OPTS: "-Xmx2048m"

    - name: Install docs
      become: true
      shell: "cd {{ ERL_TOP }} && make install-docs"
      environment:
        ERL_TOP: "{{ ERL_TOP }}"
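The inventory file used in the commands below is not reproduced in the article; a hypothetical entry for the 'erlang' host group (the host address and user are placeholders) could look like this:

# inventory/kvm/inventory (illustrative contents only)
[erlang]
erlang-vm ansible_host=192.168.122.100 ansible_user=fedora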
The playbook can be invoked as follows:

$ ansible-playbook -i inventory/kvm/inventory playbooks/configuration/erlang.yml -e "version=19.0" --tags "release" -K

Building from the Git repository
We can build the Erlang/OTP sources from the Git repository too. The complete playbook is given below for reference:

---
- name: Setup Erlang Git build
  hosts: erlang
  gather_facts: true
  tags: [git]
  vars:
    GIT_VERSION: "otp"
    ERL_DIR: "{{ ansible_env.HOME }}/installs/erlang"
    ERL_TOP: "{{ ERL_DIR }}/{{ GIT_VERSION }}"
    TEST_SERVER_DIR: "{{ ERL_TOP }}/release/tests/test_server"
  tasks:
    - name: Update the software package repository
      become: true
      yum:
        name: '*'
        update_cache: yes

    - name: Install dependencies
      become: true
      package:
        name: "{{ item }}"
        state: latest
      with_items:
        - wget
        - make
        - gcc
        - perl
        - m4
        - ncurses-devel
        - sed
        - libxslt
        - fop
        - git
        - autoconf

    - name: Create destination directory
      file: path="{{ ERL_DIR }}" state=directory

    - name: Clone the repository
      git:
        repo: "https://github.com/erlang/otp.git"
        dest: "{{ ERL_DIR }}/otp"

    - name: Build the project
      command: "{{ item }} chdir={{ ERL_TOP }}"
      with_items:
        - ./otp_build autoconf
        - ./configure
        - make
      environment:
        ERL_TOP: "{{ ERL_TOP }}"

The 'git' and 'autoconf' software packages are required for downloading and building the sources from the Git repository. The Ansible Git module is used to clone the remote repository. The source directory provides an otp_build script to create the configure script. You can invoke the above playbook as follows:

$ ansible-playbook -i inventory/kvm/inventory playbooks/configuration/erlang.yml --tags "git" -K

You are encouraged to read the complete installation documentation at https://github.com/erlang/otp/blob/master/HOWTO/INSTALL.md.

By: Shakthi Kannan
The author is a free software enthusiast and blogs at shakthimaan.com.
An Introduction to govcsim (a vCenter Server Simulator) govcsim is a vCenter Server and ESXi API based simulator that offers a quick fix solution for prototyping and testing code. It simulates the vCenter Server model and can be used to create data centres, hosts, clusters, etc.
govcsim (a vCenter Server simulator) is an open source vCenter Server and ESXi API based simulator written in the Go language, using the govmomi library. govcsim simulates the vCenter Server model by creating various vCenter related objects like data centres, hosts, clusters, resource pools, networks and datastores. If you are a software developer or quality engineer who works with vCenter and related technologies, then you can use govcsim for fast prototyping and for testing your code. In this article, we will write an Ansible Playbook to gather all the VMs installed on a given govcsim installation. Ansible provides many modules for managing and maintaining VMware resources. (You can find out more about Ansible modules for managing VMware at http://docs.ansible.com/ansible/list_of_cloud_modules.html#vmware.) Do note that govcsim simulates environments almost identical to those provided by VMware vCenter and ESXi servers.
Installation
We will use Fedora 26 for the installation of govcsim. Let's assume that Ansible has already been installed using dnf or a source tree. The requirements for installing govcsim are:
1. Golang 1.7+
2. Git

Step 1: Installing Golang
To install the Go tools, type the following command at the terminal:

$ sudo dnf install -y golang

Step 2: Configuring the Golang workspace
Use the following commands to configure the Golang workspace:

$ mkdir -p $HOME/go
$ echo 'export GOPATH=$HOME/go' >> $HOME/.bashrc
$ source $HOME/.bashrc

Check if everything is working by using the command given below:

$ go env GOPATH
Figure 1: Getting help from vcsim

Figure 2: Starting vcsim without any parameters

This should return your home directory path with the Go workspace.

Step 3: Download govcsim using the 'go get' command

$ go get github.com/vmware/govmomi/vcsim
$ $GOPATH/bin/vcsim -h

If everything is configured correctly, you will be able to get the help options related to govcsim. To start govcsim without any arguments, use the following command:

$ vcsim
Now, govcsim is working. You can check out the various methods available by visiting https://127.0.0.1:8989/about in your favourite browser.

Testing govcsim with Ansible
Now, let's try to write a simple Ansible Playbook, which will list all the VMs emulated by govcsim. The complete code is given in Figure 3, and a sketch along the same lines follows below; you can read up more about Ansible at https://docs.ansible.com/ansible/. After running the playbook from Figure 3, you will get a list of virtual machine objects that are simulated by the govcsim server (see Figure 4). You can play around and write different playbooks to get information about govcsim simulated VMware objects.
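Since Figure 3 is only reproduced as a screenshot in print, here is a hedged sketch of a playbook along the same lines, using the vmware_vm_facts module from the VMware module family mentioned above. The credentials are placeholders (vcsim accepts arbitrary ones), and note that vcsim listens on port 8989, which newer Ansible releases can target with a port parameter:

---
# Sketch only - not the author's exact playbook from Figure 3.
- name: List all VMs simulated by govcsim
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Gather facts about the simulated virtual machines
      vmware_vm_facts:
        hostname: 127.0.0.1     # vcsim's default address
        username: user          # placeholder - vcsim accepts any credentials
        password: pass          # placeholder
        validate_certs: no
      register: vm_facts

    - name: Show the VM objects returned by the simulator
      debug:
        var: vm_facts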
Figure 3: Ansible Playbook to get details about the virtual machine

Figure 4: Ansible in action

References
[1] Ansible documentation: https://docs.ansible.com/ansible
[2] Govcsim: https://github.com/vmware/govmomi/tree/master/vcsim

By: Abhijeet Kasurde
The author works at Red Hat and is a FOSS evangelist. He loves to explore new technologies and software. You can contact him at [email protected].
Serverless Architectures: Demystifying Serverless Computing
Serverless architectures refer to applications that depend a lot on third party services, known as BaaS (Backend as a Service), or on custom code which runs on FaaS (Function as a Service).
In the 1990s, Neal Ford (now at ThoughtWorks) was working in a small company that focused on a technology called Clipper. By writing an object-oriented framework based on Clipper, DOS applications were built using dBase. With the expertise the firm had on Clipper, it ran a thriving training and consulting business. Then, all of a sudden, this Clipper-based business disappeared with the rise of Windows. So Neal Ford and his team went scrambling to learn and adopt new technologies. "Ignore the march of technology at your peril," is the lesson that one can learn from this experience.
Many of us live inside 'technology bubbles'. It is easy to get cozy and lose track of what is happening around us. All of a sudden, when the bubble bursts, we are left scrambling to find a new job or business. Hence, it is important to stay relevant. In the 90s, that meant catching up with things like graphical user interfaces (GUIs), client/server technologies and later, the World Wide Web. Today, relevance is all about being agile and leveraging the cloud, machine learning, artificial intelligence, etc.
With this background, let's delve into serverless computing, which is an emerging field. In this article, readers will learn how to employ the serverless approach in their applications and discover key serverless technologies; we will end the discussion by looking at the limitations of the serverless approach.
Why serverless?
Most of us remember using server machines of one form or another. We remember logging remotely to server machines and working with them for hours. We had cute names for the servers - Bailey, Daisy, Charlie, Ginger, and Teddy - treating them well and taking care of them fondly. However, there were many problems in using physical servers like these:
• Companies had to do capacity planning and predict their future resource requirements.
• Purchasing servers meant high capital expenses (capex) for companies.
• We had to follow lengthy procurement processes to purchase new servers.
• We had to patch and maintain the servers … and so on.
The cloud and virtualisation provided a level of flexibility that we hadn't known with physical servers. We didn't have to follow lengthy procurement processes, or worry about who 'owns the server', or why only a particular team had 'exclusive access to that powerful server', etc. The task of procuring physical machines became obsolete with the arrival
of virtual machines (VMs) and the cloud. The architecture we used also changed. For example, instead of scaling up by adding more CPUs or memory to physical servers, we started 'scaling out' by adding more machines as needed, but in the cloud. This model gave us the flexibility of an opex-based (operational expenses-based) revenue model. If any of the VMs went down, we got new VMs spawned in minutes. In short, we started treating servers as 'cattle' and not 'pets'.
However, the cloud and virtualisation came with their own problems and still have many limitations. We are still spending a lot of time managing them - for example, bringing VMs up and down, based on need. We have to architect for availability and fault-tolerance, size workloads, and manage capacity and utilisation. If we have dedicated VMs provisioned in the cloud, we still have to pay for the reserved resources (even if it's just idle time). Hence, moving from a capex model to an opex one is not enough. What we need is to only pay for what we are using (and not more than that) and 'pay as you go'. Serverless computing promises to address exactly this problem.
The other key aspect is agility. Businesses today need to be very agile. Technology complexity and infrastructure operations cannot be used as an excuse for not delivering value at scale. Ideally, much of the engineering effort should be focused on providing functionality that delivers the desired experience, and not on monitoring and managing the infrastructure that supports the scale requirements. This is where serverless shines.

Figure 1: Key serverless platforms (AWS Lambda, Apache OpenWhisk, MS Azure Functions, Google Cloud Functions)

What is serverless?
Consider a chatbot for booking movie tickets - let's call it MovieBot. Any user can make queries about movies, book tickets, or cancel them in a conversational style (e.g., "Is 'Dunkirk' playing in Urvashi Theatre in Bengaluru tonight?" in voice or text).
This solution requires three elements: a chat interface channel (like Skype or Facebook Messenger), a natural language processor (NLP) to understand the user's intentions (e.g., 'book a ticket', 'ticket availability', 'cancellation', etc), and then access to a back-end where the transactions and data pertaining to movies are stored. The chat interface channels are universal and can be used for different kinds of bots. NLP can be implemented using technologies like AWS Lex or IBM Watson. The question is: how is the back-end served? Would you set up a dedicated server (or a cluster of servers), an API gateway, deploy load balancers, or put in place identity and access control mechanisms? That's costly and painful, right? That's where serverless technology can help.
The solution is to set up some compute capacity to process data from a database and also execute this logic in a language of choice. For example, if you are using the AWS platform, you can use DynamoDB for the back-end, write the programming logic as Lambda functions, and expose them through the AWS API Gateway with a load balancer. This entire set-up does not require you to provision any infrastructure or have any knowledge about the underlying servers/VMs in the cloud. You can use a database of your choice for the back-end. Then choose any programming language supported in AWS Lambda, including Java, Python, JavaScript and C#. There is no cost involved if there aren't any users using the MovieBot. If a blockbuster like 'Baahubali' is released, then there could be a huge surge in users accessing the MovieBot at the same time, and the set-up would effortlessly scale (you have to pay for the calls, though). Phew! You essentially engineered a serverless application.
With this, it's time to define the term 'serverless'. Serverless architectures refer to applications that significantly depend on third-party services (known as Backend-as-a-Service or BaaS) or on custom code that's run in ephemeral containers (Function-as-a-Service or FaaS). Hmm, that's a mouthful of words; so let's dissect this description.
Backend-as-a-Service: Typically, databases (often NoSQL flavours) hold the data and can be accessed over the cloud, and a service can be used to help access that back-end. Such a back-end service is referred to as BaaS.
Function-as-a-Service: Code that processes the requests (i.e., the 'programming logic' written in your favourite programming language) could be run on containers that are spun up and destroyed as needed. They are known as FaaS.
The word 'serverless' is misleading because it literally means there are no servers. Actually, the word implies, "I don't care what a server is." In other words, serverless enables us to create applications without thinking about servers, i.e., we can build and run applications or services without worrying about provisioning, managing or scaling the underlying infrastructure. Just put your code in the cloud and run it! Keep in mind that this applies to Platform-as-a-Service (PaaS) as well; although you may not deal with VMs directly with PaaS, you still have to deal with instance sizes and capacity.
Think of serverless as a piece of functionality to run - not on your machine but executed remotely. Typically, serverless functions are executed in an 'event-driven' fashion - the functions get executed in response to events or requests over HTTP. In the case of the MovieBot, the Lambda functions are invoked to serve user queries as and when users interact with it.
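The article stays at the conceptual level here, so purely as an illustration, a MovieBot-style Lambda handler in Python might look like the sketch below. The event fields ('intent', 'movie', 'theatre') are hypothetical; a real deployment would receive whatever schema AWS Lex or the API Gateway is configured to send.

# A hypothetical Lambda handler for MovieBot. The event schema shown here
# (intent/movie/theatre keys) is illustrative, not an AWS-defined format.
def lambda_handler(event, context):
    intent = event.get('intent')
    if intent == 'ticket_availability':
        movie = event.get('movie', 'unknown movie')
        theatre = event.get('theatre', 'unknown theatre')
        # In a real bot, query DynamoDB (or any back-end) here.
        reply = "Checking availability of {0} at {1}".format(movie, theatre)
        return {'statusCode': 200, 'body': reply}
    return {'statusCode': 400, 'body': 'Sorry, I did not understand that.'}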
Use cases
With serverless architecture, developers can deploy certain types of solutions at scale with cost-effectiveness. We have already discussed developing chatbots - it is a classic use case for serverless computing. Other key use cases for the serverless approach are given below.
1) Three-tier Web applications: Conventional single page applications (SPA), which rely on REpresentational State Transfer (REST) based services to perform a given functionality, can be re-written to leverage serverless functions front-ended by an API gateway. This is a powerful pattern that helps your application scale infinitely, without the concerns of configuring scale-out or infrastructure resources.
2) Scalable batch jobs: Batch jobs were traditionally run as daemons or background processes on dedicated VMs. More often than not, this approach hit scalability and reliability issues - developers would leave their critical processes with Single Points of Failure (SPoF). With the serverless approach, batch jobs can now be redesigned as a chain of mappers and reducers, each running as independent functions. Such mappers and reducers will share a common data store, something like a blob storage or a queue, and can individually scale up to meet the data processing needs.
3) Stream processing: Related to scalable batch jobs is the pattern of ingesting and processing large streams of data for near-real-time processing. Streams from services like Kafka and Kinesis can be processed by serverless functions, which can be scaled seamlessly to reduce latency and increase the throughput of the system. This pattern can elegantly handle spiky loads as well.
4) Automation/event-driven processing: Perhaps the first application of serverless computing was automation. Functions could be written to respond to certain alerts or events. These could also be periodically scheduled to augment the capabilities of the cloud service provider through extensibility.
The kinds of applications that are best suited for serverless architectures include mobile back-ends, data processing systems (real-time and batch) and Web applications. In general, serverless architecture is suitable for any distributed system that reacts to events or processes workloads dynamically, based on demand. For example, serverless computing is suitable for processing events from IoT (Internet of Things) devices, processing large data sets (in Big Data) and intelligent systems that respond to queries (chatbots).
Serverless technologies
There are many proprietary and a few open source serverless technologies and platforms available for us to choose from. AWS Lambda is the earliest (announced in late 2014 and released in 2015) and the most popular serverless technology, while other players are fast catching up. Microsoft's Azure Functions has good support for a wider variety of languages and integrates with Microsoft's Azure services. Google's Cloud Functions is currently in beta. One of the key open source players in serverless technologies is Apache OpenWhisk, backed by IBM and Adobe.
It is often tedious to develop applications directly on these platforms (AWS, Azure, Google and OpenWhisk). The Serverless framework is a popular solution that aims to ease application development on these platforms. Many solutions (especially open source) focus on abstracting away the details of container technologies like Docker and Kubernetes. Hyper.sh provides a container hosting service in which you can use Docker images directly in serverless style. Kubeless from Bitnami, Fission from Platform9, and funktion from Fabric8 are serverless frameworks that provide an abstraction over Kubernetes.
Given that serverless architecture is an emerging approach, technologies are still evolving and are yet to mature. So you will see a lot of action in this space in the years to come.

Join us at the India Serverless Summit 2017
These are the best of times, and these are the worst of times! There are so many awesome new technologies to catch up on. But, we simply can't. We have seen a progression of computing models - from virtualisation, IaaS, PaaS, containers, and now, serverless - all in a matter of a few years. You certainly don't want to be left behind. So join us at the Serverless Summit, India's first confluence on serverless technologies, being held on October 27, 2017 at Bengaluru. It is the best place to hear from industry experts, network with technology enthusiasts, as well as learn about how to adopt serverless architecture. The keynote speaker is John Willis, director of ecosystem development at Docker and a DevOps guru (widely known for the book 'The DevOps Handbook' that he co-authored). Open Source For You is the media partner and the Cloud Native Computing Foundation is the community partner for this summit. For more details, please visit the website www.inserverless.com.

Challenges in going serverless
Despite the fact that a few large businesses are already powered entirely by serverless technologies, we should keep in mind that serverless is an emerging approach. There are many challenges we need to deal with when developing serverless solutions. Let us discuss them in the context of the MovieBot example mentioned earlier.
Debugging: Unlike in typical application development, there is no concept of a local environment for serverless functions. Even fundamental debugging operations like stepping-through, breakpoints, step-over and watch points are not available with serverless functions. As of now, we need to rely on extensive logging and instrumentation for debugging.
When MovieBot provides an inconsistent response or does not understand the intent of the user, how do we debug the code that is running remotely? For situations such as this, we have to log numerous details: NLP scores, the dialogue responses, query results of the movie ticket database, etc. Then we have to manually analyse and do detective work to find out what could have gone wrong. And that is painful.
State management: Although serverless is inherently stateless, real-world applications invariably have to deal with state. Orchestrating a set of serverless functions becomes a significant challenge when there is a common context that has to be passed between them.
Any chatbot conversation represents a dialogue. It is important for the program to understand the entire conversation. For example, for the query, "Is 'Dunkirk' playing in Urvashi Theatre in Bengaluru tonight?" if the answer from MovieBot is "Yes", then the next query from the user could be, "Are two tickets available?" If MovieBot confirms this, the user could say, "Okay, book it." For this transaction to work, MovieBot should remember the entire dialogue, which includes the name of the movie, the theatre's location, the city, and the number of tickets to book. This entire dialogue represents a sequence of stateless function calls. However, we need to persist this state for the final transaction to be successful. This maintenance of state external to functions is a tedious task.
Vendor lock-in: Although we talk about isolated functions that are executed independently, we are in practice tied to the SDK (software development kit) and the services provided by the serverless technology platform. This could result in vendor lock-in, because it is difficult to migrate to other equivalent platforms.
Let's assume that we implement the MovieBot on the AWS Lambda platform using Python. Though the core logic of the bot is written as Lambda functions, we need to use other related services from the AWS platform for the chatbot to work, such as AWS Lex (for NLP), the AWS API Gateway, DynamoDB (for data persistence), etc. Further, the bot code may need to make use of the AWS SDK to consume the services (such as S3 or DynamoDB), and that is written using boto3. In other words, for the bot to be a reality, it needs to consume many more services from the AWS platform than just the Lambda function code written in plain Python. This results in vendor lock-in, because it is harder to migrate the bot to other platforms.
Other challenges: Each serverless function's code will typically have third party library dependencies. When deploying the serverless function, we need to deploy the third party dependency packages as well, and that increases the deployment package size. Because containers are used underneath to execute the serverless functions, the increased deployment size increases the latency to start up and execute the serverless functions.
Further, maintaining all the dependent packages, versioning them, etc, is a practical challenge as well. Another challenge is the lack of support for widely used languages on serverless platforms. For instance, as of May 2017, you can write functions in C#, Node.js (4.3 and 6.10), Python (2.7 and 3.6) and Java 8 on AWS Lambda. How about other languages like Go, PHP, Ruby, Groovy, Rust or any others of your choice? Though there are solutions to write serverless functions in these languages and execute them, it is harder to do so. Since serverless technologies are maturing with support for a wider number of languages, this challenge will gradually disappear with time.
Serverless is all about creating solutions without thinking or worrying about servers; think of it as just putting your code in the cloud and running it! Serverless is a game-changer because it shifts the way you look at how applications are composed, written, deployed and scaled. If you want significant agility in creating highly scalable applications while remaining cost-effective, serverless is what you need. Businesses across the world are already providing highly compelling solutions using serverless computing technologies. The applications of serverless range from chatbots to real-time stream processing of data from IoT (Internet of Things) devices. So it is not a question of if, but rather, when you will adopt the serverless approach for your business.
References
[1] 'Build Your Own Technology Radar', Neal Ford, http://nealford.com/memeagora/2013/05/28/build_your_own_technology_radar.html
[2] 'Serverless Architectures', Martin Fowler, https://martinfowler.com/articles/serverless.html
[3] 'Why the Fuss About Serverless?', Simon Wardley, http://blog.gardeviance.org/2016/11/why-fuss-about-serverless.html
[4] 'Serverless Architectural Patterns and Best Practices', Amazon Web Services, https://www.youtube.com/watch?v=b7UMoc1iUYw

Serverless technologies
• AWS Lambda: https://aws.amazon.com/lambda/
• Azure Functions: https://functions.azure.com/
• Google Cloud Functions: https://cloud.google.com/functions/
• Apache OpenWhisk: https://github.com/openwhisk
• Serverless framework: https://github.com/serverless/serverless
• Fission: https://github.com/fission/fission
• Hyper.sh: https://github.com/hyperhq/
• Funktion: https://funktion.fabric8.io/
• Kubeless: http://kubeless.io/
By: Ganesh Samarthyam, Manoj Ganapathi and Srushit Repakula The authors work at CodeOps Technologies, which is a software technology, consulting and training company based in Bengaluru. CodeOps is the organiser of the upcoming India Serverless Summit, scheduled on October 27, 2017. Please check www.codeops.tech for more details.
A Glimpse of Microservices with Kubernetes and Docker The microservices architecture is a variant of service oriented architecture. It develops a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API.
'Microservices' is a compound word made of 'micro' and 'services'. As the name suggests, microservices are the small modules that provide some functionality to a system. These modules can be anything that is designed to serve some specific function. These services can be independent or interrelated with each other, based on some contract. The main function of microservices is to provide isolation between services - a separation of services from servers and the ability to run them independently, with the interaction between them based on a specific requirement. To achieve this isolation, we use containerisation, which will be discussed later. The idea behind choosing microservices is to avoid correlated failure in a system where there is a dependency between services. When running all microservices inside the same process, all services will be killed if the process is restarted. By running each service in its own process, only one service is killed if that process is restarted, but restarting the server will kill all services. By running each service on its own server, it's easier to maintain these isolated services, though there is a cost associated with this option.
How microservices are defined The microservices architecture develops a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API.
Microservices is a variant of the service-oriented architecture (SOA) architectural style. SOA is a style of software design in which services are provided to other components by application components, through a communication protocol over a network; it structures an application as a collection of loosely coupled services. In the microservices architecture, services should be fine-grained and the protocols should be lightweight. The benefit of breaking down an application into different smaller services is that it improves modularity and makes the application easier to understand, develop and test. It also parallelises development by enabling small autonomous teams to develop, deploy and scale their respective services independently. These services are independently deployable and scalable. Each service also provides a kind of contract, allowing different services to be written in different programming languages. They can also be managed by different teams.
The architecture of microservices Microservices follows the service-oriented architecture in which the services are independent of users, products and technologies. This architecture allows one to build applications as suites of services that can be used by other services. This architecture is in contrast to the monolithic architecture, where the services are built as a single unit comprising a client-side user interface, databases and server-side applications in a single frame — all dependent on one another. The failure of one can bring down the whole system. The microservices architecture mainly consists of the client-side user interface, databases and server-side applications as different services that are related in some way to each other but are not dependent on each other. Each layer is independent of the other, which in turn leads to easy maintenance. The architecture is represented in Figure 2. This architecture is a form or system that is built by plugging together components, somewhat like in a real world composition where a component is a unit of software that is independently replaceable and upgradeable. These microservices are easily deployable and integrated into one another. This gives rise to the possibility of continuous integration and continuous deployment.
Figure 1: Microservices - application databases

Figure 2: Microservices architecture

Figure 3: Virtual machines vs containers (containers are isolated but share the OS and, where appropriate, the bins/libraries; the result is significantly faster deployment, much less overhead, easier migration and faster restarts)
What's so good about microservices?
With the advances in software architecture, microservices have emerged as a different platform compared to other software architectures. Microservices are easily scalable and are not limited to a language; so you are free to choose any language for the services. The services are loosely coupled, which in turn results in ease of maintenance and flexibility, as well as reduced time in debugging and deployment.
Microservices with Docker and Kubernetes
Docker is a software technology that provides containers, which are a computer virtualisation method in which the kernel of an operating system allows the existence of multiple isolated user-space instances, instead of just one. Everything required to make a piece of software run is packaged into isolated containers. With microservices, containers play the same role of providing virtual environments to different processes that are running, being deployed and undergoing testing, independently. Docker is a bit like a virtual machine, but rather than creating a whole virtual operating system, Docker allows applications to use the same kernel as the system that it's running on, and only requires applications to be shipped with things not already running on the host computer. The main idea behind using Docker is to eliminate the 'works on my machine' type of problems that occur when collaborating on code with co-workers. With Docker, a developer doesn't have to install and configure complex databases, nor worry about switching between incompatible language toolchain versions. When an app is dockerised, that complexity is pushed into containers that are easily built, shared and run. Docker is a tool that is designed to benefit both developers and systems administrators.
How well does Kubernetes go with Docker?
Before starting the discussion on Kubernetes, we must first understand orchestration, which is to arrange various components so that they achieve a desired result. It also means the process of integrating two or more applications and/or services together to automate a process, or synchronise data in real-time. The intermediate path connecting two or more services is handled by orchestration, which refers to the automated arrangement, coordination and management of software containers.
So what does Kubernetes do then? Kubernetes is an open source platform for automating deployments, scaling and operations of application containers across clusters of hosts, providing container-centric infrastructure. Orchestration is an idea, whereas Kubernetes implements that idea. It is a tool for orchestration. It deploys containers inside a cluster. It is a helper tool that can be used to manage a cluster of containers and treat all servers as a single unit. These containers are provided by Docker. The best example of Kubernetes is the Pokémon Go app, which runs on a virtual environment of Google Cloud, in a separate container for each user. Kubernetes uses a different set-up for each OS. So if you want a tool that will overcome Docker's limitations, you should go with Kubernetes; a minimal deployment manifest is sketched below.
To conclude, we may say that microservices is growing very fast, the reason being its features of independence and isolation, which give it the power to be easily run, tested and deployed. This is just a small summary of microservices, about which there is a lot more to learn.
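To make 'deploying containers inside a cluster' concrete, here is a minimal sketch of a Kubernetes deployment manifest; the names and the container image are placeholders, and the API version shown is the one in use at the time of writing:

# A minimal Deployment: Kubernetes keeps three replicas of the container running.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hello-microservice        # placeholder name
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: nginx:1.13       # placeholder image
          ports:
            - containerPort: 80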
By: Astha Srivastava
The author is a software developer. Her areas of expertise are C, C++, C#, Java, JavaScript, HTML and ASP.NET. She has recently started working on the basics of artificial intelligence. She can be reached at [email protected].
Selenium: A Cost-Effective Test Automation Tool for Web Applications
Selenium is a software testing framework. Test authors can write tests in it without learning a test scripting language. It automates Web based applications efficiently and provides a recording/playback system for authoring tests.
Selenium is a portable software-testing framework for Web applications that can operate across different browsers and operating systems. It is quite similar to HP Quick Test Pro (or QTP, now called UFT), except that Selenium focuses on automating Web based applications. Testing done using this tool is usually referred to as Selenium testing. Selenium is not just a single tool but a set of tools that helps the tester to automate Web based applications more efficiently. It has four components:
1. The Selenium integrated development environment (IDE)
2. The Selenium remote control (RC)
3. WebDriver
4. The Selenium grid
Selenium RC and WebDriver are merged into a single framework to form Selenium 2. Selenium 1 is also referred to as Selenium RC. Jason Huggins created Selenium in 2004. Initially, he named it JavaScriptTestRunner, and later changed this to Selenium. It is licensed under Apache License 2.0. In the following sections, we will learn about how Selenium and its components operate.
The Selenium IDE
The Selenium IDE is the simplest framework in the Selenium suite and is the easiest one to learn. It is a Firefox plugin that you can install as easily as any other plugin. It allows testers to record their actions as they go through the workflow that they need to test. But it can only be used with the Firefox browser, as other browsers are not supported. The recorded scripts can be converted into various programming languages supported by Selenium, and the scripts can be executed on other browsers as well. However, for the sake of simplicity, the Selenium IDE should only be used as a prototyping tool. If you want to create more advanced test cases, use either Selenium RC or WebDriver.
Selenium RC
Selenium RC or Selenium Remote Control (also known as Selenium 1.0) was the flagship testing framework of the whole Selenium project for a long time. It works in such a way that the client libraries communicate with the Selenium RC server, which passes each Selenium command for execution. The server then passes the Selenium command to the browser using Selenium-Core JavaScript commands. This was the first automated Web testing tool that allowed people to use a programming language they preferred. Selenium RC components include:
1. The Selenium server, which launches and kills the
browser, interprets and runs the Selenese commands passed from the test program, and acts as an HTTP proxy, intercepting and verifying HTTP messages passed between the browser and the Application Under Test (AUT).
2. Client libraries that provide the interface between each programming language and the Selenium RC server.
Selenium RC is great for testing complex AJAX based Web user interfaces under a continuous integration system. It is also an ideal solution for users of the Selenium IDE who want to write tests in a more expressive programming language than the Selenese HTML table format.

Selenese commands
Selenese is the set of Selenium commands which is used to test Web applications. The tester can test for broken links, the existence of some object on the UI, AJAX functionality, the alert window, list options and a lot more using Selenese. There are three types of commands:
1. Actions: These are commands that manipulate the state of the application. Upon execution, if an action fails, the execution of the current test is stopped. Some examples are:
click(): Clicks on a link, button, checkbox or radio button.
contextMenuAt(locator, coordString): Simulates the user by clicking the 'Close' button in the title bar of a popup window or tab.
2. Accessors: These evaluate the state of the application and store the results in variables, which are used in assertions. Some examples are:
assertErrorOnNext: Pings Selenium to expect an error on the next command execution, with an expected message.
storeAllButtons: Returns the IDs of all buttons on the page.
3. Assertions: These enable us to verify the state of an application and compare it against the expected state. They are used in three modes, i.e., assert, verify and waitfor. Some examples are:
waitForErrorOnNext(message): Waits for an error; used with the accessor assertErrorOnNext.
verifySelected(selectLocator, optionLocator): Verifies that the selected item of a drop-down satisfies the optionSpecifier.

Selenium WebDriver
Selenium WebDriver is a tool that automates the testing of Web applications and is popularly known as Selenium 2.0. It is a Web automation framework that allows you to execute your tests against different browsers. WebDriver also enables you to use a programming language to create your test scripts. The following programming languages are supported by Selenium WebDriver:
1. Java
2. .NET
3. PHP
4. Python
5. Perl
6. Ruby
WebDriver uses a different underlying framework, while Selenium RC uses a JavaScript Selenium-Core embedded within the browser, which has its limitations. WebDriver directly interacts with the browser without any intermediary, whereas Selenium RC depends on a server.

Architecture
The architecture of WebDriver is explained in Figure 1.

Figure 1: Architecture of Selenium WebDriver
The differences between WebDriver and Selenium RC are given in Table 1.

Table 1
WebDriver | Selenium RC
Architecture is simpler, as it controls the browser from the OS level. | Architecture is complex, as it depends on the server.
It supports HtmlUnit. | It does not support HtmlUnit.
WebDriver is faster, as it interacts directly with the browser. | It is slower, as it uses JavaScript to interact with RC.
Purely object-oriented, and can be used for iPhone/Android application testing. | Less object-oriented APIs, and cannot be used for mobile testing.
WebDriver is not ready to support new browsers, and does not have a built-in command for the automatic generation of test results. | Selenium RC can support new browsers, and has built-in commands.
Continued to page 64....
Using the Spring Boot Admin UI for Spring Boot Applications
Using Spring Boot makes it easy for developers to create standalone, production-grade Spring based applications that can be ‘just run’. Spring Boot is a part of microservices development.
As part of developing microservices, many of us use the features of Spring Boot along with Spring Cloud. In the microservices world, we may have many Spring Boot applications running on the same or different hosts. If we add Spring Actuator (http://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#production-ready) to the Spring Boot applications, we get a lot of out-of-the-box end points to monitor and interact with applications. The list is given in Table 1.

Figure 1: Spring Boot logo

The end points given in Table 1 provide a lot of insights about the Spring Boot application. But if you have many applications running, then monitoring each application by hitting the end points and inspecting the JSON response is a tedious process. To avoid this hassle, the Code Centric team came up with the Spring Boot Admin (https://github.com/codecentric/spring-boot-admin) module, which provides us an Admin UI dashboard to administer Spring Boot applications. This module crunches the data from Actuator end points, and provides insights about all the registered applications in a single dashboard. We will demonstrate the Spring Boot Admin features in the following sections.
As a first step, create a Spring Boot application that will be the Spring Boot Admin Server module, by adding the Maven dependencies given below:
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-server</artifactId>
    <version>1.5.1</version>
</dependency>
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-server-ui</artifactId>
    <version>1.5.1</version>
</dependency>
Table 1
ID | Description | Sensitive (default)
actuator | Provides a hypermedia-based 'discovery page' for the other endpoints. Requires Spring HATEOAS to be on the classpath. | True
auditevents | Exposes audit events information for the current application. | True
autoconfig | Displays an auto-configuration report showing all auto-configuration candidates and the reason why they 'were' or 'were not' applied. | True
beans | Displays a complete list of all the Spring Beans in your application. | True
configprops | Displays a collated list of all @ConfigurationProperties. | True
dump | Performs a thread dump. | True
env | Exposes properties from Spring's ConfigurableEnvironment. | True
flyway | Shows any Flyway database migrations that have been applied. | True
health | Shows application health information (when the application is secure, a simple 'status' when accessed over an unauthenticated connection, or full message details when authenticated). | False
info | Displays arbitrary application information. | False
loggers | Shows and modifies the configuration of loggers in the application. | True
liquibase | Shows any Liquibase database migrations that have been applied. | True
metrics | Shows 'metrics' information for the current application. | True
mappings | Displays a collated list of all @RequestMapping paths. | True
shutdown | Allows the application to be gracefully shut down (not enabled by default). | True
trace | Displays trace information (by default, the last 100 HTTP requests). | True

Add the Spring Boot Admin Server configuration by adding @EnableAdminServer to your configuration, as follows:
package org.samrttechie;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

import de.codecentric.boot.admin.config.EnableAdminServer;

@EnableAdminServer
@Configuration
@SpringBootApplication
public class SpringBootAdminApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootAdminApplication.class, args);
    }

    @Configuration
    public static class SecurityConfig extends WebSecurityConfigurerAdapter {
        @Override
        protected void configure(HttpSecurity http) throws Exception {
            // Page with login form is served as /login.html and does a POST on /login
            http.formLogin().loginPage("/login.html").loginProcessingUrl("/login").permitAll();
            // The UI does a POST on /logout on logout
            http.logout().logoutUrl("/logout");
            // The UI currently doesn't support CSRF
            http.csrf().disable();
            // Requests for the login page and the static assets are allowed
            http.authorizeRequests()
                .antMatchers("/login.html", "/**/*.css", "/img/**", "/third-party/**")
                .permitAll();
            // ... and any other request needs to be authorized
            http.authorizeRequests().antMatchers("/**").authenticated();
            // Enable so that the clients can authenticate via HTTP basic for registering
            http.httpBasic();
        }
    } // end::configuration-spring-security[]
}
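One detail the article leaves implicit is the Admin Server's own application.properties; a minimal sketch (the port matches the 1111 used later in this article, while the application name is an assumption) would be:

# Hypothetical application.properties for the Admin Server
server.port=1111
spring.application.name=spring-boot-admin-server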
Let us create more Spring Boot applications to monitor through the Spring Boot Admin Server created in the above steps. All Spring Boot applications that we now create will act as Spring Boot Admin clients. To make an application an admin client, add the dependency given below along with the actuator dependency. In this demo, I have created three applications: Eureka Server, Customer Service and Order Service.

<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-starter-client</artifactId>
    <version>1.5.1</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Add the property given below to the application.properties file. This property tells us where the Spring Boot Admin Server is running; hence, the clients will register with the server.

spring.boot.admin.url=http://localhost:1111

Now, if we start the Admin Server and the other Spring Boot applications, we will be able to see all the admin clients' information in the Admin Server dashboard. As we started our Admin Server on port 1111 in this example, we can see the dashboard at http://<host>:1111. Figure 2 shows the Admin Server UI.

Figure 2: Admin server UI

A detailed view of the application is given in Figure 3. In this view, we can see the tail end of the log file, the metrics, environment variables, the log configuration (where we can dynamically switch the log levels at the component level, the root level or the package level), and other information.

Figure 3: Detailed view of Spring Boot Admin

Let's now look at another feature called notifications from the Spring Boot admin. This notifies the administrators when the application status is DOWN or when the application status is coming UP. Spring Boot admin supports the following channels to notify the user:
• Email notifications
• Pagerduty notifications
• Hipchat notifications
• Slack notifications
• Let's Chat notifications
In this article, we will configure Slack notifications. Add the properties given below to the Spring Boot Admin Server's application.properties file:

spring.boot.admin.notify.slack.webhook-url=https://hooks.slack.com/services/T8787879tttr/B5UM0989988L/0000990999VD1hVt7Go1eL //Slack Webhook URL of a channel
spring.boot.admin.notify.slack.message="*#{application.name}* is *#{to.status}*" //Message to appear in the channel

Since we are managing all the applications with the Spring Boot Admin, we need to secure its UI with a login feature. Let us enable the login feature for the Spring Boot Admin Server. I am going with basic authentication here. Add the Maven dependencies given below to the Admin Server module, as follows:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-server-ui-login</artifactId>
    <version>1.5.1</version>
</dependency>

Add the properties given below to the application.properties file:

security.user.name=admin //user name to authenticate
security.user.password=admin123 //Password to authenticate

As we have added security to the Admin Server, admin clients should be able to connect to the server by authenticating. Hence, add the properties given below to the admin clients' application.properties files:

spring.boot.admin.username=admin
spring.boot.admin.password=admin123

There are additional UI features like Hystrix and Turbine UI, which we can enable in the dashboard. You can find more details at http://codecentric.github.io/spring-boot-admin/1.5.1/#_ui_modules. The sample code created for this demonstration is available at https://github.com/2013techsmarts/SpringBoot_Admin_Demo.
Continued from page 60....
By: Siva Prasad Rao Janapati The author is a software engineer with hands-on experience in Java, JEE, Spring, Oracle Commerce, MOZU Commerce, Apache Solr and other open source/enterprise technologies. You can reach him at his blog http://smarttechie.org.
Selenium locators
Locator is a command that instructs the Selenium IDE which GUI element it needs to work on. Elements are located in Selenium WebDriver with the help of the findElement() and findElements() methods provided by the WebDriver and WebElement classes. The findElement() method returns a WebElement object based on a specified search criterion, or ends up throwing an exception. The findElements() method returns a list of WebElements matching the search criteria; if none are found, it returns an empty list. The different types of locators are:
1. ID
2. Name
3. Link Text
4. CSS Selector
5. DOM
6. XPath

To locate by ID, type:

driver.findElement(By.id());

To locate by name, type:

driver.findElement(By.name());

To locate by Link Text, type:

driver.findElement(By.linkText());

To locate by CSS Selector, type:

driver.findElement(By.cssSelector());

To locate by XPath, type:

driver.findElement(By.xpath());
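Putting a couple of these locators to work, a minimal WebDriver test might look like the following sketch; the URL and the element attributes are placeholders for illustration only, and the browser driver binaries are assumed to be set up.

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class LocatorDemo {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        driver.get("http://example.com/login");   // placeholder URL

        // Locate the username field by its name attribute and type into it
        WebElement user = driver.findElement(By.name("username"));
        user.sendKeys("tester");

        // Locate the submit button by its ID and click it
        driver.findElement(By.id("submit")).click();

        driver.quit();
    }
}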
Limitations of Selenium
Selenium does have some limitations which one needs to be aware of. First and foremost, image based testing is not as clear-cut as with some other commercial tools in the market, while the fact that it is open source also means that there is no guaranteed timely support. Another limitation of Selenium is that it supports only Web applications; therefore, it is not possible to automate the testing of non-browser based applications.
Selenium is a powerful testing framework for conducting functional and regression testing. It is open source software and supports various programming environments, OSs and popular browsers. Selenium WebDriver is used to conduct batch testing, cross-platform browser testing, data driven testing, etc. It is also very cost-effective when automating Web applications; and for the technically inclined, it provides the power and flexibility to extend its capability many times over, making it a very credible alternative to other test automation tools in the market.

By: Neetesh Mehrotra
The author works in TCS as a systems engineer. His areas of interest are Java development and automation testing. You can contact him at [email protected].
Splinter: An Easy Way to Test Web Applications
Splinter is a Web application testing tool which is built around Python. It automates actions such as visiting specified URLs and interacting with their items. It also removes the drudgery from Web application testing by replacing manual testing with automated testing.
Every one of us makes mistakes - some of these might be trivial and can be ignored, while a few that are serious can't be ignored. Hence, it's always a good practice to verify and validate what we do in order to eliminate the possibility of error. So is the case with any software application. The development of a software application is complete only when it's fully verified and validated (its functionality, performance, user interface, etc). Only then is it ready for release. Carrying out all such validations manually is quite time consuming; so machines perform such repetitive tasks and processes. This is called automation testing. It saves a lot of time while reducing the risk of any further error caused by human intervention.
There are different automation tools and frameworks available, of which Splinter is one. It lets us automate different manual tasks and processes associated with any Web-based software application. In a Web application, we need to automate the sequence of different actions performed, right from opening the Web browser to checking if it's loading properly, to different actions that involve interactions with the application. Splinter is quite good at automating a sequence of actions. It is an open source tool used for testing different Web applications using Python. The tasks needed to be performed by Splinter are written in Python. It lets us automate various browser actions, such as visiting URLs as well as interacting with their different items. It has got easy-to-use built-in functions for the most frequently performed tasks. A newbie can easily use Splinter and automate any specific process with just a limited knowledge of Python scripting. It acts as an easily usable abstraction layer on top of different available automation tools like Selenium, and makes it easy to write automation tests. We can easily automate a plethora of tasks such as opening a browser, clicking on any specific link or accessing any link, with just one or two lines of code using Splinter, while in the case of other open source tools like Selenium, this is a long and complex process. Splinter even allows us to find different elements of any Web application using its different properties like tag name, text or ID value, xpath, etc.
Since Splinter is an open source tool, it's quite easy to get clarifications on anything that's not clear. It is supported by a large community. It even has well maintained documentation, which makes it easy for any newbie to master this tool. Apart from all this, Splinter supports various inbuilt libraries, making the task of automation easier. We can easily manage different actions performed on more than one Web window at the same time, as well as navigate through the history of the page, reload the page, etc.
Features of Splinter
1. Splinter has got one of the simplest APIs among open source tools used for automating different tasks on Web applications. This makes it easy to write automated tests
for any Web application. 2. It supports different Web drivers for various browsers. These drivers are the Firefox Web driver for Mozilla Firefox, Chrome’s Web driver for Google Chrome, PhantomJs Web driver for PhantomJs, zope.testbrowser for Zopetest and a remote Web driver for different ‘headless’ (with no GUI) testing. 3. Splinter also allows us to nd different elements in any Web page by their Xpath, CSS, tag value, name, ID, text or value. In case we need more accurate control of the Web page or we need to do something more, such as interacting with old «frameset» tags, Splinter even exposes the Web driver that allows us to use the low level methods used for interacting with that tag. 4. Splinter supports multiple Web automation back-ends. We can use the same set of test code for doing browser-based testing with Selenium as its back-end, and for ‘headless’ testing with zope.testbrowser as its back-end. 5. It has extensive support for using iframes and interacts with them by just passing the iframe’s name, ID or index value. There is also Chrome support for various alerts and prompts in the Splinter 0.4 version. 6. We can easily execute JavaScript in different drivers which support Splinter. We can even return the result of the script using an inbuilt method called evaluate_script . 7. Splinter has got the ability to work with AJAX and asynchronous JavaScript using various inbuilt methods. 8. When we use Splinter to work with AJAX and asynchronous JavaScript, it’s a common experience to have some elements which are not present in HTML code (since they are created using JavaScript, dynamically). In such cases, we can use various inbuilt methods such as is_element_present or is_text_present for checking the existence of any specic element or text. Splinter will actually load the HTML and the JavaScript in the browser, and the check will be performed before JavaScript is processed. 9. The Splinter project has full documentation for its APIs and this is really important when we have to deal with different third party libraries. 10. We can also easily set up a Splinter development environment. We need to make sure we have some basic development tools in our machine, before setting up an entire environment with just one command. 11. There is also a provision for creating a new Splinter browser in an easy and simple way. We just need to implement a test case for this. 12. Using Splinter, it’s possible to check the HTTP status code that a browser visits. We can use the status_code. is_success method to do the work for us. We can compare the status code directly. 13. Whenever we use the visit method, Splinter actually checks if the given response is a success or not, and if it is not, then Splinter raises an HttpResponseError exception. 66
This helps to confirm whether the given response is okay or not.
14. It is possible to manipulate cookies using the cookies attribute of any browser instance. The cookies attribute is an instance of a CookieManager class, which handles cookie operations such as adding and deleting them.
15. One can create new drivers using Splinter. For instance, if we need to create a new Splinter browser, we just need to implement a test case (extending tests.base.BaseBrowserTests). All this will be present in a Python file, which will act as a driver for any future usage.
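The following minimal sketch shows points 6 and 8 in action; the page URL and the element ID are illustrative only.

from splinter import Browser

browser = Browser('firefox')
browser.visit('https://example.com/')

# evaluate_script returns the result of a JavaScript expression
total = browser.evaluate_script('6 * 7')

# wait up to ten seconds for a dynamically created element to appear
if browser.is_element_present_by_id('result-box', wait_time=10):
    print('element rendered; script result:', total)

browser.quit()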
Figure 1: Flow diagram of Splinter acting as an abstraction layer (Image source: googleimages.com)
Drivers supported by Splinter
Drivers play a significant role when it comes to any Web application. In Splinter, a Web driver helps us open the specific application under test. Different types of drivers are supported by Splinter, based on how the application is accessed and tested: browser-based drivers, which open specific browsers; headless drivers, which support headless testing; and remote drivers, which connect to a Web application on a remote machine. Here is a list of the drivers supported by Splinter.
Browser-based drivers:
Chrome WebDriver
Firefox WebDriver
Remote WebDriver

Headless drivers:
Chrome WebDriver
PhantomJS WebDriver
zope.testbrowser
Django client
Flask client

Remote driver:
Remote WebDriver
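Whichever driver is chosen, the entry point stays the same; only the name passed to Browser changes. A minimal sketch, assuming the named driver is installed on the machine:

from splinter import Browser

# the back-end is picked purely by the name passed to Browser();
# use whichever matches what is installed, e.g. 'chrome',
# 'phantomjs' or 'zope.testbrowser' instead of 'firefox'
browser = Browser('firefox')
browser.visit('https://example.com/')
print(browser.title)
browser.quit()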
Prerequisites and installation of Splinter
To install Splinter, Python 2.7 or above should be installed on the system. We can download Python from http://www.python.org. Make sure you have already set up your development environment.
Git should be installed on the system. If you want to use Google Chrome as the Web browser, make sure Chrome WebDriver is set up properly. There are two ways in which we can install Splinter.
To install the stable release: To get the official, stable version, run the following command from a terminal:

$ [sudo] pip install splinter

To install the under-development source code: To get Splinter's latest features, run the following set of commands from a terminal:

$ git clone git://github.com/cobrateam/splinter.git
$ cd splinter
$ [sudo] python setup.py install

Writing sample code to automate a process using Splinter
As already stated, even a newbie without much knowledge of programming can automate a specific task using Splinter. Let's discover how one can easily make Splinter perform a specific task automatically on a Web application. The credit for this ease of coding goes to the many inbuilt functions Splinter possesses; we just need to pull in the relevant built-in functions or library files with a few lines of code, and apply our own logic to validate different scenarios. Let's have a look at a sample code snippet written for Splinter. Here, we make use of the name and ID values of different elements present on the Web page to identify each specific Web element.
Scenario for the sample code: Log in to a Facebook account using the user's email ID and password.

#imports the Browser library from Splinter
from splinter import Browser

# takes the email address from the user as input to log in to his/her Facebook account
user_email = raw_input("enter user's email address ")

# takes the password from the user as input to log in to his/her Facebook account
user_pass = raw_input("enter user's password ")

# loads the Firefox browser
browser = Browser('firefox')

# stores the URL for Facebook in the url variable
url = "https://www.facebook.com/"

# navigates to the Facebook website and loads it in the Firefox browser
browser.visit(url)

# checks if the Facebook Web page is loaded, else prints an error message
if browser.is_text_present('www.facebook.com'):
    # fills in the user's email ID and password in the email and password
    # fields of the Facebook login section; the inbuilt function browser.fill
    # uses the tag names of the Email and Password input boxes (email and
    # pass respectively) to identify them
    browser.fill('email', user_email)
    browser.fill('pass', user_pass)
    # selects the login button using its ID value on the Facebook page,
    # to click and log in with the given details
    button = browser.find_by_id('u_0_d')
    button.click()
else:
    print("Facebook web application NOT FOUND")
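Incidentally, a browser opened this way should be closed with browser.quit() when the work is done. Splinter's Browser object can also be used as a context manager, which quits the browser automatically when the block ends; a minimal sketch:

from splinter import Browser

# the 'with' block quits the browser automatically, even if an error occurs
with Browser('firefox') as browser:
    browser.visit('https://www.facebook.com/')
    print(browser.title)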
Some important built-in functions used in Splinter
Table 1 lists some of Splinter's significant built-in functions, which can be used while automating any process for a Web application.
Setting up the Splinter development environment
When it comes to programming with Splinter, we have already seen that it is easier than other open source Web application testing tools. But we need to set up a development environment for it, in which we can easily code and automate a specific process using Splinter. This is not a tough task. We just need to make sure that we have some basic development tools, library files and a few add-on dependencies on our machine, which ultimately help us code in an easier and better way. We can get the required tools and set up the entire environment using just a few commands. Let's have a look at the different development tools required to set up the environment.
Basic development tools: If you are using Mac OS, install the Xcode tools, which can be downloaded from the Mac App Store (on Mac OS X Lion) or from the Apple website. If you are using a Linux computer, install some of the basic development libraries and headers. On Ubuntu, you can install all of these using the apt-get command given below:
$ [sudo] apt-get install build-essential python-dev libxml2-dev libxslt1-dev
Pip and virtualenv: First of all, we need to make sure that we have pip installed on our system, with which we manage all the Splinter development dependencies; it lets us fetch and install the packages our code depends on. It's advisable to use virtualenv for a clean development environment. Once we have all the development libraries installed for the OS we are using, we just need to install all the Splinter development dependencies using the make command. Given below is the command for this.
$ [sudo] make dependencies
We use sudo while installing the dependencies only if we are not using virtualenv.
Table 1: Some important built-in functions in Splinter

Browser related:
Browser(): instantiates a browser and creates a window for it. Syntax: variable_name = Browser('name of the Web driver used')
browser.visit(): navigates to a specific URL. Syntax: browser.visit('URL')
browser.reload(): reloads the current Web page. Syntax: browser.reload()
browser.title: displays the title of the currently active Web page. Syntax: browser.title
browser.html: displays the HTML content of the currently active Web page. Syntax: browser.html
browser.url: accesses the URL of the currently active Web page. Syntax: browser.url

For managing different browser window actions:
browser.windows[0]: accesses the first window. Syntax: browser.windows[numeric value representing the window to be visited]
browser.windows[window_name]: accesses a specific window by its name. Syntax: browser.windows[window_name]
browser.windows.current(): takes you to the current window. Syntax: browser.windows.current()
window.is_current(): Boolean; checks whether the current window is active or not. Syntax: window.is_current (True or False)
window.next(): takes you to the next open window. Syntax: window.next()
window.prev(): takes you to the previous open window. Syntax: window.prev()
window.close(): closes the current window. Syntax: window.close()
window.close_others(): closes all windows except the current one. Syntax: window.close_others()

For finding elements of any Web page:
browser.find_by_name(): finds an element by its name. Syntax: browser.find_by_name('name of element')
browser.find_by_css(): finds an element by its CSS value. Syntax: browser.find_by_css('css value')
browser.find_by_xpath(): finds an element by its XPath. Syntax: browser.find_by_xpath('xpath value')
browser.find_by_tag(): finds an element by its tag name. Syntax: browser.find_by_tag('name of tag')
browser.find_by_text(): finds an element by its text value. Syntax: browser.find_by_text('text value of the element to be accessed')
browser.find_by_id(): finds an element by its ID value. Syntax: browser.find_by_id('id value of the element')
browser.find_by_value(): finds an element by its value. Syntax: browser.find_by_value('value of the element to be accessed')
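As a quick illustration of the find family, each find_by_* call returns a list-like collection of matching elements whose members can be inspected or clicked; the URL and CSS selector below are only examples.

from splinter import Browser

browser = Browser('firefox')
browser.visit('https://example.com/')

# find_by_* methods return an element list; .first picks the first match
link = browser.find_by_css('a.nav-link').first
print(link.text)
link.click()

browser.quit()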
By: Vivek Ratan The author, who is currently an automation test engineer at Infosys, Pune, has completed his B. Tech in electronics and instrumentation engineering. He can be reached at [email protected] for any suggestions or queries.
Getting Started with PHP, the Popular Programming Language This article provides an introduction to PHP: its development, its history, and its pros and cons. At the end of the article, we learn how to install XAMPP on a computer and write code to add two numbers.
If you refer to any Web technology survey to check the market share of different server side scripting languages, you will find that PHP powers a large majority of websites. According to w3techs.com, "PHP is used by 82.7 per cent of all the websites whose server-side programming language we know." In its early days, even Facebook's servers ran PHP to serve its social networking application. Nevertheless, we are not concerned here with the Web traffic hosted by PHP these days. Instead, we will delve into PHP to understand its development, its history and its pros and cons, and in the end take a sneak peek at some of the open source IDEs you can use for rapid development.
First, let's understand what PHP is. It is an abbreviated form of 'Hypertext Pre-processor'. Confused about the sequence of the acronym? The earlier name of PHP was 'Personal Home Page', hence the letters. It is a server side scripting language mainly used to enhance the look and feel of HTML Web pages. A sample of PHP code embedded in HTML looks like the following:
<!DOCTYPE html>
<html>
 <head>
  <title>Example</title>
 </head>
 <body>
  <?php echo 'Hi, I am a PHP script!'; ?>
 </body>
</html>
In the above example, you can see how easily PHP can be embedded inside HTML code just by enclosing it within <?php ... ?> tags, which allows very easy movement between HTML and PHP code. It differs from client-side scripting languages like JavaScript in that PHP code is executed on the server with the help of a PHP interpreter, and only the resultant HTML is sent to the requester's computer. Though it can do a variety of tasks, ranging from creating
forms to generating dynamic Web content to sending and receiving cookies, there are three main areas where PHP scripts are usually deployed.
Server-side scripting: This is the main usage and target area of PHP. You require a PHP parser, a Web browser and a Web server to make use of it, and you will then be able to view the PHP output of Web pages in your machine's browser.
Command line scripting: PHP scripts can also be run without any server or browser, with the help of just the PHP parser (a quick sketch follows this list). This is best suited for tasks that take a lot of time, for example, sending newsletters to thousands of recipients, taking backups of databases, and transferring heavy files from one location to another.
Creating desktop applications: PHP can also be used to develop desktop applications with graphical user interfaces (GUIs). Though this has a lot of pain points, you can use PHP-GTK for it if you want to; PHP-GTK is available as an extension to PHP.
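Command line scripting in practice is just a matter of invoking the PHP CLI binary; a minimal sketch (the script name is purely illustrative):

# runs a one-liner directly, with no Web server or browser involved
$ php -r 'echo "Hello from the PHP CLI\n";'

# runs a whole script file the same way
$ php send_newsletter.php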
PHP and HTML – similar but different PHP is often confused with HTML. So, to set things straight, let's take a look at how PHP and HTML are different yet complementary. As we all know, HTML is a markup language and is the backbone of front-end Web pages. PHP, on the other hand, works in the background on the server where the HTML is deployed, performing tasks. Together, they are used to make Web pages dynamic. For a better understanding, consider an example where you display some content on a Web page using HTML; if you then want to do some back-end validation against the database, you use PHP for it. So HTML and PHP have different assigned roles, and they complement each other perfectly. Listed below are some of the similarities and differences that make this clear.
Similarities:
Both are compatible with most browsers that support their technologies.
Both can be used on all operating systems.

Differences:
HTML is used on the front end, whereas PHP is a back-end technology.
PHP is a programming language, whereas HTML is a markup language; HTML is not counted among programming languages because it cannot perform computations such as '1+1=2'.
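A tiny illustration of the two working together, where the HTML supplies the fixed structure and the embedded PHP fills in a value computed on the server at request time; a minimal sketch:

<p>This line is plain, static HTML.</p>
<p>Today is <?php echo date('l'); ?>, as computed by PHP on the server.</p>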
History and development The development of PHP dates back to 1994, when the Danish-Canadian programmer Rasmus Lerdorf created Perl scripts called 'Personal Home Page Tools' in order to maintain his personal Web pages. The following year, these tools were released as CGI binaries under the name 'Personal Home Page/Forms Interpreter', with support for databases and Web forms. Once released to the world, PHP underwent a series of developments and modifications, and the second version of 'Personal Home Page/Forms Interpreter' was released in November 1997. PHP 3, 4 and 5 followed in 1998, 2000 and 2004, respectively. Today, the most used version is PHP 5, which runs on approximately 93 per cent of the websites that use PHP, though PHP 7 is also available. PHP 5.4, released in 2012, brought further refinements to the language.

Fact: Did you know that PHP has a mascot, just like sports teams do? The PHP mascot is a big blue elephant named elePHPant.
The pros and cons of PHP Before going further into PHP development, let's take a look at some of the advantages and disadvantages of using it in Web development.
Advantages
Availability: The biggest advantage of PHP is that it is open source, so one can find a large developer community for support and help.
Stability: PHP has been in use since 1995 and is quite stable compared to other server side scripting languages; since its source code is open, any bug that is found can be readily fixed.
Extensive libraries: Thousands of libraries are available that extend the abilities of PHP, for example for generating PDFs, graphs and Flash movies. PHP makes use of modules, so you don't have to write everything from scratch; you just add the required module to your code and you are good to go.
Built-in modules: Using PHP, one can connect to a database effortlessly through its built-in modules, which drastically reduces the development time and effort of Web developers.
Cross-platform: PHP is supported on all platforms, so you don't have to worry whether code written on Windows will work on Linux or not.
Easy to use: For beginners, learning PHP is easy because of its clean syntax, which is somewhat similar to that of the C programming language, making it even simpler for those familiar with C.
Disadvantages
Not suitable for huge applications: Though PHP has a lot of advantages in Web page development, it is hard to use for complicated and huge Web applications, since it does not enforce modularity, and the maintenance of such an app becomes a cumbersome task.
Security: The security of the data involved in Web pages is of paramount concern. The security of PHP can be
compromised due to its open source nature, since anyone can view the source code and hunt for bugs in it. So you have to take extra measures to ensure the security of your Web page if you are dealing with sensitive data.
Fact: It is estimated that there are approximately five million PHP developers worldwide, which is a testament to its popularity.
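Coming back to the extra security measures mentioned above, one of the most basic is to escape anything received from the user before echoing it back; a minimal sketch (the 'name' parameter is purely illustrative):

<?php
// read the parameter defensively; default to an empty string
$name = isset($_GET['name']) ? $_GET['name'] : '';

// htmlspecialchars() neutralises characters that could inject markup or scripts
echo 'Hello, ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
?>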
Open source IDEs for PHP development The choice of IDE plays an important role in the development of any program or application, but this aspect is often neglected. A good, robust IDE comes packed with features and packages that enable rapid development. Automatic code generation, refactoring, organising imports, debugging, identifying dead code and indentation are some of the advantages a powerful IDE can provide. So let's take a look at some prominent open source IDEs that can be very useful in PHP development. 1. NetBeans: Most of you will be aware of NetBeans from Java development, but it can also be used for PHP development. The biggest advantage of NetBeans is that it supports many languages like English, Chinese and Japanese, and can be installed smoothly on any operating system. Some of the features that differentiate it from the rest are smart code completion, refactoring, try/catch code completion and formatting. It can also configure various PHP frameworks like Smarty, Doctrine, etc. You can download it from netbeans.org. 2. Eclipse: Eclipse tops the list of popular IDEs. If you have worked with Eclipse earlier, you will feel at home using Eclipse PDT for PHP development. It can be downloaded from eclipse.org/pdt. Some of its features are syntax highlighting, debugging, code templates, syntax validation and easy code management. It is a cross-platform IDE and works on Windows, Linux and Mac OS. Since it is developed in Java, you must have Java installed on your machine. 3. PHPStorm: PHPStorm, developed by JetBrains (the same company that developed IntelliJ IDEA for Java), is mainly used for professional purposes but is also available licence-free for students, teachers and open source projects. It has an up-to-date set of features for rapid development, with support for leading front-end technologies like HTML5, CoffeeScript, JavaScript and Sass. It supports all the major frameworks in the market, like Symfony, CakePHP, Laravel and Zend, and can also be integrated with databases, version control software, REST clients and command line tools to ease the work of developers. Well-known organisations such as Wikipedia, Yahoo and Cisco make use of PHPStorm for PHP development. You can read more about PHPStorm at jetbrains.com/phpstorm. 4. Sublime Text: Sublime Text is basically a text editor, but it can be converted into a PHP IDE by installing various
available packages. It is known for its sleek, feature-rich and lightweight interface, and it is supported on all operating systems. Some of the packages that can be used to convert it into an IDE are Sublime PHP Companion, PHPCS, SublimeCodeIntel, PHPDoc and Simple PHPUnit. It can be downloaded from sublimetext.com. 5. PHP Designer: This IDE is only available for Windows users. It is very fast and powerful, with full support for PHP, HTML, JavaScript and CSS. It is used for fast Web development due to features like intelligent syntax highlighting, object-oriented programming support, code templates, code tips and a debug manager, all wrapped in a sleek and intuitive interface that can be customised with various themes. It also supports JavaScript frameworks such as jQuery, ExtJS and YUI. An open source version is available, and you can read more about it on its official website. 6. NuSphere PhpED: PhpED is the IDE developed by NuSphere, a Nevada based company which entered the market back in 2001. The current version of PhpED is 18.0, which provides support for PHP 7.0 and almost all PHP frameworks. This tool can also run unit tests for developed projects, and comes packaged with support for all Web based technologies. You can download PhpED from NuSphere's website, www.nusphere.com. 7. Codelobster: Codelobster also provides a free IDE for PHP development. Though it is not used very widely, it is catching up fast. The free version gives you support for PHP, JS, HTML and CSS. It can be integrated with various frameworks such as Drupal, WordPress, Symfony and Yii. You can download it from www.codelobster.com.
Writing the first PHP program Having read about PHP, its history and various IDEs, let's write our first PHP program and run it using XAMPP. Though there is no official information about the full form of XAMPP, it is usually taken to stand for cross-platform (X), Apache (A), MariaDB (M), PHP (P) and Perl (P). XAMPP is an open source, widely used Web server package developed by apachefriends.org, which can be used to create a local HTTP server on a machine with a few clicks. We will be using it in the tutorial below.
Figure 1: Apache service started on XAMPP control panel
Download the latest version of XAMPP for your system from https://www.apachefriends.org/download.html. After the download is complete, install it on your machine. By default, XAMPP is installed in the C drive; if you specified any other directory during installation, go to that directory instead. Now create a folder named PHPDevelopment inside the htdocs folder of your XAMPP installation directory, for example, C:\xampp\htdocs\PHPDevelopment. Next, start the XAMPP control panel and click on the Start button to start Apache. Create a text file inside the above folder, name it AddTwoNumbers.php, and copy the following code inside it:
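A minimal version of such a page, with a form that posts two numbers back to the same script and prints their sum (the exact markup may differ from the original printed listing), is:

<!DOCTYPE html>
<html>
 <head>
  <title>Addition Of Two Numbers</title>
 </head>
 <body>
  <h2>Addition of Two Numbers</h2>
  <!-- posts the two numbers back to this same PHP page -->
  <form method="post">
   <input type="text" name="num1">
   <input type="text" name="num2">
   <input type="submit" name="submit" value="Sum">
  </form>
  <?php
  // runs only once the form has been submitted
  if (isset($_POST['submit'])) {
      $sum = $_POST['num1'] + $_POST['num2'];
      echo 'Sum = ' . $sum;
  }
  ?>
 </body>
</html>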
Figure 2: Screenshot of the list of files in your directory