Five Custom Android ROMs You Can Have Fun With
₹ 120
Cool FFmpeg Tricks
Volume: 04 | Issue: 07 | Pages: 108 | April 2016
An Interview With Piyush Somani, Founder, MD And CEO, ESDS Software Solutions Pvt Ltd
An Interview With Sanjay Kaul, MD, Service Provider Business, Cisco India And SAARC
Contents

Admin
22  vnStat: A Lightweight Network Traffic Monitor
24  Use Sync for Stress-free Data Backup
29  Sharing Resources Across Diverse OSs with Samba
33  Ceph: A Storage and Backup Solution for Every Need
37  Manage File Storage to Give the Best Customer Service
42  Bare Metal System Backup and Recovery Using Open Source Tools
90  An Introduction to Riak S2

Developers
49  JavaScript: The New Parts
52  GNU Emacs: How to Work with HTML Mode, Indentation and Magit
56  Get Started with Developing MS Office Add-ins
60  Build a Personal Safety Application in App Inventor 2
64  A Guide to Material Design, a Modern Software Design Language

OpenGurus
67  The Z File System: It's Honest and Different
69  SimpleCV: Making Vision Computing Easy and Effective

For U & Me
73  A Beginner's Guide to Linux
75  GNU Linux Mint 17.3: The Friendly Distro
80  Cool FFmpeg Tricks
83  Getting Familiar with GRUB
87  How to Run Android Without Google
92  An Overview of Open Source Databases
95  Five Cool Custom Android ROMs You Can Have Fun With

REGULAR FEATURES
07  You Said It...
08  New Products
10  FOSSBytes
55  Editorial Calendar
4 | APRIL 2016 | OPEN SOURCE FOR YOU | www.OpenSourceForU.com
YOU SAID IT

The OSFY effect
I am currently a B.Tech student at Christ University, Bengaluru. I have been reading your magazines ever since I joined my professional course and find them amazing. They teach me the topics that a computer engineer must know about in the current tech world, topics that are outside the syllabus. I am also a regular reader of EFY, and I attended the EFY Expo 2016 held recently in Bengaluru. The Expo was a great place for the tech-savvy; I met a few interesting people and also learned a lot about electronics and IoT. I particularly like security-related topics. Also, thanks for bundling the bootable Kali Linux and Backbox distros (otherwise, I would have had to spend time downloading ISO files of 3-4GB). Your magazine, OSFY, has influenced me so much that I have started my own blog on 'Open Source Security Tools', hosted on GitHub (https://hackwith.github.io), and regularly maintained pages on Facebook, Twitter and LinkedIn.
—B.N.Chandrapal, [email protected]
ED: First of all, a big thank you for your words of appreciation. Through the years, we have constantly worked at improving the content, layout and design of the magazine to make your reading experience more enjoyable. It makes us feel great to know how our readers have benefited from reading OSFY. As always, we urge you to continue sharing your views with us, as this motivates us and helps us deliver even better.

Articles for beginners
Please guide me on online courses for Linux. Also, some topics covered in OSFY go over my head, as I am a newbie. Could I request you to include some more articles for beginners?
—Goswami Prashanta, [email protected]
ED: Thanks for your feedback. We always try to carry articles that cater to different types of readers. The current issue of OSFY, for instance, has a number of articles for beginners. As you suggested, we will try to include more articles for newbies. We urge you to continue sharing your opinion with us, as this motivates us and helps us deliver even better.

Missing subscription copy
I did not receive the February issue of Open Source For You magazine. Kindly let me know how to get it.
—Sanjeevkumar Badrinath, [email protected]
ED: We regret the inconvenience caused. Please send an email to [email protected] with details such as your subscription number and the missing issue. The team will revert to you and arrange to send a copy.

Writing for opensourceforu.com
I have over 11 years of experience as a Web developer. My hobby is writing articles on Web development using WordPress, Joomla, Drupal and similar platforms. I would like to contribute such articles to opensourceforu.com, and I hope the content would benefit your audience. Let me know if this is an option.
—Reegan, [email protected]
ED: It's great to hear of your interest in writing for us! To begin, send a detailed Table of Contents (ToC) for the topic you want to write on to [email protected]. Our team will review it, and once they give you the thumbs-up, you can go ahead with the article.

Please send your comments or suggestions to: The Editor, Open Source For You, D-87/1, Okhla Industrial Area, Phase I, New Delhi 110020, Phone: 011-26810601/02/03, Fax: 011-26817563, Email: [email protected]
NEW PRODUCTS An affordable smartphone from Lava Indian smartphone manufacturer Lava has launched an affordable smartphone, namely, the Lava Fuel F2. It features a 12.7cm (5 inch) FWVGA display with an 854x480 pixel resolution. The phone is powered by a 1.3GHz quad-core processor with 512MB of RAM, and runs on the Android 5.1 Lollipop operating system. The dual SIM phone comes with onboard storage of 8GB, further expandable up to 32GB via microSD card. Packed with a 3000mAh battery, the Fuel F2 comes with a 5 megapixel rear camera with LED flash and a VGA front facing camera. The connectivity options of the device include 3G, Wi-Fi, Bluetooth and GPS. The Lava Fuel F2 is available in black and white options, via eBay. Address: Lava International Ltd, A-56, Sector – 64, Noida, Uttar Pradesh 201301; Ph: 01204637100; Website: http://www.lavamobiles.com
Price: ₹ 4,444
Affordable Bluetooth speakers from Zebronics
Zebronics, the IT peripherals and accessories manufacturer, has launched its Bluetooth 2.1 'Rock n Roll' speakers, which are equipped with surround sound, as well as subwoofers with 7.6cm low range drivers. The satellite speakers are powered by 6.3cm mid/high range drivers for high performance sound quality. The Rock n Roll speakers enable users to play music via a USB device or SD card through the USB reader, which supports MP3/WMA dual format decoding. The speakers also support FM tuning. Price: ₹ 1,111. The Zebronics Rock n Roll speakers are available via online and retail stores. Address: Zebronics India, 15/14, 12th A Cross, Sindhi Hospital Road, Sampangiram Nagar, Bengaluru 560027, Karnataka; Ph: 080-41713242
Innovative Bluetooth headphones from Creative Creative, the maker of digital entertainment products, has unveiled its latest Bluetooth over-the-head headphones, named ‘Outlier’, which are crafted from durable material with a flexible headband. The ear cushions are made from soft-protein leatherette that allows them to be worn all day, with comfort. Targeted at urban commuters and fitness enthusiasts, the headphones come with six pairs of interchangeable colour rings, allowing for 30 customised colour combinations. Weighing just 93 grams, they are equipped with a built-in MP3 player, enabling users to insert a microSD card and use them without a paired smartphone. The headphones’ audio is powered by 32mm neodymium dynamic drivers, and a ‘Buddy’ app can be installed to read out the messages received from paired Android devices. They also feature NFC connectivity for quick Bluetooth pairing with supported devices, along with lossless USB audio playback in order to charge the device while playing music.
The headphones include a 3.5mm audio-in socket, which lets users connect them to any source equipment with a typical stereo cable. Bluetooth v4.1 technology supports the MP3, WMA and WAV formats. The company claims that the headphones offer 10 hours of uninterrupted music on a full charge. Price: ₹ 4,881. The Creative Outlier is available via amazon.in. Address: Amazon.com, 26/1, 8th Floor, Brigade Gateway, Dr Rajkumar Road, Malleshwaram West, Bengaluru, Karnataka 560005
4G tablet from Micromax
Indian consumer electronics company Micromax has launched a new tablet in India, the Micromax Canvas Tab P702, its highlight being 4G LTE support at an affordable price. The dual SIM device features a 17.7cm (7 inch) HD (1280x720 pixels) IPS display and runs Android 5.1 Lollipop out-of-the-box. Powered by a 1.3GHz quad-core MediaTek processor, it has 2GB of RAM along with 16GB of inbuilt storage, expandable up to 32GB via microSD card. The tablet is backed by a 3000mAh battery rated to deliver 250 hours of standby and three hours of playtime. As for the camera, the Canvas Tab P702 features a 5 megapixel autofocus rear camera with LED flash along with a 2 megapixel front facing camera. The tablet supports 4G LTE, Wi-Fi, Bluetooth and micro-USB connectivity, and includes dual speakers powered by DTS audio technology. It is bundled with an OTG cable and headphones, and is available in bold black and white, via Snapdeal. Price: ₹ 7,999. Address: Micromax Informatics Limited, 90B, Sector-18, Gurgaon 122015; Ph: +91-124-4811000; Email: [email protected]

High capacity compact SSD from Samsung
Samsung has launched its latest solid-state drive (SSD), the T3. Similar to the previously launched T1, the new T3 looks like a slightly thicker credit card and weighs around 51 grams. The drive sports a dark silver metal body with a black design and an anti-scratch urethane coating, which gives it a premium look. It is powered by Samsung's proprietary Vertical NAND (V-NAND) and SSD TurboWrite technology. The SSD offers sequential read/write speeds of up to 450 megabytes per second (MB/s) over a USB 3.0 SuperSpeed interface, along with AES 256-bit hardware encryption. It has a shock-resistant internal frame, which can withstand an accidental drop from up to 2 metres, along with a thermal guard to manage workload and prevent the device from overheating. The T3 SSD is compatible with a wide range of USB devices, including the latest Android smartphones and tablets. It uses the exFAT file system, so users need not reformat the drive when using it with different computers. Price: ₹ 10,999 for 250GB, ₹ 18,999 for 500GB, ₹ 37,999 for 1TB and ₹ 74,999 for 2TB. The Samsung T3 SSD is available in all four capacities across leading online and retail stores. Address: Samsung India Electronics Pvt Ltd, 2nd, 3rd and 4th Floors, Tower-C, Vipul Tech Square, Sector-43, Golf Course Road, Gurgaon 122002, Haryana (India); Ph: 0124-4881234; Website: www.samsung.com/in

Sony's pocket-sized projector for smartphones
Sony recently unveiled its portable projector for smartphones, the MP-CL1. The pocket-sized projector comes encased in an aluminium body with a matte finish and weighs around 210 grams. It uses a laser light source and offers full HD resolution images with a contrast ratio of 80,000:1, at an aspect ratio of 16:9 and a maximum brightness of 32 ANSI lumens. The MP-CL1 is a short-throw mobile projector capable of displaying a screen size of up to 304.8cm (120 inches) at a distance of 3.45 metres. It offers several connectivity options, including HDMI or MHL, and features built-in screen mirroring that allows users to mirror content via Wi-Fi. It is suited for business presentations as well as home entertainment, and comes with a built-in speaker for playing content from smartphones and tablets. The projector uses Sony's laser beam scanning technology, which enables focus-free projection and does not require even surfaces. According to company sources, the Sony MP-CL1 lasts for up to 120 minutes of HD resolution playback on its built-in 3,400mAh battery. Price: ₹ 45,899. It is available via amazon.in. Address: Sony India, No. A-31, Mohan Co-Operative Industrial Estate, Mathura Road, New Delhi 110044; Ph: 011 66006600
FOSSBYTES Powered by www.efytimes.com
Linus Torvalds releases the Linux kernel 4.5
Linus Torvalds has announced the release of the Linux kernel 4.5 on the Linux kernel mailing list. The new kernel offers driver updates for USB, watchdog, GPU, RDMA, ATA and sound; improvements for the ARM, x86, MIPS and SPARC architectures; and file system (Btrfs, CIFS, Ceph and JFFS2) enhancements. Overall, the late fixes are quite small: the diffstat appears slightly bigger for an xfs fix, as that fix comprises three clean-up refactoring patches preceding it, and an access-type pattern fix-up to the sound layer generates a lot of noise, though it is pretty simple in the end. There are other random small fixes all over. The announcement includes a shortlog, appended for people who would like to skim the details, and a good summary of everything in the new version is available on the 'Kernel Newbies' page.
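A trivial, reader-side check (a suggestion, not part of the announcement): Python's standard library can report which kernel release a machine is actually running, which is handy for confirming an upgrade to 4.5.

```python
import platform

# platform.release() returns the running kernel release string on Linux,
# e.g. "4.5.0" after moving to the new kernel (other OSs return their
# own version strings).
release = platform.release()
print(release)
```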
Microsoft joins Open Source Eclipse Foundation
Microsoft has announced its association with the Eclipse Foundation, the open source group best known for its Eclipse IDE, which also offers many other developer tools. The collaboration means Microsoft now stands alongside other Eclipse sponsors such as Google, Novell, IBM, Debeka and Oracle. The announcement is a little surprising, as Microsoft offers its own IDE, Visual Studio. Then again, Microsoft already plays an active role in the Eclipse ecosystem through its Azure Toolkit for Eclipse, along with the Java SDK for Azure, which Eclipse users can utilise to build their cloud applications. According to Microsoft's general manager, Shanku Niyogi, "We have worked with the Eclipse Foundation for many years to improve the Java experience across our portfolio of application platform and development services, including Visual Studio Team Services and Microsoft Azure." The collaboration highlights Microsoft strengthening its role in the open source ecosystem. The company has also announced that it will be open sourcing its Team Explorer Everywhere plugin for Eclipse, adding Azure IoT support to the Eclipse Foundation's Kura IoT framework, and launching enhanced support for Java developers in Azure.

Google joins Facebook's Open Compute Project
Facebook's Open Compute Project was started to accelerate the development of Internet hardware and increase its uptake, since the resulting demand would eventually drive down hardware costs. The project became very popular, and big names such as Microsoft, HP and Quanta joined in; only two major players had stayed out: Google and Amazon. In a recent announcement at the annual Open Compute Summit held in San Jose, California, the search giant said it too has joined the project and is working with Facebook on new open source hardware. As Facebook's vice president of infrastructure, Jason Taylor, put it, "The community and Google have inched closer and closer together." Google's announcement is also indicative of other changes among the big Internet players, marking the entry of artificial intelligence technology, which has been making rapid headway in many parts of the world.
Raspberry Pi 3 comes with 64-bit processor and built-in Wi-Fi
The Raspberry Pi Foundation has launched the new Pi 3 with Wi-Fi and Bluetooth, at the same price as the Pi 2. The new version is an iterative update on the Pi 2 and, according to Raspberry Pi Foundation's CEO, Eben Upton, has been in development for over a year. It comes with a quad-core 64-bit 1.2GHz ARM Cortex-A53 chip, which is about 50 per cent faster than its predecessor. Upton further pointed out that, "Our primary goal in moving to A53 from A7 was to get a better 32-bit core. A53 running in 32-bit mode outperforms A7 at the same clock speed by 20-30 per cent."
The new Pi 3 retains the same 1GB of RAM, though the memory clock speed has been upgraded from 450MHz to 900MHz, and the VideoCore IV graphics has been bumped from 250MHz to 400MHz. Both sit on a board that retains the same dimensions as the Pi 2. The addition of integrated Bluetooth 4.1 and 802.11n Wi-Fi removes the need to hunt component sites for cheap USB dongles. Those looking to exchange their old Pi for the new one will need to upgrade their power source, as the Pi 3 requires a 2.5A input. The Pi 3 can be bought from Element14, RS Components and the other usual stockists for US$ 35. It is also available at http://kitsnspares.com for ₹ 3,300.
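As a quick sanity check of the figures quoted above, the clock-speed bump multiplied by Upton's quoted per-clock gain lands close to the 'about 50 per cent faster' claim. One assumption not stated in the article: the Pi 2's CPU ran at 900MHz.

```python
# Rough check of the quoted Raspberry Pi 3 speed-up figures.
# Assumption (not stated in the article): the Pi 2's CPU clock was 900MHz.
pi2_clock_hz = 900e6           # Raspberry Pi 2, Cortex-A7
pi3_clock_hz = 1.2e9           # Raspberry Pi 3, Cortex-A53
per_clock_gain = (1.20, 1.30)  # A53 vs A7 at the same clock, per Upton

speedup = [pi3_clock_hz / pi2_clock_hz * g for g in per_clock_gain]
print([round(s, 2) for s in speedup])  # → [1.6, 1.73]
```

So the combined estimate works out to roughly 1.6x to 1.7x, consistent with (indeed a little above) the headline figure.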
Open source adoption by the Indian government
The government's decision to switch over to open source options, which are free, could result in substantial cost savings, but it also raises concerns about security and operational efficiency, both of which need to be addressed. The government's total IT spend for 2015 was estimated at US$ 66.98 billion by the Ministry of Communications and IT, derived from several global IT research and advisory forecasts. According to NASSCOM's Data Security Council of India, the average expense on cyber security is nearly 2 to 3 per cent of total IT spending, i.e., US$ 1.5 to US$ 2 billion. As per Rahul De, Hewlett-Packard chair professor on ICT for sustainable economic development at IIM Bangalore, a major proportion of the government's IT spend, almost 15-20 per cent, goes on closed software. Transitioning to open software would assist the government in cutting costs. However, open source adoption will take a lot of time, as it needs a certain ecosystem to work smoothly; he said that the government's policy towards it is a positive move. De further stressed that since open source software is available to the public, many feel this makes it easy for malicious users to hack into it. He clarified that the opposite is true: because the source code is freely available, more people work on such software and take care of its vulnerabilities, and it is in fact tougher to hack into open source software than into proprietary software. Among the first government organisations to adopt open source were the US military forces, on the grounds that open source software was the more secure option. Although users of open source can access the algorithm that makes it function, no one can access its encryption key, the set of numbers that acts as a password; without that password, no one can break in. One of the main reasons for the government's push towards open source is the need to incorporate vernacular languages in governance. The government is relying on BOSS (Bharat Operating System Solutions), a Linux based operating system, to meet this particular need and to impart improved knowhow on open source across India. However, experts are also of the opinion that the government should focus on applications first and then move on to operating systems. Further, the IT ministry has approved a policy on 'Collaborative Application Development by Opening the Source Code of Government Applications'. This policy aims to speed up application development and roll-out through the adoption of open source principles, and the government wants to encourage the reuse of already developed applications. It also intends to boost innovation, both within the government and outside, by encouraging collaborative development to bring out better products, faster, sources claimed.

NGA releases seven tools under open source licence
The National Geospatial-Intelligence Agency (NGA) of the USA provides combat and intelligence agencies with critical and often classified intelligence. The NGA is trying to share and collaborate on geodata in the open, and has expanded its transparency efforts with the release of seven open source tools on GitHub. These are:
1. DigitalGlobe Reader: a C++ class for loading TIFF imagery and associated XML metadata files, along with images obtained from DigitalGlobe's WorldView-3 satellite.
2. SWIR signal detection: runs a spectral signature comparison between the provided spectral library file and a signature provided by the user.
3. Social media picture explorer: along with its user interface, utilises machine learning techniques to help users explore social media by grouping similar images and offering instant object recognition.
4. Nounalyzer: categorises the nouns and entities in an RSS feed and displays the results through visualisations, helping analysts distil information faster.
5. WordPress revision slider: modifies a WordPress theme to interactively show a post's revisions and get readers comfortable with consuming stories written under version control.
6. Rational polynomial coefficients mapper: maps an object's latitude, longitude and altitude to a two-dimensional point.
7. Spectral library reader: reads files in the splib06a spectral reflectance database so that users can graph the spectral reflectances of the materials in a multiband image.
The GEOINT Pathfinder project, launched in August 2015, assists the NGA in navigating the competitive world of commercial geospatial intelligence in an unclassified setting. According to sources in the NGA, the project endeavours to answer research questions via unclassified tools, data, information technology and services available from public or open sources.

TP-Link blocks open source router firmware in US
TP-Link, a networking hardware vendor, has said that it will stop the loading of open source firmware on routers sold in the United States, in compliance with new Federal Communications Commission (FCC) requirements. The FCC is trying to restrict interference with other devices by limiting user modifications that can result in radios operating outside their licensed radio frequency (RF) parameters. The FCC has stated that it is not aiming to ban the use of third-party firmware like DD-WRT and OpenWrt. Theoretically, router manufacturers can still allow open source firmware, so long as they deploy controls which prevent devices from operating outside their permitted frequency range, types of modulation, power levels, etc. However, users of open source firmware worry that hardware manufacturers will ban third-party firmware entirely, as that would be the simplest way to comply with the FCC requirements. The FAQ on TP-Link's website confirms this: it accepts that TP-Link is limiting the functionality of its routers. The company stated, "The FCC requires all manufacturers to prevent user[s] from having any direct ability to change RF parameters (frequency limits, output power, country codes, etc)." TP-Link further stated that the change will be in effect for routers produced on and after June 2, 2016, though the requirements will be incorporated gradually.
White House to share custom code with open source community
The White House has released a draft of its Source Code Policy, which sets out rules for sharing customised software among federal agencies. The policy aims to improve government access to applications as well as to lower development costs. It also states that the Obama administration will launch a pilot programme requiring federal agencies to release nearly 20 per cent of their third-party-developed custom code as open source software, making it completely accessible to external developers in the open source community. As stated by the US CIO, Tony Scott, in a blog post, "Through this policy and pilot programme, we can save taxpayer dollars by avoiding duplicative custom software purchases, and promote innovation and collaboration across federal
agencies. We will also enable the brightest minds inside and outside of government to review and improve our code, and work together to ensure that the code is secure, reliable and effective in furthering our national objectives." The administration has launched the policy with the aim of fulfilling its commitment to adopt a government-focused open source software policy, which was originally proposed in its Second Open Government National Action Plan. The new law sets out open sharing guidelines for any new custom software that has been built for the US government websites and systems. These could be developed internally or via contracted third-party vendors, apart from cyber security software.
The new policy states that before agencies officially commission a new custom software development project, they must carry out a thorough 'alternatives analysis' to determine whether a pre-existing software solution at another agency has source code that can be shared for the task at hand. Agencies will also need to identify and consider any off-the-shelf software solutions that may be available before commissioning a custom project. Even though the new guidelines do not apply to software contracted via third parties before the Source Code Policy was published, the policy strongly encourages agencies to take steps to make those applications available for shared inter-agency use as well.
Linux Foundation and edX bring the Free Open Source Cloud Infrastructure Course
The Linux Foundation has announced its latest Massive Open Online Course (MOOC), now open for registration, which will permit individuals worldwide to embark on lucrative careers by building and managing cloud technologies. The MOOC is an introduction to cloud infrastructure technologies and is conducted via edX, the non-profit online learning platform launched in 2012 by Harvard University and the Massachusetts Institute of Technology (MIT). The course is free and commences in June 2016. It is the second edX MOOC offered by the Linux Foundation; the Foundation's first course, 'Intro to Linux', has attracted over 500,000 students worldwide and continues to grow. According to Jim Zemlin, executive director at The Linux Foundation, "The response to our first edX course was powerful, demonstrating the understanding among students and early career professionals all over the world that open source is the pathway to a lucrative career in technology." He added that through the introduction to cloud infrastructure technologies, anybody can start learning the basics of building and managing some of today's most pervasive software, thus granting professionals a strong position in the IT talent market. The course, called LFS151, will offer a primer on cloud computing and the use of open source software to maximise development and operations. It will cover next-generation cloud technologies including Docker, CoreOS, Kubernetes and OpenStack, and will offer an overview of software-defined storage and networking solutions, along with a review of DevOps and continuous integration best practices. The course has twelve chapters, each ending with a short quiz, and a final exam must be taken to complete the course. Students can take the complete course at no cost, or add a verified certificate of completion for US$ 99.

Microsoft releases source code for SONiC
Microsoft has released the source code for Software for Open Networking in the Cloud (SONiC), which is used for running network devices like switches and has been built in collaboration with leading networking vendors Arista, Broadcom, Dell and Mellanox. SONiC will help in building switches that are rich in functionality, and with it, the hardware's functionality and applications can reduce dependence on proprietary firmware from a traditional networking vendor. SONiC is based on Microsoft's Linux-based Azure Cloud Switch (ACS) operating system; ACS is also the brain behind the switches in Microsoft's Azure cloud. The code is capable of running on various types of hardware from different equipment manufacturers, and utilises a common C API, the Switch Abstraction Interface (SAI), for programming the specialist chips in the networking gear. This implies that ACS can be used for controlling and managing network devices, and can implement the features needed irrespective of who manufactured the hardware. According to Azure's CTO, Mark Russinovich, "SONiC is a collection of software networking components required to build network devices like switches." SONiC can be downloaded from Microsoft's Azure GitHub repository under a mix of open source licences, which include the GNU GPL and the Apache licence.
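The SAI mentioned in the SONiC piece is, at its core, a method-table abstraction: the network OS asks for a table of operations, and each vendor's driver supplies its own implementation. That idea can be sketched in a few lines of Python (hypothetical names, purely illustrative; the real SAI is a C API with function-pointer tables):

```python
# Illustrative sketch of a switch-abstraction method table.
# All names here are hypothetical; this is not the real SAI interface.

class MockVendorPortApi:
    """Stand-in for one vendor's ASIC driver implementation."""

    def __init__(self):
        self.port_state = {}

    def set_port_admin_state(self, port_id, up):
        # A real driver would program the switching ASIC here.
        self.port_state[port_id] = "up" if up else "down"
        return 0  # SAI-style status code: 0 means success

# Registry mapping an API id to a vendor-supplied method table, loosely
# mirroring how the abstraction layer hands back function pointers.
_API_REGISTRY = {"port": MockVendorPortApi()}

def api_query(api_id):
    """Return the vendor's method table for the requested API."""
    return _API_REGISTRY[api_id]

port_api = api_query("port")
status = port_api.set_port_admin_state(7, up=True)
print(status, port_api.port_state)  # → 0 {7: 'up'}
```

The point is the indirection: the network OS codes only against `api_query()` and the table's methods, so any vendor's hardware can slot in underneath without changing the software above it.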
Microsoft to open source Minecraft based AI platform
Microsoft plans to open source AIX, an artificial intelligence experimentation platform based on Minecraft that its own researchers already use. The platform permits researchers to use unstructured play in the Minecraft game as a testing ground for AI research.
AIX is going to be open sourced by summer this year under an open source licence. The announcement comes at a time when Google DeepMind has gained attention for the Go matches its AI program AlphaGo is playing against top Go player Lee Se-dol; AlphaGo has won three straight games of the five-game match in Seoul, losing one to Se-dol. With AIX, Microsoft is focusing on projects that involve general intelligence, which it feels "…is more similar to the nuanced and complex way humans learn and make decisions." According to Allison Linn, a senior writer at Microsoft, AI researchers are developing tools for individual skills, recognising words, for example, but have not been able to combine these skills as effortlessly as humans do. Microsoft acquired Mojang, the developer of Minecraft, in 2014. Linn also stated that the AIX platform, developed by Microsoft's lab in Cambridge, UK, comprises a 'mod' for the Java version of Minecraft and code that helps artificial intelligence agents sense and act within the Minecraft environment. Both components can run on Windows, Linux or Mac OS, and researchers can use any programming language to program their agents.
In The News
PGDay India 2016: A SUMMARY OF THE SHOW
A conference that witnessed experience-sharing on a wide range of topics by PostgreSQL users and developers was held in Bengaluru on February 26, 2016. Here is a summary of the day.
The annual PostgreSQL PGDay event was held on February 26, 2016, at Courtyard by Marriott in Bengaluru. The event was a grand success with over 150 attendees representing 50+ companies, more than double last year's number of participants. What was also interesting to note was that almost 80 per cent of the attendees were from outside the city. Robert Haas, the well-known PostgreSQL contributor, made the keynote address, talking about the past, present and future of this open source database. There were talks on Vacuum, foreign data wrappers, logical replication, etc, in the developer track. The speakers gave some great insights into customer use cases involving PostgreSQL. The presentation by speakers from the National Informatics Centre got the most applause, thanks to its wholehearted endorsement of PostgreSQL in a myriad projects, ranging from police-FIR systems and education certificates to online appointments at government hospitals. The government's policy of supporting open source, and the tremendous cost savings with no compromise in performance or functionality, was well received.

Speakers from InMobi, a company that has enterprise-scale usage of PostgreSQL, talked about performance tips to get the best out of this database when used on a large scale. The talk on Postgres-XL, a distributed multi-master hybrid OLTP/OLAP offering, was also well received. The event concluded with the DBA track, in which useful insights were given into how to detect and avoid data corruption, apart from inputs about DBA maintenance activities. With an impressive list of sponsors comprising 2ndQuadrant, EnterpriseDB, GE Predix, Ashnik, Tashee Linux Services and OpenSCG, the event was well supported. The enthusiastic response augurs well for the increased adoption of PostgreSQL in the years to come.

Reported by: Ashish Mehra, Nikhil Sontakke and Pavan Deolasee. All three were a part of the organising team for the event.
CUSTOMER SUCCESS SNAPSHOT
CRIS BOOKS MORE, HAPPIER PASSENGERS WITH INFRASTRUCTURE POWERED BY RED HAT
CHALLENGE
• Support nearly 2 million passengers a day on over 2,500 trains nationwide.
• Handle a workload of 25,000 concurrent users during peak booking periods.
• Deploy a highly scalable solution based on open standards.

SOLUTION
• Deployed Red Hat® Enterprise Linux® as the underlying infrastructure for mission-critical systems, increasing scalability to support online ticket bookings.
• Developed the CRIS Journey Planner (JP), an application for passengers based on open Java™ EE 6 standards, which caters to 25,000 concurrent users and provides a high-performing, user-friendly, and highly available interface.
• Collaborated closely with Red Hat Consulting during the planning, execution, and implementation of the projects.

BENEFITS
• Gained an operational, state-of-the-art journey planning system in the most cost-effective manner with the shortest time to market.
• Avoided vendor lock-in and achieved interoperability by integrating Red Hat Enterprise Linux into the existing infrastructure.
• Increased customer satisfaction by developing and launching a reliable, streamlined train journey planning system.

Centre for Railway Information Systems (CRIS) | www.cris.org.in | Industry: Government | Customer since: 2006 | 2014 Innovation Award Winner
Software and services: Red Hat Enterprise Linux
"…leadership through updates and support that help us to improve our application tuning so that we can give our customers the best possible experience."
SUNEETI GOEL, CHIEF PROJECT ENGINEER, PASSENGER RESERVATION SYSTEM GROUP, CRIS
Learn more about Red Hat customer successes: redhat.com/customersuccess
In this month's column, we take a break from natural language processing and discuss a few interview questions in general computer science and data science.
In last month's column, I had described certain techniques for information extraction from unstructured texts using natural language processing. In this month's column, we take a break from this and discuss a few interview questions in general computer science and data science.

1. Given an arbitrary integer N, how many unique binary trees can be created containing N nodes? Can you come up with an algorithm which can programmatically list all possible binary trees consisting of N nodes?

2. You are given two unsorted lists of integers. You are asked to create a sorted list consisting of all the elements in the two lists. Can you write an algorithm for this? What is the running time of your code, in the worst case?

3. You are given a list of N nodes in a tree, with each node represented by an integer. For each node, you are also given the details of its parent node. You are not given the edges of the tree explicitly, but you are told that it is a binary tree. You are asked to write code to perform an in-order traversal of the tree using this information. How would you solve this problem: (a) if you are told that you can perform pre-processing on the information and can use any amount of additional storage; (b) if you are told that you can use only a constant amount of additional storage?

4. You are given an arbitrary tree T (not necessarily binary), with each tree node containing a pair of integers (X and Y). Each tree node can have an arbitrary number of children nodes, at each level of the tree. You are given two numbers, A and B, and are told to find out if the nodes N1 and N2 containing the integers A and B are located at the same level of the tree. Note that N1 and N2 can be two separate nodes or be the same node. It is only required that N1 and N2 should be located at the same level of the tree; they need not be siblings. Write an algorithm to determine whether the tree T contains A and B at the same level. What is the execution time complexity of your code?

5. You are given an array of size N. Each element of the array can take only one of three values (0, 1 or -1). You are told to find the longest contiguous sequence of 1s in the array. Write a C function to find the start and end of the contiguous sequence of 1s in the array. Now you are told that you can flip a contiguous sequence of K elements from -1 to 1, where K is a dynamically provided input to the function. In that case, how would you determine the longest contiguous sequence of 1s in the array after flipping those K elements? What is the time and space complexity of your function?

6. Given an array of N integers, you are asked to find out whether the given array is an arithmetic series (a sequence of numbers where there is a constant difference between any two consecutive numbers of the series). What is the complexity of your function? Now, you are given a modified problem where your task is to convert a given array into an arithmetic sequence by dropping off those array elements which prevent it from being an arithmetic sequence. How would you solve this problem? What is the time complexity of your function?

7. Given an arbitrarily long character string, you are asked to find out the most frequently occurring character as well as the least frequently occurring character. Write a C function to solve this problem, given a character string of size N. Now you are told that instead of being given a fixed size character string, you are given a stream of characters which come dynamically as input. At any point in time, you should be able to print the most frequent and least frequent characters in the stream (note that if two or more characters have the same highest frequency, you can choose to print any of them as the most frequent character). What is the time and space
CodeSport
complexity of your algorithm?

8. You are given an array of N integers and told that the array is mostly sorted, with only k elements not being in their correct sorted position in the array, and with k being very small compared to N. What sorting algorithm would you use to completely sort the array?

9. You are given an array of N integers. You are asked to find the maximum and minimum elements in each of the sub-sequences of length K of the array, where K is an arbitrary integer. (a) Given an array of N integers and a value K, how many sub-sequences of length K are possible? (b) What is the complexity of computing the maximum and minimum for each of the sub-sequences of length K? How would you solve this problem in polynomial time, for any arbitrary value of K?

10. You are given a list of N integers, where the numbers can only be positive. You are asked to write code to create two lists of size N/2 such that the sums of the elements of the two sub-lists are equal, if at all it is possible to create such sub-lists. Your function should return the sum of the sub-list elements if it is possible to split the original list into two such sub-lists, or return 0 if that is not possible. What is the time complexity of your function? (Note that you can move the elements of the list.) You are now told that the original list can contain either positive or negative integers, and are asked to solve the same problem. What changes would you make to your earlier solution?

11. Consider the problem given in (10). Now, instead of splitting the list into two sub-lists such that their sums are equal, you are asked to split the list at any arbitrary index such that the difference between the sums of the two sub-lists is the minimum among all possible splits of the original list. Unlike problem (10), you cannot move around the elements of the list. What is the time and space complexity of your function?

12. You are given two sentences S1 and S2, containing N and M words, respectively. You are asked to convert S1 into S2 by either deleting a word or adding a new word. Modifying word1 to word2 can be considered as a word deletion followed by a word addition. Each insert operation has a cost of +2 and each delete operation has a cost of +1. Given two arbitrary sentences S1 and S2, the edit distance of (S1, S2) is the minimum total cost of operations needed to convert S1 into S2. You are asked to write a function to compute the edit distance. What is the
time and space complexity of your function?

13. You are given a long piece of text T and three sub-strings S1, S2 and S3. You are asked to find the shortest sub-string X of T such that X contains S1, S2 and S3 as sub-strings. What is the time and space complexity of your solution?

14. You are given a sorted array of N integers (where N can be of the order of millions). Given an arbitrary value K, you are asked to find out how many times it repeats in the array. Can you write a C function to do this in O(log N) time complexity?

15. Consider the following code snippet:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    void *my_mem = 0;
    int alloc_blocks = 0;

    while (1) {
        my_mem = (void *) malloc(1024 * 1024);
        if (!my_mem)
            break;
        memset(my_mem, 1, (1024 * 1024));
        printf("allocated memory blocks is %d MB\n", ++alloc_blocks);
    }
    exit(0);
}
Can you explain what would happen when you compile and run the program? If you have any favourite programming questions/ software topics that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, happy programming!
By: Sandya Mannarswamy
The author is an expert in systems software and is currently working as a research scientist in Xerox India Research Centre. Her interests include compilers, programming languages, file systems and natural language processing. If you are preparing for systems software interviews, you may find it useful to visit Sandya's LinkedIn group, Computer Science Interview Training India, at http://www.linkedin.com/groups?home=&gid=2339182
Guest Column Exploring Software
Anil Seth
Visualising Data Using googleVis Viewing statistics in motion charts can be great fun. This article describes how googleVis software can be used to view motion charts. The author gives the reader an interesting example of data picked up from the Bombay Stock Exchange and how it can be viewed as a motion chart.
Viewing statistics can be an exciting experience. If you are surprised, look at one of the motion chart presentations by Hans Rosling, at his TED talk https://www.youtube.com/watch?v=RUwS1uAdUcI. The software behind the presentation of motion charts was acquired by Google and integrated into the Google Charts API. Normally, I would have experimented with the Python API. However, I came across the R package 'googleVis' on GitHub and decided to experiment with that instead. It provides a motivation to experiment with R. R is a statistical tool, which assumes considerable significance in the context of making sense of Big Data.
Installing googleVis
You need to install R and then the googleVis package under R. For example, on Fedora, use the following commands:

$ sudo dnf install R
$ sudo R
> install.packages("googleVis")
You may be prompted to select a repository from which to install R packages. Once the installation is complete, you can test it with just two commands, as follows:

$ R
> suppressPackageStartupMessages(library(googleVis))
> plot(gvisMotionChart(Fruits, "Fruit", "Year", options=list(width=600, height=400)))
starting httpd help server ... done
You will get a 600x400 (width x height) motion chart in the browser and can play around with various bubbles moving with time. You can find videos of googleVis charts on YouTube in case you are interested.
Getting started with R
In case you are not already familiar with R, at this point, try
a sample session in 'An Introduction to R', which is included in the R distribution. You can access the built-in help, including the introduction, as follows:

> help.start()
starting httpd help server ... done
This will bring up HTML documentation. You can navigate to ‘Appendix A: A sample session’ in ‘An Introduction to R’.
Exploring googleVis
Now, let's get back to understanding and exploring the googleVis package. 'Fruits' is a data frame included in the distribution of the googleVis package. The command data() will list all the data frames available. However, we are currently interested in the googleVis package only. So, find out the data frames available in googleVis, load Fruits and print its contents as follows:

> data(package="googleVis")
Data sets in package 'googleVis':

Andrew          Hurricane Andrew: googleVis example data set
Cairo           Daily temperature data for Cairo
CityPopularity  CityPopularity: googleVis example data set
Exports         Exports: googleVis example data set
Fruits          Fruits: googleVis example data set
...

> data(Fruits)
> Fruits
    Fruit Year Location Sales Expenses Profit       Date
1  Apples 2008     West    98       78     20 2008-12-31
2  Apples 2009     West   111       79     32 2009-12-31
...
9 Bananas 2010     East    81       71     10 2010-12-31
You can learn more about googleVis in the R interpreter, with the following:

help(googleVis), which will give a brief description of the package and guide you to further information about it;
demo(googleVis), which will guide you through examples of charts included in the package, including gvisBubbleChart and gvisMotionChart; and
vignettes(googleVis), which will give you an introduction to the package.
Experimenting with stock market indices
The next step is to use your own data. An easy and interesting resource is to download historical indices data from the BSE (Bombay Stock Exchange) site. You can save the data as CSV files. The daily data will look something like what follows:

Date              Open       High       Low        Close
1-January-2016    26101.5    26197.27   26008.2    26160.9
4-January-2016    26116.52   26116.52   25596.57   25623.35
5-January-2016    25744.7    25766.76   25513.75   25580.34
You can merge and organise the data from the various CSV files in a spreadsheet so that it is suitable for importing in R. The YMD date format is very convenient as there is no ambiguity. You may want to add a column that shows the difference between the high and low values as an indicator of volatility. Suppose you want to look at the Sensex, MidCap, SmallCap and IT indices; the properly organised data will look like what follows:

Index     Date         Open       High       Low        Close      Hi-Lo
Sensex    2016-02-29   23238.5    23343.22   22494.61   23002      848.61
Midcap    2016-02-29   9584.31    9651.77    9389.35    9575.1     262.42
Smallcap  2016-02-29   9567.63    9594.03    9399.43    9548.33    194.6
IT        2016-02-29   10488.4    10507.62   10044.59   10229.49   463.03

Save this sheet in CSV format as BSE-indices.csv. This file can be easily imported as a data set in R, though the date field needs to be explicitly identified:

> D=read.csv("BSE-indices.csv", header=TRUE, colClasses=c(Date="Date"))
> M <- gvisMotionChart(D, idvar="Index", timevar="Date", sizevar="Hi.Lo")
> plot(M)

Notice that the 'Hi-Lo' becomes 'Hi.Lo' in R. This value is used as the default value for the size of the bubble in the above chart. The chart will open in a browser window and needs Flash Player. Figure 1 shows the initial image. The Open column is used as the x-axis and the next column, High, is used as the y-axis. However, the columns used for the x and y axes can be changed in the chart. The colours for various indices can also be changed. If you
Figure 1: Initial display of a chart
Figure 2: A chart showing some trails as the bubbles move in time
click on the Play option, the values and bubble sizes will change with time. Change the x-axis column to Date, the y-axis column to Closing and colours to Unique. Select the Bank, IT and Sensex indices for showing the trails. You can see the results in Figure 2. You may notice that the values of some of the indices are too close to each other to be viewed effectively. The chart has the option to select an area and zoom in, as you can see in Figure 3. The range of data sets available in the public domain is increasing, some of which you may explore on https://www.google.com/publicdata/. The data made available by various government departments in India is available through https://data.gov.in/. Visualisation tools like the one discussed above help you make sense of it all.
Figure 3: After zooming in, a chart showing some trails of the movement of bubbles
By: Dr Anil Seth The author has earned the right to do what interests him. You can find him online at http://sethanil.com and http://sethanil.blogspot. com, or you can reach him via email at [email protected].
Please share your feedback/ thoughts/ views via email at [email protected]
Admin Let’s Try
vnStat: A Lightweight Network Traffic Monitor
vnStat is a lightweight command line tool to monitor network traffic bandwidth usage; its integration with a PHP Web frontend presents that usage in a nice graphical format. Here's a tutorial on installing and using vnStat with a PHP frontend.
There are occasions when it is useful to know your bandwidth usage pattern. This not only helps administrators to detect the root cause of traffic related issues, such as a network overload, but also helps them to keep a tab on traffic flow to and from the Internet, and the charges levied by the ISP for the bandwidth used. This is an important task which requires dedicated and effective software. vnStat is just the tool for that. It is a lightweight (command line) network traffic monitor, which provides several output options like summaries on an hourly, daily, monthly, weekly or 'top 10 days' basis. It can monitor selected interfaces, stores network traffic logs in a database for further analysis and can be used without root permission.

Prerequisites for installation
I have chosen a system with CentOS version 6.7 (equivalent to RHEL 6.7) with minimal installation, which acts as a gateway through which eth0 is connected to the Internet and eth1 is connected to the local network. First, install all the required packages and dependencies for vnStat and the PHP Web frontend using the Yum utility:

[root@sun ~]# yum install wget epel-release php httpd php-gd

Installation
To install vnStat, execute the command given below in a terminal:

[root@sun ~]# yum install vnstat

To view all the available interfaces on your system that vnStat can monitor, use the following command:

[root@sun ~]# vnstat --iflist
Available interfaces: lo eth0 eth1

Now add the eth0 and eth1 interfaces for monitoring traffic, as follows:

[root@sun ~]# vnstat -u -i eth0
Error: Unable to read database "/var/lib/vnstat/eth0"
Info: -> A new database has been created

For the first time, it will display the above error. Ignore this, as sufficient data is not available for display. Similarly, add the second interface, i.e., eth1, as follows:

[root@sun ~]# vnstat -u -i eth1
Error: Unable to read database "/var/lib/vnstat/eth1".
Info: -> A new database has been created.
Method for updating the database
The database can be updated either by using cron or by running a service.
Cron: To use cron, just remove the comment from /etc/cron.d/vnstat, after which the entry should be as shown below:

# vi /etc/cron.d/vnstat
MAILTO=root
*/5 * * * * vnstat /usr/sbin/vnstat.cron

:wq (save and quit)

Service: The second method uses vnStatd, a program that collects all the information and runs as a daemon. You can start it by executing the command below:

[root@sun ~]# /etc/init.d/vnstat start

Enable the service at boot time:

[root@sun ~]# chkconfig --level 345 vnstat on
To start traffic monitoring for the selected interfaces, start the vnstat service. Then change to a suitable working directory:

[root@sun ~]# cd /usr/local/src
Download the latest vnStat PHP frontend
Traffic data for the Public Interface (eth0), as displayed by the frontend:

Summary       In         Out       Total
This hour     10.52 MB   1.00 MB   11.52 MB
This day      10.52 MB   1.00 MB   11.52 MB
This month    50.33 MB   8.64 MB   58.97 MB
All time      50.33 MB   8.64 MB   58.97 MB

Top 10 days: 16 February 2016, 18 February 2016, 10 February 2016, 03 February 2016, 09 February 2016, ...
from http://www.sqweek.com, as follows:

[root@sun src]# wget -c http://www.sqweek.com/sqweek/files/vnstat_php_frontend-1.5.1.tar.gz
Extract the downloaded archive in a Web-accessible directory, e.g., /var/www/html/:

[root@sun src]# tar xzf vnstat_php_frontend-1.5.1.tar.gz
[root@sun src]# mv vnstat_php_frontend-1.5.1 /var/www/html/vnstat
Edit the config.php file and set the following parameters as per your set-up:

[root@sun src]# vi /var/www/html/vnstat/config.php
$language = 'en';
$iface_list = array('eth0', 'eth1');
$iface_title['eth0'] = "Public Interface";
$iface_title['eth1'] = "Local interface";
If you are using SELinux, restore the file(s) default SELinux security contexts by using the restorecon command: [root@sun src]# restorecon -Rv /var/www/html/vnstat/
Make sure that the correct local time zone is set in php.ini. Edit php.ini and set date.timezone = "Asia/Kolkata":

[root@sun src]# vi /etc/php.ini
date.timezone = "Asia/Kolkata"
Now enable the Web service at boot:

[root@sun src]# chkconfig --level 345 httpd on
Start the Web service httpd as follows: [root@sun src]#service httpd start
You can access vnStat with the PHP frontend by pointing your browser to http://ipaddress/vnstat, i.e., http://192.168.1.140/vnstat/. To allow port 80 on the server from outside, add a rule in iptables by executing the commands below:

[root@sun src]# /sbin/iptables -A INPUT -m state --state NEW -p tcp --dport 80 -j ACCEPT
[root@sun src]# /etc/init.d/iptables save
[root@sun src]# /etc/init.d/iptables restart
To secure the vnStat PHP frontend with user authentication, add an entry at the end of httpd.conf, wrapping the directives in a Directory block for the vnstat directory:

<Directory /var/www/html/vnstat>
    AuthUserFile /var/www/vnstat-htpasswd
    AuthName "vnStat user"
    AuthType Basic
    require valid-user
</Directory>

Then create the password file with a user name of your choice:

[root@sun src]# htpasswd -s -c /var/www/vnstat-htpasswd vnstatuser
After making changes in the httpd.conf file, restart the httpd service: [root@sun src]#service httpd restart
The -s option of htpasswd adds the user with forced SHA encryption of the password. Now point your browser to http://192.168.1.140/vnstat/. It will ask for the user name and password. After entering both, you will be able to see the vnStat PHP frontend main page, as shown in Figure 1. You can get details like the summary, and hourly, daily and monthly graphical reports of bandwidth usage for the configured interfaces.
By: Suresh M. Jagtap The author is a Linux enthusiast who loves to explore various Linux distributions and open source software. You can contact him at [email protected]
Admin Let’s Try
Use Sync for Stress-free Data Backup A disk crash or a corrupted file can have disastrous consequences. A safe bet is to back up one's data. The author narrates his personal experience with data backup issues and gives a tutorial on the use of Sync, which he recommends for home users.
Most of you are probably active users of various services like Gmail, Flickr, Google Photos, Apple iPhotos and Instagram, or at least must be aware of them. Many of you might also be users of digital storage options like Dropbox. While these services do give you the flexibility of easy availability of data anywhere in the world (assuming you have the bandwidth to access the data), you need to maintain a local copy (on your laptop or desktop) of all your data, much as vendors may try to convince you that all the data is safe with them. In case the local copy gets corrupted, one can always depend on cloud storage. While data stored in the cloud is mostly safe, there have been cases where the cloud service provider has accidentally deleted user data. Additionally, data can be lost in case there is a data breach and the cloud provider's service gets compromised or passwords get stolen. Considering these possibilities, some of us are not comfortable relying entirely on cloud services. Personally, I make limited use of cloud-based storage services and mostly depend on keeping data in local storage. Having made the decision of 'going local', I need to take steps to ensure that the data is properly backed up, so that I will have a copy of the data in case the primary storage fails. For important things like photographs, email and financial documents, I
maintain two backup copies of data. In other words, I have the actual data in local storage, as well as a backup of the data on two different portable HDDs (hard disk drives). Due to this approach, I need to use a well-defined and simple backup process.
A survey of personal backup solutions Over the years, I have used many methods to maintain backup copies of data. The first method obviously was to maintain a copy of each file. Then I started storing compressed versions of the data and using a timestamp for each backup. Just as enterprises have had to address the Big Data explosion, I too have not been immune to the problem of large volumes of data. Due to data volumes and the frequency of changes, the timestamp method of maintaining backups is not practical. Additionally, each zip archive ends up with one copy of the file and when a particular copy is needed, it is difficult to figure out the most relevant copy. I also tried using version-control software like SVN. The problem with version control is that it’s sub-optimal for binary data. In other words, version control is best suited for text files, where versions are stored using the ‘difference’ method. In the case of binary files, most such solutions simply copy the binary file into the repository, as
finding a ‘difference’ between two binary files is not simple and straightforward. Ideally, version control software, when used, should be hosted on a separate server, which is not an option at home. Additionally, maintaining the repository on an external HDD was next to impossible. Then I started searching for personal backup software that would allow me to maintain backups of data and that, too, on external media like a portable HDD. I tried multiple solutions, two of them being Microsoft SyncToy and Cobian Backup. While I liked each of these tools (with a preference for Cobian Backup), I found that the GUI interface, and the way these tools stored the backup commands, created a problem that needed additional effort each time I wished to use the tools for backup. While using these tools, I faced a problem with the backup commands due to the way Windows uses external media. Whenever external media is connected to a computer, there is no guarantee that it will be assigned the same drive letter as the one assigned the last time the media was connected. For example, if the connected media is assigned the drive letter H: today, there is no guarantee that it will be assigned the same drive letter tomorrow. If you have other media already connected, the portable HDD will get assigned the next available drive letter. As the backup commands configured in these tools refer to an absolute path that includes the drive letter, the command does not work if the drive letter is different. If the external HDD is assigned a different drive letter, you will have no choice but to edit the backup command to point it to the present drive letter. Now imagine the effort needed if such changes have to be applied to multiple backup commands, each time you wish to take a backup. While searching for more tools, I came across a simple backup tool, namely, Sync (alternately named Syncdir).
Introducing Sync
Sync is a Java application that needs to be invoked from the command line. I can almost hear you grumble: “Ew! The command line. How can you even recommend a command line application in these days of GUI systems?” But, in my experience, this command line interface provides the maximum flexibility. On my local machine, I always have a JVM instance available, either as a JDK or a JRE, along with properly configured PATH and JAVA_HOME variables. To make the task of backing up easier, I have created batch files that contain the backup command (using Sync), and stored them on the portable device. Thus, whenever I wish to take a backup, I simply connect the portable device and execute the backup script. As the source directory does not change (being local storage) and the destination directory is specified using a relative path, the batch scripts do not need to be edited, irrespective of what drive letter is assigned to the device. When launched, the script executes in a dedicated command prompt and I can continue with other activities, monitoring the progress once in a while. Once the backup is complete, I go through the log file to check for errors, and my backup work is done. By using a Windows batch file, my backup activity is reduced to a double-click or an ‘Enter’ key-press, which is all that’s needed to launch the script. The scripts have made the task so simple that even while maintaining two backup copies of my personal photographs, email and other documents, the only trouble I face is keeping to a regular schedule so that the backup copies stay suitably fresh.

Using Sync
To use Sync, the command format is as follows:

java -jar Sync.jar [“Source”] [“Target”]

Using this command synchronises the [“Target”] to match the [“Source”]. It should be noted that only the [“Target”] is modified; by default, the file name, size, last-modified time and a CRC-32 checksum of the file are used to match files. If [“Source”] is a directory, the source and target directories are matched recursively: matched target files are time-synced and renamed if necessary, unmatched source files are copied to the target directory, and unmatched target files and/or directories are deleted. If [“Source”] is a file, the source and target files are matched while ignoring the file name. If the files match, the target file is time-synced and renamed if necessary; if the target does not exist, the source file is copied to the target.

Figure 1: Taking backups for the first time
Figure 2: Updating the backup. New files added
Figure 3: Updating the backup. A file has been updated

A few switches
Some of the switches that can be used with Sync are:

-s, --simulate        Simulate only; do not modify the target
--ignorewarnings      Ignore warnings; do not pause
-l, --log:<”x”>       Create log file x; if x is not specified, sync.yyyyMMdd-HHmmss.log is used
-r, --norecurse       Do not recurse into sub-directories
-n, --noname          Do not use the file name for file-matching
-t, --notime          Do not use the last-modified time for file-matching
-c, --nocrc           Do not use the CRC-32 checksum for file-matching
--time:[x]            Use an x-millisecond time tolerance for file-matching (a 0-millisecond tolerance is used by default; use --time:1000 or more to avoid mismatches across different file systems)
--rename:[y|n]        Always [y] / never [n] rename matched target files
--synctime:[y|n]      Synchronise the time of matched target files
--overwrite:[y|n]     Overwrite existing target files/directories
--delete:[y|n]        Delete unmatched target files/directories
--force               Equivalent to the combination: --rename:y --synctime:y --overwrite:y --delete:y

www.OpenSourceForU.com | OPEN SOURCE FOR YOU | APRIL 2016 | 25

Batch script: sync-generic.bat
Given below is a ‘generic’ script that can be invoked by other backup scripts. The reason for using such a script is that each actual backup script then contains only minimal details and is easier to maintain. Additionally, changes to the commands, if any, need only be made in the central script to take effect for all dependent scripts, reducing maintenance effort.

@echo off
set SYNC_HOME=./Sync.jar
rem delete the existing log file
del log\%1.log
java -jar %SYNC_HOME% --log:log/%1.log --force %2 %3
echo Backup for %1 done.

Batch script: backup-doc.bat
Given below is the script that takes a backup of the ‘doc’ directory, using the generic script for this purpose.

@echo off
set NAME=doc
call sync-generic.bat %NAME% D:\%NAME% .\%NAME%
pause
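For readers who want the same relative-path trick on Linux, here is a rough shell analogue of the two batch files above. The actual Sync invocation is shown as a comment, since it assumes Sync.jar is present on the drive; the default names and the echo stand-in are purely illustrative.

```shell
#!/bin/sh
# backup.sh - run it from the root of the portable drive, so the
# destination stays a relative path and mount points don't matter.
# Usage: ./backup.sh <name> <source-dir>
NAME=${1:-doc}          # backup set name, 'doc' by default
SRC=${2:-/tmp}          # source directory, /tmp by default
LOG="log/$NAME.log"

mkdir -p log "$NAME"    # make sure the log and target directories exist
rm -f "$LOG"            # delete the existing log file, as the .bat does

# The real tool invocation would be something like:
#   java -jar Sync.jar --log:"$LOG" --force "$SRC" "./$NAME"
echo "sync $SRC -> ./$NAME" > "$LOG"

echo "Backup for $NAME done."
```

Because every path in the script is relative to the drive root, the script keeps working no matter where the drive is mounted.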
While there are many personal backup solutions, my tool of choice is Sync, a Java-based command line tool. Though it does not have a fancy GUI, it helps me maintain backups with little effort. To make the task easier, I use Windows batch files, so that a double-click (or an ‘Enter’ key-press) on a script is all it takes to kick off the backup process.

References
[1] Syncdir, http://syncdir.sourceforge.com/
[2] Best free backup software: http://www.in.techradar.com/news/software/applications/Best-free-backup-software-11-programs-we-recommend/articleshow/38877922.cms
[3] 13 Best backup software: http://www.pcadvisor.co.uk/test-centre/software/13-best-backup-software-2015-2016-uk-3263573/
By: Bipin Patwardhan The author is a senior technical architect at Capgemini India. He believes in the value of open source and has developed many customer solutions using open source Java libraries like Apache XMLBeans.
Sharing Resources Across Diverse OSs with Samba Welcome to the world of Samba. It runs on multiple platforms such as Windows, UNIX and Linux, and enables you to share your files and printers between systems running various OSs. It works as an alternative to NFS.
Samba is an open source network file system. It allows access to shared resources across different OSs such as UNIX, Linux, IBM System 390, OpenVMS, etc. It is an open source implementation of Microsoft’s SMB (Server Message Block)/CIFS (Common Internet File System) protocol.
The main components of Samba Samba is made of three daemons (smbd, nmbd and winbindd). There are two services (smb and winbind) that control the behaviour of daemons. smbd: The server message block daemon provides sharing and printing services over the network. It also controls resource locking, user authentication and data sharing through the SMB protocol. The Samba server listens for SMB traffic on the default TCP ports, 139 and 445. The smb service controls the behaviour of the smbd daemon. nmbd: This provides NetBIOS names over the network. It allows the browsing protocols to make up the Windows Network Neighborhood view. The Samba server listens for NMB traffic on the default UDP port 137. The smb service also controls the behaviour of the nmbd daemon. winbindd: This resolves user and group information on a server running Windows NT 2000 or Windows Server 2003. It renders Windows user/group information in a format that is understandable by UNIX platforms.
A Microsoft RPC call, the name service switch (NSS), and pluggable authentication modules (PAM) allow this daemon to achieve this. This allows Windows NT domain users to access a UNIX machine as UNIX users. The winbind service controls the behaviour of the winbindd daemon. It doesn’t require the smb service to be started. It is a client side service, which is used to connect to Windows users. Samba also contains several important files. smb.conf: This basic configuration file can be edited directly with a text editor or via a utility program like SWAT or Webmin. smbpasswd: This contains the names of users to which smbd will refer when it is time to enforce user-level access control. smbusers: This provides the names of Samba’s administrators and of users with specialised Samba access.
Samba’s features The primary features of Samba are shown in Figure 2. The other features are:
- Establishes shares on UNIX that are accessible to Windows users
- Shares printers
- Creates a naming scheme so one can use user-friendly names rather than IP addresses
- Makes UNIX a Windows Internet Naming Service (WINS) server
Figure 1: The main components of Samba
- smbd: provides sharing and printing services; controlled by the smb service
- nmbd: provides NetBIOS names over the network; controlled by the smb service
- winbindd: resolves user and group information on a server running Windows NT 2000; controlled by the winbind service

Figure 2: Features of Samba
- Serve directory trees and printers to Linux, UNIX and Windows clients
- Provide Windows Internet Name Service (WINS) name server resolution
- Authenticate Windows domain logins
- Assist in network browsing (with or without NetBIOS)
- Act as a backup domain controller (BDC) for a Samba-based PDC
- Act as an Active Directory domain member server

Why use Samba? Here are a few reasons: Samba easily replaces a Windows NT file/print server. It offers more stability and better performance compared to Windows NT. It allows interoperability between UNIX and Windows workstations. It can serve as a Windows primary domain controller. Samba can serve as a highly stable WINS server. It can be used to stabilise Windows browsing services.

Installation and configuration
Samba installation To install Samba on a Linux machine, use the following command:

yum -y install samba

To create a new Samba user (the new Samba user must exist on the Linux machine as a user), use the following command:

smbpasswd -a

For example, to add the user max, execute the command given below:

smbpasswd -a max

To create a Samba users group (smbgrp), execute the following command:

groupadd smbgrp

To add a user into a group, use the command given below:

usermod -G smbusers max

Samba configuration The smb.conf file stores all the configuration for Samba. It provides runtime configuration information for the Samba programs, and determines which system resources you can share with the outside world and what restrictions you wish to place on them. The file consists of sections and parameters. The name of a section must be enclosed in square brackets, and each section continues until the next one begins. Each section starts with a header such as [global], [homes], [printers], etc. The [global] section has the variables that define sharing for all resources. The [homes] section allows remote users to access their (and only their) home directory on the local (Linux) machine; that is, users can access their share from Windows machines and will have access to their personal home directories. The [printers] section works the same as the [homes] section, but for printers. If you change this configuration file, the changes do not apply until you restart the Samba daemon with the command service smb restart. To view the status of the Samba daemon, use the following command:

/sbin/service smb status

To start the daemon, use the following command:

/sbin/service smb start

To stop the daemon, use the following command:

/sbin/service smb stop

To start the smb service at boot time, use the following command:

/sbin/chkconfig --level 345 smb on

Access management using Samba Table 1 gives the access control options.
Option          | Parameters                 | Function                                                  | Default | Scope
Admin users     | String (list of usernames) | Users who can perform operations as the root              | None    | Share
Valid users     | String (list of usernames) | Users who can connect to a share                          | None    | Share
Invalid users   | String (list of usernames) | Users who will be denied access to a share                | None    | Share
Read list       | String (list of usernames) | Users who have read-only access to a writable share       | None    | Share
Write list      | String (list of usernames) | Users who have read/write access to a read-only share     | None    | Share
Max connections | Numeric                    | Maximum number of connections for a share at a given time | Zero    | Share
Guest only      | Boolean                    | If yes, allows only guest access                          | No      | Share
Guest account   | String (name of account)   | UNIX account that will be used for guest access           | Nobody  | Share
Table 1
Table 2 gives the username options.

Option         | Parameters        | Function                                                                       | Default | Scope
Username map   | String (filename) | Sets the name of the username mapping file                                     | None    | Global
Username level | Numeric           | Indicates the number of capital letters to use when trying to match a username | Zero    | Global
Table 2
The authentication of clients is shown in Table 3.

Option   | Parameters                    | Function                                                      | Default | Scope
Security | Domain, server, share or user | Indicates the type of security that the Samba server will use | User    | Global
Table 3
The password configuration options are listed in Table 4.

Option             | Parameters             | Function                                                                                                 | Default                            | Scope
Encrypt passwords  | Boolean                | If yes, enables encrypted passwords                                                                      | No                                 | Global
UNIX password sync | Boolean                | If yes, updates the standard UNIX password database when users change their encrypted password           | No                                 | Global
Password chat      | String (chat commands) | Sequence of commands sent to the password program                                                        | See earlier section on this option | Global
Password chat debug| Boolean                | If yes, sends debug logs of the password-change process to the log files with a level of 100             | No                                 | Global
Password program   | String (UNIX command)  | Program to be used to change passwords                                                                   | /bin/passwd %u                     | Global
Password level     | Numeric                | Number of capital-letter permutations to attempt when matching a client’s password                       | None                               | Global
Update encrypted   | Boolean                | If yes, updates the encrypted password file when a client connects to a share with a plain-text password | No                                 | Global
Null passwords     | Boolean                | If yes, allows access for users with null passwords                                                      | No                                 | Global
smbpasswd file     | String (filename)      | Name of the encrypted password file                                                                      | /usr/local/samba/private/smbpasswd | Global
hosts equiv        | String (filename)      | Name of a file that contains hosts and users that can connect without using a password                   | None                               | Global
use rhosts         | String (filename)      | Name of an rhosts file that allows users to connect without using a password                             | None                               | Global
Table 4
Samba server types and security modes
There are three different types of servers available:
- Domain controller: a primary domain controller (PDC), a backup domain controller (BDC) or an ADS domain controller
- Domain member server: an Active Directory domain server or an NT4-style domain server
- Standalone server

There are two types of security modes available for Samba, share-level and user-level, known as security levels.

Figure 3: User-level security modes (domain, server, Active Directory and user-level security)

Share-level security: This can be implemented in only one way. With share-level security, the server expects a password for each share; it is not dependent on a particular user name. In smb.conf, the directive that sets share-level security is:

[GLOBAL]
security = share

User-level security: This is the default setting for Samba; it is used even if the security = user directive is not listed in the smb.conf file. Once the server accepts the user name/password, the user can then access multiple shares without entering a password for each instance. In smb.conf, the directive that sets user-level security is:

[GLOBAL]
security = user

There are different implementations of user-level security (see Figure 3).

Domain security mode: In this mode, the Samba server has a domain security trust account, which ensures that all authentication requests are handled by the domain controllers. It can be achieved by using the following directives in smb.conf:

[GLOBAL]
...
security = domain
workgroup = MARKETING

Active directory security mode: In an Active Directory environment, this mode is used to join as a native Active Directory member. The Samba server can join an ADS realm using Kerberos, even if a security policy restricts the use of NT-compatible authentication protocols. In smb.conf, this is achieved with the security = ads directive, together with the appropriate Kerberos realm settings.

Server security mode: This is used when Samba is not capable of acting as a domain member server. In smb.conf, this mode is achieved by using the following directives:

[GLOBAL]
...
encrypt passwords = Yes
security = server
password server = “pwd”
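Pulling these directives together, a minimal user-level smb.conf might look like the following. This is a sketch, not from the article: the share name, path and user are illustrative.

```
[global]
        workgroup = MARKETING
        security = user
        encrypt passwords = Yes

[homes]
        comment = Home directories
        browseable = No
        writable = Yes

[shared]
        path = /srv/samba/shared
        valid users = max
        write list = max
        guest ok = No
```

Run testparm to check the file for syntax errors before restarting the smb service.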
By: Palak Shah The author is a senior software engineer. She loves to explore new technologies and learn innovative concepts. She is also fond of philosophy. She can be reached at [email protected]
Ceph: A Storage and Backup Solution for Every Need Backup and storage solutions assume paramount importance these days, when data piles up in terabytes and petabytes and its loss can be catastrophic. Ceph is a next-generation, open source, distributed object store based, low-cost storage solution for petabyte-scale storage.
Storage is a key requirement these days and has to be thoroughly planned, keeping in mind performance, future enhancements, scalability and availability, as all these factors directly impact the functioning of the system. With the recent advances in cloud computing, the need for storage has been broadly classified into three major types: object store, block store and file system store. Ceph is an open source, reliable and easy-to-manage, next-generation distributed object store based storage platform that provides different interfaces for the storage of unstructured data from applications. Ceph, being an object store, provides an interface for all three types of data storage, i.e., object store, block store and file based store, hence giving a unified storage platform. Ceph is the best example of software-defined storage (SDS). Analogous to the principles of cloud computing, Ceph scales out using commodity hardware, avoiding the challenges of scaling up. This helps us define different hardware for different workloads, and manage multiple nodes as a single
entity. With distributed storage, Ceph allows the distribution of the load, which delivers the best possible performance from low cost devices. All this contributes to low cost hardware, high data efficiency, broader storage use cases and greater performance. These are some of the major benefits of Ceph, because of which it has been widely adopted recently by enterprises and cloud service providers. Major storage vendors like SanDisk and other enterprises like Red Hat have contributed significantly to Ceph’s development. So let us take a closer look at it.
Architecture and design
Ceph is a scalable storage cluster based on RADOS (reliable, autonomic distributed object store), which is a self-healing and self-managing distributed object storage system. It has two types of nodes, Ceph Monitor and Ceph OSD. A Ceph Monitor node maintains the state of the cluster map to check for node availability, and provides the cluster map to the cluster clients. A Ceph OSD daemon is an intelligent peer, which stores the data, checks its own state and that of other OSDs, and reports back to the monitors. In case an OSD goes down, the Ceph OSD daemon automatically triggers data rebalancing based on the new cluster map from the Ceph Monitor. Data is always transferred from the OSD nodes and never from the monitors to the clients.

This architecture of Ceph might seem familiar to those who know about the Google File System (GFS) and the Hadoop Distributed File System (HDFS), but it is also very different from them in multiple ways. Ceph uses an algorithm known as CRUSH (controlled, scalable, decentralised placement of replicated data) for random and distributed data storage among the OSDs. Ceph doesn’t need two round-trips for data retrieval like HDFS or GFS, in which one trip is to the central lookup table to find the data location and the second trip is to the located data node. The location of every bit of data stored in Ceph OSDs is self-calculated using the CRUSH algorithm, independent of any other attribute. When a client requests data from Ceph, the CRUSH algorithm is used to find the exact location of all the requested blocks, and the data is transferred by the responsible OSD nodes. As and when any OSD goes down, a new cluster map is generated and the duplicate data of the crashed OSD is transferred to a new node based on results from the CRUSH algorithm.

Figure 1: Ceph architecture (clients access RADOS through RADOSGW, RBD, CephFS or LibRADOS; the RADOS cluster consists of OSDs and monitors)

Ceph as object storage
Object stores have certain advantages over traditional block storage. Every stored object is given an object ID, which makes it faster to search/access and easier to manage. An object store is highly scalable for a large number of objects but has limitations on the size of a single object. Ceph provides the LibRADOS native library, which allows direct access to RADOS from apps, with support for C, C++, Java, Python, Ruby and PHP. Ceph also features RADOSGW, a RESTful HTTP API serving as the object gateway, which is compatible with Amazon’s S3 and OpenStack’s Swift services. Ceph can be used to build a cloud object storage solution like Dropbox or Google Drive using the OpenStack cloud platform. HDFS, GFS, Gluster-Swift, EMC Atmos and NetApp StorageGRID Webscale are a few of the other leading object store systems.

Ceph as block storage
Block store is the traditional form of disk data storage, where the data is divided into blocks and stored using a file system. Block store is best suited for VM disk volume storage needs, where we store large singular files with high read and write frequencies. Ceph block storage can serve well as a SAN (storage area network) solution. One added advantage of Ceph is that it has thin provisioning enabled, which helps in faster data duplication and efficient storage space utilisation. When a disk stored in Ceph is replicated, it doesn’t occupy any space for the replicated copy: Ceph uses the COW (copy-on-write) method, where a block is duplicated only when it is written to. Amazon’s EBS (Elastic Block Storage) is a popular block storage service, used generally for the volumes of VM instances.

Ceph as file system storage
A file system based storage is like any NAS (network attached storage) system, where the file system is managed by a remote storage device. Ceph offers CephFS (the Ceph file system), which provides a POSIX-compliant file system as an interface. A client system can mount this file system and access the file storage. CephFS uses the Ceph MDS (Ceph metadata server) in the storage cluster to store the metadata about the file system. This metadata consists of the file system tree structure, timestamps, permissions and other POSIX-compliant data. The Ceph MDS can also be distributed, by dividing the tree structure and storing it in different Ceph MDSs. This metadata is stored like any other object in the Ceph OSDs, and is replicated for backup and recovery purposes. FreeNAS and NAS4Free are some of the NAS solutions available; the Lustre file system is a famous open source parallel file storage solution; and Oracle ZFS Storage and Dell Storage SCv2000 are a few of the popular NAS solutions from proprietary vendors.

Ceph in cloud platforms
Storage is an important component in the cloud service providers’ (CSPs) list of services. Major CSPs provide multiple forms of cloud storage services, which include rapid storage, archival storage, object store, block store, NAS, SAN, VM disk storage, etc. All these solutions generally require different dedicated hardware and software in order to
maintain QoS and performance. Ceph as a single storage cluster allows multiple data storage interfaces, which can utilise different hardware for different storage needs, managed by a single SDS system. Ceph divides the OSDs into placement groups for the CRUSH algorithm. These placement groups can be combined to form a pool, which is like a logical partition for storing the objects in Ceph. Pools can help differentiate between storage hardware based on performance. Ceph also has cache tiering, which helps in creating a pool of faster storage devices as cache storage for expensive read/write operations. This results in improved performance and efficient utilisation of the storage hardware. With OpenStack as the cloud platform, Ceph can be used as a Swift object store and a Cinder block store, utilising the same storage hardware for multiple needs. Ceph can also be used with other cloud platforms like CloudStack, Eucalyptus and OpenNebula.
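The client-side placement idea behind placement groups — every client hashes an object’s name to find where it lives, with no central lookup — can be caricatured in a few lines of shell. This is purely illustrative and is not the real CRUSH algorithm:

```shell
# Toy placement: map an object name to one of 4 placement groups
# by hashing it. Every client computes the same mapping on its own,
# so no central lookup table is needed - the core idea behind CRUSH.
pg_for() {
    hash=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo "pg.$(( hash % 4 ))"
}

pg_for "vm-disk-001"    # the same name always lands in the same group
pg_for "vm-disk-001"
pg_for "backup.tar"
```

Real CRUSH additionally weights devices, respects failure domains and handles replica placement, but the deterministic, lookup-free mapping is the same in spirit.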
Ceph as a backup solution
We have been using RAID systems for redundancy and data backup purposes. With Ceph as the storage platform, data redundancy and recovery are managed by the RADOS storage system. RADOS has self-healing and self-managing features, which help in recovering the data from lost OSD nodes using instant replication and rebalancing. Ceph also allows taking incremental snapshots of block storage using RBD. OpenStack has a Ceph backup driver, an intelligent backup solution for VM volumes: it backs up the volumes to Ceph’s backend storage, and regularly performs incremental backups on the volumes to maintain consistency. CloudBerry Backup for Ceph is a popular tool, which provides versatile control over Ceph’s backup and recovery mechanism. According to several research organisations like Gartner Inc., Ceph has made a strategic entry into the enterprise IT space and will prove to be the next big evolution in storage technology. At the current adoption rate, Ceph may soon surpass the existing storage solutions at enterprises. A lot of development is expected to happen around Ceph, bringing significant performance improvements to match the current proprietary solutions. Even without taking future enhancements into consideration, Ceph is a storage platform that definitely needs to be looked at for your next big deployment.
By: Krishna Modi The author has a B. Tech degree in computer engineering from NMIMS University, Mumbai and an M. Tech in cloud computing from VIT University, Chennai. He has rich and varied experience at various reputed IT organisations in India. He can be reached at [email protected]
Manage File Storage to Give the Best Customer Service
“There’s never enough of storage, no matter how big a disk you have,” seems to be the refrain of all computer users. There are, of course, various solutions to tide over this problem, especially now, with the arrival of cloud computing. This article takes the reader through the steps required to create a file storage system and its management. It could well set you up as a STaaS (Storage as a Service) provider in cloud computing.
We can’t deny the fact that the traditional model of data management and hosting has changed, and users now want their data and resources to be centralised. This has become possible because of innovations in virtualisation technology. It has become much easier for large organisations and even small businesses to centralise not only their storage but everything else they come across in the IT world. With the rapid increase in structured and unstructured data, storage needs to be managed well, without wasting resources.
If we talk about Storage as a Service (STaaS) in cloud computing, and you are a cloud service provider, a lot of your resources get wasted once fixed storage space is allotted to a client: most of that space is rendered useless, since the client doesn’t use some of it and you are not allowed to use it either. To minimise problems like this, it is preferable to use file data storage as a shared storage solution, rather than object storage or a storage area network (SAN). In this article, I will lead you through the procedure of using and managing a file storage solution, along with qemu-img and disk virtualisation. I will also take you through some techniques of disk virtualisation that can be really useful for your cloud server. I have used simple techniques that can be easily understood; they are not very striking, but will really help you at different levels of storage system development. All the tools used are generally available on most Linux systems; if not, they can be easily downloaded via your default online repository. The system used to test these commands is RHEL 7.0, but they will work on other Linux systems in the same way, with a little modification.
File storage: Solving the storage problem
If you are a STaaS provider, you will one day definitely come across the problem of a lot of your storage being wasted. This is because, if a user purchases 10GB of storage space from you, you are forced to allocate the full 10GB, regardless of whether he will use all this space at once. Another challenge you will face frequently is maintaining the right amount of storage to provide the best service to your customers. As an example, if your data storage has 100GB of space left, someone may unexpectedly demand 200GB of storage from you, and you may have to refuse. But a good provider should fulfil all user requests and provide 24/7 service. So a good solution is to use file storage, with which you can scale your storage according to the data inside it and allocate space that you don’t even have. File storage relies on the fact that everything in your operating system is a file. In file storage, we create a file, use it as our virtual partition, format it with the desired file system and mount it. All these operations are done in our real disk partition (/dev/sda*) with some small tricks and manipulation. For this procedure, we use sparse images as our disks.

Sparse files: A sparse file is a specific type of file that aims to use the file system space more efficiently, by using metadata to represent empty blocks. In sparse files, blocks are allocated and written dynamically as the actual data is written, rather than at the time the file is created. So, if you create a 10GB sparse file, it will not even take 1MB of your disk space, but its properties will show 10GB, as that is the allocated size. So let’s start the procedure. A sparse file can be created by using either the truncate or the dd utility in Linux (other tools are also available):

$truncate --size=1GB test.img
The above command will create a 1GB sparse image test.img. To get the same result with dd, use the following command: $dd if=/dev/zero of=test.img bs=1024 count=0 seek=$[1024*1000]
Here, bs is block size and the size is provided by seek. You can also use the following code for simplicity: $dd if=/dev/zero of=test.img bs=1 count=0 seek=1G
If you want to allocate the whole disk space at once (which is not a good solution in our case), you can use fallocate, as follows: $fallocate -l 1G test.img
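The ‘takes no space at creation’ claim is easy to verify on any Linux box: compare the file’s apparent size with the number of disk blocks actually allocated (the file name here is illustrative).

```shell
# Create a 1GB sparse file, then compare its apparent size
# with the blocks actually allocated on disk.
truncate --size=1G demo.img
stat -c 'apparent size: %s bytes' demo.img   # 1073741824
stat -c 'allocated blocks: %b' demo.img      # typically 0 - nothing written yet
du -h demo.img                               # real usage, ~0 for a fresh sparse file
rm demo.img
```

The apparent size is what `ls -l` and the file’s properties report; `du` (without `--apparent-size`) reports the blocks the file really occupies.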
Note: Here’s a little trick: can I create a 2TB partition on my 1TB hard disk? The answer is, “Yes!”, since sparse files do not take up space at creation time:

$truncate --size=2TB mylargefile.img
Create any desired file system on it, as follows:

$mkfs.ext2 mylargefile.img
$mount mylargefile.img /mnt/
You now have a 2TB partition. After creating the file, you can format it and directly mount it in order to use it as your storage, but to enable it to do lots of disk operations it first has to connect to the loop device using the losetup utility. To attach the file to your loop device, use the following command: $losetup -f test.img
…where test.img is your formatted file. To see all such files or block devices connected to the loop device, use the following command: $losetup -a
You can easily grep the last created loop device by using the command below:
Insight
$losetup -a | tail -1
After creating the loop device, it’s time to format it. You can use any file system, but I recommend using the ext2 file system (the reason will be explained later). To format, use mkfs utils: $mkfs.ext2 /dev/loop0
If you are working on big data management, then the xfs file system is preferred, because it works well on handling large files and supports larger inode data. Now you can mount it to use as your virtual storage, which can be shared, as follows:
Admin
$mount /dev/loop0 /mnt/

To share it across the network using a service like NFS, open /etc/exports and add the following line:

/mnt/ *(rw,sync,no_root_squash)

Close it and restart the NFS service by using the command given below:

$service nfs restart

Once you put data inside the device, you can check the space it actually occupies by using the qemu-img utility, as follows:

$qemu-img info test.img

…or by using du, as shown below:

$du -h test.img

General disk operations
Scaling up the size: If a user wants more storage, that can get difficult for you, since this type of storage doesn't normally support disk operations like resizing (especially online resizing), and you can't unmount the storage or take it offline. With a small trick and a bit of manipulation, however, you can achieve this; just follow the procedure outlined here. Suppose a user wants his space to be increased by 1GB (i.e., to 2GB in total); you first need to grow the file by 1GB. For that you can use the same truncate tool, since it works on already created files as well as creating new ones:

$truncate --size=2GB test.img

…or:

$qemu-img resize test.img +2GB

www.OpenSourceForU.com | OPEN SOURCE FOR YOU | APRIL 2016 | 39

You will find that there is no change in the mounted space, since neither the file system nor the loop device has detected the increased size. The file first has to be checked by e2fsck, using the following command:

$e2fsck -f test.img

After that, since the file system still spans only the old size, it has to be resized. However, we don't want data to be lost or deleted, and we can't afford the break in service of unmounting it for even a few seconds, so we do an online resize by using the command given below:

$resize2fs test.img

We have specifically used the ext2 file system, since online resizing and file checking work reliably with the resize2fs utility on this file system. This makes the file side of our partition ready for use. However, these changes have not yet been detected by the loop device, so the same operations have to be done on it:

$losetup -a | grep mnt
$losetup -c /dev/loop0

With the -c operation, losetup detects the increase in size, but you still need to grow the file system on the loop device:

$resize2fs /dev/loop0

With this, the new resized storage is ready to be used.

Decreasing the storage: To decrease the partition size, do not simply shrink the file with the truncate utility, since that may leave half-written data chunks or bad blocks, and due to these bad blocks you will not be able to use the storage until you format it again. The best solution is to shrink the file system layer on the file using the resize2fs command:

$resize2fs test.img 1G

…where 1G is the new decreased size. You do not need to worry about the size shown in the metadata, since the user will only be able to use the formatted space; and because the file is sparse, the remaining space will not take any space on your system. The loop device just needs to detect the changed size, as follows:

$losetup -c /dev/loop0

Now your storage of decreased size is ready.

Tip: To detach the storage from the loop device, use the following command:

$losetup -d /dev/loop0

Backing up your data: Backup, snapshots and clustering are features that every storage service provider must have. Snapshots are discussed at a later stage in this article; let's talk first about backing up data, a feature that creates a backup file saving all the useful data for future use. A backup can be taken with the rsync utility:

$rsync -avz test.img test_backup.img

This will create a test_backup.img file with the same data blocks, and every time you run this command, the new changes made since the last backup will get saved in test_backup.img.

Warning: If you create the backup image of your file system through rsync this way, the backup file will not be a sparse file, so it will allocate all of its space at once.

Although this looks simple, it is not a very efficient solution, since once data starts being overwritten in blocks of the original storage image, those changes get copied wholesale into the backup image. A better solution can be to rsync the mounted path instead:

$rsync -avz /mnt/ /mnt_backup
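On the sparseness warning above: rsync has a --sparse (-S) option, and coreutils cp can re-sparsify a copy. A quick demonstration with cp on a scratch file (paths are examples, not the article's image):

```shell
workdir=$(mktemp -d)
truncate --size=100M "$workdir/test.img"

# A sparse-aware copy: holes in the source stay holes in the backup.
cp --sparse=always "$workdir/test.img" "$workdir/test_backup.img"

backup_kb=$(du -k "$workdir/test_backup.img" | cut -f1)
echo "backup allocates ${backup_kb}KB on disk"
```

The backup keeps the full apparent size but allocates almost nothing on disk until real data is written.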
To make the backup process automatic, you can use lsyncd for live synchronisation. Install lsyncd in an RHEL system, as follows: $yum install -y lsyncd
Edit the configuration file lsyncd.conf, as follows:

$cat /etc/lsyncd.conf
-- User configuration file for lsyncd.
-- Simple example for default rsync.
settings = {
    logfile = "/var/log/lsyncd.log",
    statusFile = "/var/log/lsyncd.stat",
    statusInterval = 2,
}
sync {
    default.rsync,
    source = "/mnt/",
    target = "192.168.1.15:/backup/",
    rsync = { rsh = "/usr/bin/ssh -l root -i /root/.ssh/id_rsa", }
}
Note: You first need to set up passwordless SSH access to the backup machine, using keys generated with ssh-keygen. If you want to automate the backup process, you can also use the fsmonitor npm package, as follows:

$fsmonitor rsync -azP /mnt/ /mnt_backup
You can even use rsnapshot for incremental backups.

Snapshots of the storage data: A snapshot facility, available in different Linux distros, saves your storage at particular stages. It can be a useful feature for STaaS providers, because it lets your storage revert to any previous state. A snapshot is different from a backup because it doesn't take up storage space until changes are made in the storage; to save space, it only keeps copies of data that has since been changed or deleted. For our file storage, we will use the qemu-img utility for snapshots (note that qemu-img internal snapshots require the qcow2 image format, not raw). So first create a new snapshot of your storage, as shown below:

$qemu-img snapshot -c backup_snapshot test.img
-c is used for creating a new snapshot. To revert to a particular state, use the following command:

$qemu-img snapshot -a 5 test.img
…where 5 is the snapshot ID. To list all the available snapshots, use the command given below:

$qemu-img snapshot -l test.img
To delete a snapshot, use the command shown below: $qemu-img snapshot -d 2 /images/sles11sp1.qcow2
Securing your virtual storage: Another advantage of using file storage is that it's easy to ship, like a container. But with shipping comes the responsibility of securing your storage. A good solution for securing your virtual data storage is encryption, after which a password is needed every time it is mounted. For encrypting our storage, let's use dm-crypt. We will try encryption on a fresh file storage, as follows:

$truncate --size=2GB encrypted.raw
Next, set up a LUKS header, as follows:

$cryptsetup luksFormat encrypted.raw

Warning: Don't try this with an already formatted partition, because it will delete all the previous data inside the partition.

This will prompt you to enter a fresh passphrase. To gain access to the device, use the command given below:

$cryptsetup open encrypted.raw my_encp.raw

my_encp.raw is the name under which our partition is mapped in /dev/mapper/. Now you can create a file system on top of it, as follows:

$mkfs.fstype /dev/mapper/my_encp.raw

Mount the newly created partition anywhere with the following command:

$mount -t ext2 /dev/mapper/my_encp.raw /mnt/

Once you have finished using the storage, unmount and close it, as follows:

# umount /mnt/
# cryptsetup close my_encp.raw

Note: You can do the same disk scaling and other operations on this storage, but now you need to make the changes on /dev/mapper/my_encp.raw rather than on /dev/loop0.

Using file storage for a virtual machine: A great benefit of file storage is its use as the base storage for virtual machines. Once you decide to use a file store for an OS run in a virtual machine, the other operations, such as scaling and encryption, can be applied to it as well. To create a virtual machine instance from our already created storage, let's use the qemu-kvm utility, as shown below:

$qemu-kvm -name "my_os" -m 1024 -smp 2 \
  -drive file=test.img,if=virtio,index=0,media=disk,format=raw \
  -drive file=ubuntu-14.04.iso,index=1,media=cdrom

This will start a new virtual machine with minimal options selected. Here, -m defines the amount of RAM allocated and -smp refers to the number of cores. You can read about more available options in the qemu-kvm man page or by using qemu-kvm --help. An already created virtual machine can be started in the same way, by pointing qemu-kvm at the existing disk image and omitting the cdrom drive.
File storage can be a good solution for your enterprise environment or personal use, but what matters is how you make it useful at a different level of your cloud solution. Storage management in the case of file storage doesn’t end here since there is a lot more you can do with it. It can be a powerful as well as flexible solution for storage management, but requires thorough knowledge to get the job done.
References
[1] For lsyncd: http://www.linuxtechi.com/install-and-use-lsyncd-on-centos-7-rhel-7/
[2] For the qemu utility: https://www.suse.com/documentation/sles11/book_kvm/data/book_kvm.html
[3] For dm-crypt encryption: https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_a_non-root_file_system
By: Shubham Dubey The author is a B. Tech student at the LNM Institute of Information Technology, Jaipur. Besides being a Linux enthusiast, he works in the fields of cloud computing, virtualisation and cyber security. You can contact him at [email protected] or at his LinkedIn page https://in.linkedin.com/in/shubham0d
Admin How To
Bare Metal System Backup and Recovery Using Open Source Tools Disaster often strikes when least expected. When it’s a question of safeguarding data against disk crash or corruption, it’s best to have a backup and recovery system. Bare metal system recovery is particularly advantageous in that there is no need to even load the operating system to effect data recovery.
Backup and recovery of heterogeneous data files is one of the key aspects of personal as well as corporate computing. In our routine tasks, we create and update a number of files on our systems. The major problems occur when the operating system crashes or a storage medium fails. Such incidents cause huge complications and, many a time, bring our work to a halt, forcing us to recover the files using backup, recovery and disaster management tools and techniques. A number of tools are available in the open source as well as the proprietary domain, which provide secure backup, fast recovery and fault tolerance. The key challenge in disaster management arises when the operating system crashes and we are not able to load the OS to execute a recovery tool. In such scenarios, we can work with bare metal backup and recovery tools, so that there is no need to even load the operating system. Using these tools, backup, recovery and disaster management can be performed at the BIOS level itself.
Features of effective backup and fault-tolerant tools
Some of the major features generally required for personal as well as corporate computing are:
• Bare metal recovery and fault tolerance
• Non-redundant or de-duplicated backup
• Security and privacy using multi-level encryption
• Dynamic compression and archiving
• Dynamic repository management
• Automatic/scheduled backups
• Cloud based backup and recovery
• Import and export for multiple platforms
• Version tracking
• Configurability, with scope for extension using plugins
• Command line as well as GUI panels for usability
• Transaction fault tolerance to avoid data loss
• Volume orientation, to support compression, splitting and merging for multiple devices and platforms
• Malware scanning
• Universal view and updates
• Cross platform support
• Support for multiple data formats
• Support for multiple databases
• Generation of reports, alerts and logs
Free and proprietary software backup tools
A number of software tools are used in personal as well as commercial domains for the backup of files and documents. Table 1 shows Linux based free software with command line interfaces.

Table 1
Program/Tool: Base language
AMANDA: C and Perl
Areca Backup: Java
duplicity: Python
rdiff-backup: Python
Attic: Python
Bacula: C++
Back In Time: Python
DAR: C++
luckyBackup: C++
Mondo Rescue: C and UNIX Shell
obnam: Python
Syncthing: Go
Unison: OCaml
Table 2 lists proprietary backup packages with a GUI (graphical user interface).

Table 2
@MAX SyncUp
ARCserve Backup
Backup Exec
Bvckup 2
Comodo Backup
Dolly Drive
Druva Phoenix
Handy Backup
IBM Tivoli Storage Manager
Iperius Backup
LazySave
Memopal
Novabackup
Norton 360
Retrospect
System Center Data Protection Manager
SyncToy
TotalRecovery Pro
Windows Home Server Computer Backup
Tonido Backup
Ventis BackupSuite 2008
Yosemite Server Backup
Free and open source tools for bare metal recovery

Redo Backup and Recovery
URL: http://redobackup.org
Redo Backup and Recovery is a very user friendly and effective tool for overall backup and disaster management with bare metal data restoration. With this software, disk based disaster management and restoration at the bare metal level is very effective in the case of hardware failure. It is also used as an anti-virus. A complete system backup can be taken even if the hard drive gets burned, or damaged by a virus or malicious applications. Features of Redo Backup and Recovery include:
• No need for installation
• Very small download size of 250MB for the live CD
• Boots in a few seconds
• Access to files and folders even without logging in
• Multi-casting, which allows the imaging of multiple systems in parallel
• Cross platform (Linux, Windows, Mac and multi-boot setups)
• Superior scalability, with no performance degradation even when imaging 50,000 computers
Figure 4: Network based storage in Redo Backup and Recovery

Figure 6: FOG installer
Configuration of disk drives can be done graphically with a number of inbuilt tools. Besides its many backup features, the tool includes a number of additional powerful programs, featured in Table 3.
FOG project URL - https://fogproject.org FOG is another very powerful system imaging tool that is free, open source and based on Linux. FOG can be used with PXE and TFTP without any boot disk or CDs. With it, the system is booted using PXE and automatically a small Linux client is downloaded, after which many operations can be performed on the system. The features of FOG include: It is free and open source
Figure 5: The FOG project's official portal
Mondo Rescue
URL: http://www.mondorescue.org/
Mondo Rescue is a reliable and widely used disaster recovery software. It is used for backing up GNU/Linux based systems to different types of storage media: CD-R and CD-RW disks, DVDs, USB keys/disks, tape drives, NFS mounts and hard disks. Mondo is used by Lockheed-Martin, Siemens, Nortel Networks, IBM, HP, NASA JPL, the US Department of Agriculture and many other organisations worldwide. Mondo supports RAID, LVM 1/2, ext2, ext3, ext4, XFS, JFS, ReiserFS, VFAT and other assorted file systems without any issues. It supports software as well as hardware RAID. This tool is packaged with many distributions and flavours, including Fedora, SLES, RHEL, openSUSE, Ubuntu, Gentoo, Mandriva, Mageia and Debian.

Figure 7: Different options and features of Mondo Rescue
Rear (Relax-and-Recover)
URL: http://relax-and-recover.org
Relax-and-Recover, or Rear, is another bare metal disaster recovery tool that is very easy to install and set up, without any maintenance overheads. The features of Rear include:
• Modular design, written in Bash; easy to extend with custom functionality; targeted primarily at sysadmins
• A 'set up and forget' style of operation: easy to install and set up, with no maintenance issues
• Recovery of images based on the original distribution, using the original tools
• Two-step fault tolerance and recovery
• Bare metal recovery on different types of hardware
• Support for physical-to-virtual (P2V), virtual-to-physical (V2P), physical-to-physical (P2P) and virtual-to-virtual (V2V) migration, across assorted virtualisation technologies
• Support for assorted boot media: USB, PXE, eSATA, ISO and OBDR/bootable tape
• Support for different transport protocols: HTTPS, HTTP, SFTP, NFS, FTP and CIFS (SMB)
• Extensive and exhaustive support for disk layouts: SWRAID, HWRAID, LVM, multipathing, iSCSI, DRBD and LUKS (encrypted partitions and file systems)
• Support for third party backup technologies: EMC NetWorker (Legato), HP DataProtector, CommVault Galaxy, IBM Tivoli Storage Manager (TSM), Symantec NetBackup, Bacula, SEP Sesam, Bareos and Duplicity/Duply
• Assorted methods for internal backup: rsync and tar
• Multi-phase disk layout recovery
• Effective and easy monitoring
• Integration with high performance schedulers

Figure 8: The Relax-and-Recover official page

Working with the Rear tool: You can clone Relax-and-Recover from GitHub as follows:

$ git clone https://github.com/rear/rear.git
$ cd rear/

You can also prepare your own USB media. Relax-and-Recover will 'own' the device in the following example, and this will destroy all data on that device:

$ sudo usr/sbin/rear format /dev/sdb

Relax-and-Recover will ask you to confirm formatting the device (Yes/No). The device is labelled REAR-XXXXX by the 'format' workflow. Next, edit the etc/rear/local.conf configuration file:

$ cat > etc/rear/local.conf <

Now, create a rescue image:

$ sudo usr/sbin/rear -v mkrescue
Using log file: /home/myuser/tmp/quickstart/rear/var/log/rear/rear-fireflash.log
Creating disk layout
Creating root filesystem layout
WARNING: To login as root via ssh you need to setup an authorized_keys file in /root/.ssh
Copying files and directories
Copying binaries and libraries
Copying kernel modules
Creating initramfs
Writing MBR to /dev/sdb
Copying resulting files to usb location
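The contents of the etc/rear/local.conf heredoc above were truncated in this copy. For reference, the upstream Relax-and-Recover quickstart uses a minimal configuration along these lines (treat the exact device label as an example assigned by the format workflow):

```
OUTPUT=USB
BACKUP=NETFS
BACKUP_URL="usb:///dev/disk/by-label/REAR-000"
```

OUTPUT selects the rescue medium and BACKUP_URL tells Rear where to write the backup archive.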
Reboot the system to boot from the USB device. Initiate the backup as follows: $ sudo usr/sbin/rear -v mkbackup
By: Dr Gaurav Kumar The author is the MD of Magma Research and Consultancy Pvt Ltd, Ambala. He is associated with a number of academic institutes, where he delivers lectures and conducts technical workshops on the latest technologies and tools. You can contact him at [email protected] or www.gauravkumarindia.com.
Creating a Storage System Using Rockstor

A storage system for data is one of the things that most computer users need. Solutions vary from the DIY kind to expensive cloud storage. This article points the reader to Rockstor, a DIY network attached storage (NAS) system.
Common complaints from prosumers and small business owners with respect to storage systems are: a server or hard disk tends to crash when it matters most; one often runs out of storage space at the wrong time; and it is a challenge to select the best storage solution from myriad options. Eventually, many users end up choosing an inappropriate storage solution for their needs. This article guides you on how to select an appropriate storage solution. It also describes an easy way to build a complete storage system using inexpensive hardware and Rockstor, a CentOS based Linux distro. By the end of this article:
1. Prosumers will learn how to build a storage system using Rockstor.
2. Small and medium business users will be able to determine whether a Linux based DIY NAS or a dedicated NAS server (Rockstor Pro 8) is appropriate for their needs.
Hardware requirements for building a storage system

When it comes to building your Linux based storage system, there are three approaches you could take:
1. Assemble a NAS from components
2. Build on top of pre-built servers or old hardware, or even a hypervisor
3. Buy a pre-built Linux NAS box from a Linux NAS distro provider

Assembling a DIY NAS from components: This approach is recommended for prosumers who are tinkerers at heart and like to experiment, build and break stuff. The advantages of this approach are its low cost, a high degree of control and the thrill of building a NAS. The disadvantages are that it is time consuming and needs some degree of technical expertise. If you decide to take this approach, you will need to choose the components yourself.
Figure 3: Rockstor dashboard
Figure 1: Rockstor installation
Figure 4: Create storage-available hard disks
A note on minimum system requirements: Each of the Linux NAS distros will have specific system requirements depending on the underlying file system.
Figure 2: Rockstor installation—select date and timezone, and installation destination
Some of the components you'll need:
1. Case (e.g., SilverStone DS380B)
2. Power supply unit (e.g., SS-350FE)
3. Motherboard and CPU combo (e.g., the ASrock MB C2550D4I with the Intel Avoton C2550 or C2750 quad-core processor)
4. RAM (e.g., Kingston 8GB 1600MHz DDR3L DIMM)
5. Boot drive (a USB or a PCIe mSATA)
6. HDDs (e.g., Western Digital hot pluggable 8.89cm or 6.35cm HDDs)

Building NAS on pre-built servers: This approach is recommended for prosumers or SMBs. The advantages of this approach are its cost-effectiveness and flexibility in the choice of hardware (as long as it meets the minimum system requirements) or hypervisors. You can use any of the following four options to build your storage system:
1. Old hardware, such as an old laptop or an old server
2. Appliances such as the Intel NUC, Asus Vivo PC, HP Proliant, etc
3. Hypervisors such as KVM, Proxmox or VirtualBox
4. A combination of virtualisation and appliances
With Rockstor, you’ll need the following: 64-bit Intel or AMD processor 2GB RAM or more (recommended) 8GB hard disk space for the OS One or more additional hard drives for data (recommended) Ethernet interface (with Internet access, for updates) A UPS system (if desired) that is supported by Network UPS Tools DVD drive and a blank DVD, or a USB port and minimum 1GB USB key (for the installation media) Buying a complete NAS solution from an open source Linux NAS distro provider: This solution is recommended for SMBs or even prosumers who would like to save time and don’t mind paying the extra cost. The advantage of this solution is its plug-and-play nature. Rockstor Pro 8 DIY NAD Build is one such solution with BTRFS at its core. This solution is superior to other Linux based solutions available in the market as users have access to advanced features at a fraction of the cost. Some of these features are: Simple and secure browser based management Supports different sized HDDs and has online capacity scaling. Adds more drives as and when you need them Bitrot protection, checksums, compression and other advanced file system (BTRFS) features Efficient Copy-on-Write (CoW) snapshots of shares (on demand or on schedule) File sharing and access from Linux, Mac, Windows and mobile devices (Android and iOS)
Figure 5: Create storage - create a Rockstor pool

Figure 6: Create storage - create a Rockstor share
• Must-have prosumer apps for media streaming (Plex), backup (ownCloud) and file synchronisation (Syncthing)
• Apps for developers and small businesses, such as Jenkins CI, GitLab, Discourse, etc
• Privacy and productivity enhancing apps for everybody
• Efficient Rockstor-to-Rockstor replication for backup and data recovery
Building a NAS storage system using bare metal hardware and Rockstor 3.8-12

I am going to focus on building a NAS on a pre-built server (option 2 mentioned earlier), using an HP Proliant server with 4x1TB HDDs. I've used a USB key as a boot drive to install Rockstor 3.8-12, but you can also use a PCI-Express mSATA III boot drive. You need to prepare a bootable USB with the Rockstor-3.8-12.iso; instructions are available in the Rockstor documentation at http://rockstor.com/docs/quickstart.html#minsysreqs. Then proceed to install Rockstor on the HP Proliant server, following the instructions at http://rockstor.com/docs/quickstart.html#installation.

Once Rockstor 3.8-12 is installed, you've set up a basic storage system, and are now ready to add or remove disks, create a pool (BTRFS volume), create shares (BTRFS subvolumes) and use advanced BTRFS features. I will briefly describe these features.

Disks: Rockstor supports whole disks, not partitions. My HP Proliant box has 4x1TB HDDs, which you can see on the Rockstor Web UI under the 'Storage' tab. I can add additional space by connecting USB and external hard drives to my box. I cannot add more HDDs as there are no empty bays, but if your case has empty bays, you could add more disks. Also, note that disk sda3 holds the system-created rockstor_rockstor pool containing the OS. It's recommended that this pool is not touched except for small shares, and even that only if absolutely necessary. Resizing, compression or deletion of the rockstor_rockstor pool is not permitted.

Pool: A pool is space carved out of a disk or disks. You can use multiple disks to create a single pool, and pools can be resized in capacity by adding or removing disks. One of the advantages of BTRFS over other file systems is that pools can be created by combining disks of different sizes, but these disks have to be whole (not partitioned). To create a pool using the Web UI, go to Storage->Pools, click on 'Create Pool' and specify the 'Name', 'RAID Configuration' and 'Select Disks' that should be part of this pool. Click 'Submit' to create the pool. Advanced users can also apply different compression and BTRFS mount options while creating a pool (or afterwards). You can see the pool 'vinima_photos' in Figure 5.

Share: A share is space carved out of a pool, and provides storage for user generated data. It behaves like a directory in a file system and can be exported from a Rockstor machine using protocols like Samba/CIFS, AFP, SFTP and NFS. A share can be resized, cloned and deleted. To create a share using the Rockstor Web UI, go to Storage->Shares and click 'Create Share'. Specify the 'Name', 'Pool' and 'Size', and then click 'Submit'. As with a pool, advanced users can apply compression options. You can see the 'Photos-2013' share in Figure 6.

Advanced features: Rockstor leverages other Linux technologies, such as Docker, to go beyond the NAS paradigm. Rockstor provides an app framework called Rock-ons. Using Rock-ons, media content can be streamed from Rockstor, and files from various smart devices can be synchronised or backed up to Rockstor. Almost any containerised application can be deployed to take advantage of the solid storage platform provided by Rockstor. As discussed earlier, Rock-ons are Docker based applications that help extend functionality and run applications for syncing and streaming. The installation procedure for all Rock-ons is similar.

I hope that most readers now have a basic understanding of how to choose or build their own storage system. Please refer to the Rockstor documentation to understand the range of applications that can be run on your storage system (https://rockstor.com/docs).
By: Vinima Aggarwal The author is a passionate marketer, and loves technology and marketing. She is currently head of marketing at Rockstor and is based in Silicon Valley, California. You can reach her at [email protected].
Insight Developers
JavaScript: The New Parts This eighth article in our series on JavaScript takes us to a feature known as Promises. This feature can be used in place of callbacks and to avoid its side effects. Promises are called futures or deferreds in other languages.
Before we get started, a word of caution. Understanding Promises can be as difficult as making sense of the Christopher Nolan movie 'Inception'. This article attempts to simplify the topic as much as possible, going beyond syntax to show why Promises are indeed needed. Promises are required if your application performs operations such as database access, file access or fetching user data from online APIs like Twitter's. These operations can take variable amounts of time, and waiting for the calls to complete could stall the application for a few seconds or freeze the UI. The result of the operation could even be an error, after a long wait and another retry attempt. Hence, it is better to make these calls asynchronous (not wait for the results).
JavaScript callbacks

JavaScript began as a browser language, with the primary task of handling user events such as mouse clicks, and it has excellent support for event-driven programming. In this language, functions are first class citizens, which means you can use a function wherever any other primitive data type fits. For example, a function can be passed as an argument, and a function can also return a function. A callback is a function that gets executed upon an event. It is common for frontend developers to write callback functions for mouse events like onclick, onmousedown, etc. Another important point to note is that JavaScript is single-threaded, so any operation (function call) that takes time should be asynchronous. Let's take an example of a callback:

var fs = require('fs');
fs.readFile('readme.txt', printfile);
function printfile(err, filecontents) {
  console.log(filecontents.toString());
}
printfile() is a callback function, which would be called after a read operation. One drawback of a callback function is that the flow of execution is different from the sequence of
instructions written from top to bottom. Another drawback of the callback function is called Pyramid of Doom. As we go on nesting multiple callbacks, it becomes difficult to debug. Some nested callbacks might result in an exception, and tracing exceptions is a nightmare.
Promise

A Promise is a placeholder for a future value: we need somewhere to store the result of an asynchronous operation. A Promise is an object, created with the following syntax:

new Promise(function)
Given below is an example of a function that returns a Promise. We have converted the asynchronous fs.readFile() operation into a Promise.

1.  function readFile(filename) {
2.    let p = new Promise(function(resolve, reject) {
3.      fs.readFile(filename, function(err, contents) {
4.        if (err) {
5.          reject(err); // error case
6.        }
7.        resolve(contents);
8.      });
9.    });
10.   return p;
11. }
Line 2: Creates a new Promise object. The object ‘p’ gets a value after reading the file. Line 3: A Promise takes a function. The function in turn takes two functions ‘resolve’ and ‘reject’ as parameters. For all successful cases, it executes ‘resolve’. For error scenarios, the second parameter function is executed.
Promise states

A Promise has states. Initially, when a Promise is created, it is in the pending state. Once the asynchronous operation completes, the Promise moves to the resolved state. There are two possible states within 'resolve': 'fulfilled' and 'rejected'. A successful result of a resolved Promise is fulfilled; an error case of a Promise is the rejected state. Using the readFile() function, we can see how the Promise code looks.

1.  #!/usr/local/bin/node
2.  "use strict";
3.  console.log('first line');
4.  let fs = require("fs");
5.  function readFile(filename) {
6.    let p = new Promise(function(resolve, reject) {
7.      fs.readFile(filename, function(err, contents) {
8.        if (err) {
9.          reject(err); // error case
10.       }
11.       resolve(contents);
12.     });
13.   });
14.   return p;
15. }
16. let promise = readFile("readme.txt");
17.
18. promise.then(fulfill, reject);
19.
20. function fulfill(contents) {
21.   console.log(contents.toString());
22. }
23.
24. function reject(err) {
25.   console.log(err.message)
26. }
27. console.log('last line');
The output of the above program is shown below:

first line
last line
README file contents
Line 16: This invokes the function readFile(), which returns a Promise. Line 18: After the file contents are read, the Promise reaches the resolved state. For a successful file read, fulfill() gets executed. For an error scenario, such as the file not existing, reject() gets executed. One of the advantages of Promise is chaining, which is done by appending multiple .then() functions. You can also have a single error-handling function using .catch(). Line 18 can be changed to what's shown below:

promise.then(fulfill).then(process).then(processFurther);
…or, what follows:
Figure 1: Promise states diagram

promise.then(fulfill).then(process).catch(reject);
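As a runnable sketch of chaining (the function names here are illustrative, not from the article), each .then() callback receives the value returned by the previous one, and a single .catch() at the end handles a rejection raised anywhere in the chain:

```javascript
// Each .then() receives what the previous callback returned;
// a rejection anywhere in the chain jumps to the final .catch().
function double(n) { return n * 2; }
function increment(n) { return n + 1; }

let chained = Promise.resolve(10)
  .then(double)     // 10 -> 20
  .then(increment)  // 20 -> 21
  .then(function (n) {
    console.log(n); // prints 21
    return n;
  })
  .catch(function (err) {
    console.log(err.message);
  });
```

This is why a chain of .then() calls reads like a pipeline: each step transforms the settled value and hands it on.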
Promise.all() and Promise.race()
Promise provides two more methods. These are used when you have to coordinate multiple Promises (async operations), for instance when you have to gather content from multiple sources/APIs:

Promise.all([p1, p2, p3]).then(function(value) {
    console.log(value);
}, function(reason) {
    console.log(reason);
});
Promise.all() takes multiple Promises as an array. The first function of .then() is executed when all the Promises (p1, p2 and p3) are fulfilled. If even one of them fails, the second function is called for error handling. Promise.race() is similar, except that the parent Promise settles as soon as the first of the Promises settles. This could be used when fetching the same data, say dictionary entries, from multiple sources: the result from whichever source responds fastest is the one displayed. It can also be used to time out an operation after a predefined number of milliseconds.
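The timeout pattern mentioned above can be sketched as follows (not from the article; the delays and the 'timed out' message are illustrative), racing a slow operation against a timer using only standard Promise APIs:

```javascript
// delay() resolves with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(value); }, ms);
  });
}

// timeout() rejects after `ms` milliseconds.
function timeout(ms) {
  return new Promise(function (resolve, reject) {
    setTimeout(function () { reject(new Error('timed out')); }, ms);
  });
}

// The slow "source" takes 200 ms, so the 50 ms timeout wins the race
// and the parent Promise rejects.
let raced = Promise.race([delay(200, 'contents'), timeout(50)])
  .then(function (value) {
    console.log('got: ' + value);
    return value;
  })
  .catch(function (err) {
    console.log(err.message); // prints 'timed out'
    return err.message;
  });
```

Swapping the two delays around would let the real result win the race instead.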
Promises with arrow functions
In previous articles in this series, we covered another ES6 feature called arrow functions. They come in handy when used in combination with Promises. The earlier program, from Lines 18 to 26, can be modified to use anonymous functions, as follows:

promise.then(function (contents) {
    console.log(contents.toString());
}, function (err) {
    console.log(err.message);
});

This can be further shortened with arrow functions:

promise.then((contents) => {
    console.log(contents.toString());
}, (err) => {
    console.log(err.message);
});
node-fetch is a lightweight module that gets the contents of a file or URL, and it returns a Promise. A small Node.js program can use it to fetch a joke from an Internet joke database and print it on the console.
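The shape of such a program can be sketched as follows. Since node-fetch is a third-party module, a stand-in fetch() with the same Promise-returning interface is stubbed here so the flow runs without network access; the URL and the `joke` field are illustrative assumptions, not from the article:

```javascript
// Stand-in for node-fetch: returns a Promise for a response object whose
// json() method also returns a Promise, mimicking the real module's shape.
function fetch(url) {
  return Promise.resolve({
    json: function () {
      return Promise.resolve({
        joke: 'There are only 10 kinds of people: those who know binary and those who do not.'
      });
    }
  });
}

// The same flow works unchanged with the real node-fetch module.
let printed = fetch('https://example.com/api/random-joke') // hypothetical URL
  .then(function (res) { return res.json(); })
  .then(function (data) {
    console.log(data.joke); // prints the fetched joke
    return data.joke;
  })
  .catch(function (err) { console.log(err.message); });
```

With the real module, only the require('node-fetch') line and the URL would differ.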
Support matrix
Promise is supported in the Chrome, Firefox and Edge browsers. In non-browser environments, it is supported in the latest versions of Node and Babel. For detailed support, check the Kangax link in References. Many popular JavaScript libraries and frameworks, such as jQuery and Ember.js, have adopted Promise on its merits. Understanding Promise is essential for the async feature, which will come with future versions of ECMAScript.

References
[1] An excellent detailed article on Promise with code examples: http://www.codeproject.com/Articles/1079322/Learn-Howto-Make-ES-Promises-with-Executable-Exam
[2] An introduction to JavaScript Promise: http://dev.paperlesspost.com/introduction-javascript-promises/205
[3] Promise in JavaScript: https://www.youtube.com/watch?v=oa2clhsYIDY
[4] A detailed chapter on Promise in the Exploring ES6 book: http://exploringjs.com/es6/ch_promises.html
[5] Kangax support matrix: http://kangax.github.io/compat-table/es6/
By: Janardan Revuru
The author believes in 'JavaScript as a first language' for new programmers and encourages developers to learn pure JavaScript. He is a co-organiser of the 'JavaScript meetup – Bangalore' group at meetup.com: http://www.meetup.com/JavaScript-Meetup-Bangalore/
Developers Let’s Try
GNU Emacs: How to Work with HTML Mode, Indentation and Magit In this latest article in our well-documented series on GNU Emacs, the reader is taken on a journey through HTML mode, indentation in HTML code and Magit — the magic interface to Git, inside Emacs.
This article in the GNU Emacs series shows readers how to use HTML mode, indent code, and use the Magit interface.
HTML mode
You can use HTML mode to effectively edit HTML and CSS files using GNU Emacs. To start the mode, use M-x html-mode. You will see the string 'HTML' in the mode line.
Default template
A default HTML template can be inserted by opening a test.html file and using C-c C-t html. You will then be prompted with the string 'Title:' to input the title of the HTML page. After you type 'Hello World', the default template is written to the buffer.
Tags
You can enter HTML tags using C-c C-t. GNU Emacs will prompt you with the available list of tags; a screenshot of the available tags is shown in Figure 1. The anchor tag can be inserted using 'a'. You will then receive the prompt 'Attribute:', where you can provide the value 'href'. It will then prompt you for a value, and you can enter a URL, say, http://www.shakthimaan.com. The anchor tag is constructed in the buffer as you input values in the mini-buffer. You will be prompted for more attributes; if you want to finish, simply hit the Enter key and the anchor tag will be completed. The final output is shown below:

<a href="http://www.shakthimaan.com"></a>
You can insert an h2 tag by specifying the same after C-c C-t. You can also add attributes, as required. Otherwise, simply hitting the Enter key will complete the tag:

<h2>Hello World</h2>

The rendered text is as follows:

Hello World
You can insert images using the img tag. You can specify the src attribute and a value for the same. It is also a good practice to specify the alt attribute for the image tag. An example is shown below:

<img src="image.png" alt="An example image">
Unordered lists can be created using C-c C-t followed by ul. It will then prompt you for any attributes that you want included in the tag. You can hit the Enter key, which will prompt you with the string 'List item:' to key in list values. An example of the output is shown below:

<ul>
<li>One</li>
<li>Two</li>
<li>Three</li>
</ul>

You can neatly align the code by highlighting the above text and indenting the region using C-M-\. The resultant output is shown below:

<ul>
  <li>One</li>
  <li>Two</li>
  <li>Three</li>
</ul>
If you wish to comment out text, you can select the region and type M-;. The text is enclosed using '<!--' and '-->'. For example, the commented address tags in the above example look like what follows:

<!-- <address></address> -->
A number of major modes exist for different programming environments. You are encouraged to try them out and customise them to your needs.
Accents
In HTML mode, you can insert special characters, accents, symbols and punctuation marks. These characters are mapped to Emacs shortcuts, some of which are listed in the following table:
Indentation
Consider the following paragraph:

"When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things."

You can neatly fit the above text into 80 columns and 25 rows inside GNU Emacs using M-q. The result is shown below:

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
Figure 1: HTML tags
if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
You can also neatly indent regions using the C-M-\ shortcut. For example, look at the following HTML snippet:

Tamil Nadu
Chennai
Karnataka
Bengaluru
Punjab
Chandigarh

After indenting the region with C-M-\, the resultant output is shown below:

Tamil Nadu
  Chennai
Karnataka
  Bengaluru
Punjab
  Chandigarh

If you have a long line which you would like to split, you can use the C-M-o shortcut. Consider the quote: "When you're running a startup, your competitors decide how hard you work." (Paul Graham) If you keep the cursor after the comma and use C-M-o, the result is shown below:

"When you're running a startup,
 your competitors decide how hard you work." (Paul Graham)

Magit
Magit is a fantastic interface to Git inside GNU Emacs. There are many ways in which you can install Magit. To install it from the Melpa repository, add the following to your ~/.emacs:

(require 'package)
(add-to-list 'package-archives
             '("melpa" . "http://melpa.org/packages/") t)

When you do M-x list-packages, you will see 'magit' in the list. Press 'i' to mark Magit for installation, followed by 'x' to actually install it. This will install Magit in ~/.emacs.d/elpa. The version installed on my system is magit-20160303.502. When you open any file inside GNU Emacs that is version controlled using Git, you can start the Magit interface using M-x magit-status. I have bound this to the C-x g shortcut in ~/.emacs using the following:

(global-set-key (kbd "C-x g") 'magit-status)

The default Magit screenshot for the GNU Emacs project README file is shown in Figure 2.

Figure 2: Magit

Pressing 'l' followed by 'l' will produce the history log in the Magit buffer. A screenshot is provided in Figure 3.

Figure 3: History

You can make changes to the project sources and stage them to the index using the 's' shortcut, and unstage the changes using the 'u' shortcut. After making changes to a file, you need to use M-x magit-status to update the Magit buffer status. A sample screenshot of the modified files and staged changes is shown in Figure 4.

Figure 4: Staged changes

You can hit TAB and Shift-TAB to cycle through the different sections in the Magit buffer. To commit a message, press 'c' followed by 'c'. It will pop up a buffer where you can enter the commit message. You can create and check out branches using the 'b' shortcut. A screenshot of the Magit branch pop-up menu is shown in Figure 5.

Figure 5: Branch

All the basic Git commands are supported in Magit: diffing, tagging, resetting, stashing, push-pull, merging and rebasing. You can read the Magit manual (http://magit.vc/) to learn more.

By: Shakthi Kannan
The author is a Free Software enthusiast and blogs at shakthimaan.com.
OSFY Magazine Attractions During 2016-17

March 2016: Open Source Databases
April 2016: Backup and Data Storage
May 2016: Web Development
June 2016: Open Source Firewall and Network Security
July 2016: Mobile App Development
August 2016: Network Monitoring
September 2016: Open Source Programming Languages
October 2016: Cloud Special
November 2016: Open Source on Windows
December 2016: Machine Learning
January 2017: Virtualisation (Containers)
February 2017: Top 10 of Everything
Developers How To
Get Started with Developing MS Office Add-ins
You can spruce up Microsoft Office, and add custom commands and new features to Office programs that help increase your productivity by using add-ins. Further, you can develop your own custom add-ins to give your documents style and class. In this tutorial, you can learn how to build a word cloud generator add-in for Microsoft Word.
Add-ins have been an important part of the Microsoft Office suite, as they allow users and third-party developers to extend the capabilities and functionality of Office; and with more than 1.2 billion users worldwide (https://news.microsoft.com/bythenumbers/ms_numbers.pdf), there is quite a large market for apps on this platform. Previously, one had to use technologies like VBA, COM add-ins and VSTO to write add-ins for Office, but now all that you need is the Open Web Platform.
A high level overview of the platform
The Office Add-ins platform allows us to build apps using existing Web standards like HTML5, CSS, JavaScript and RESTful services that can interact with Office documents,
mails, calendars, meeting requests and appointments. Office currently has three types of add-ins that you can build:

Task pane add-ins: These appear at the side of the document and allow you to provide contextual and functional features in it.
Content add-ins: These appear in the body of the document.
Outlook add-ins: These appear next to an Outlook item when it is being viewed or edited. They require Exchange 2013 or Exchange Online to host the user's mailbox.
Anatomy of an Office add-in
An Office project consists of a manifest.xml file bundled with a Web application. The Manifest file can be submitted
Figure 1: Examples of task pane and content add-ins
to the Office app store or SharePoint app catalogue, and contains metadata about the application like the developer's name, target platform, etc. The Web application can be hosted on any Web server.
Prerequisites for developing add-ins for Office
The recommended IDE for add-in development is Visual Studio 2015. There is a free version for individual developers and open source contributors (Visual Studio Community Edition) that can be downloaded from https://www.visualstudio.com/en-us/products/visual-studio-community-vs.aspx. Download and install the Office Developer Tools extension for Visual Studio from https://www.visualstudio.com/en-us/features/office-tools-vs.aspx. Last of all, but most important, you will need Microsoft Office 2013, 2016 or Office 365. If you don't already have one of these, you can download a trial version from the Office website or sign up for the Office 365 Dev program (http://dev.office.com/devprogram).

Tip: Alternatively, the Napa development tools can also be used for Office development. Napa is a free, quick, in-browser tool for developing Office add-ins, and can be accessed at https://www.napacloudapp.com.
Building a word cloud generator add-in
In this section, we will build a simple word cloud generator task pane add-in for Microsoft Word, which allows users to select a bunch of words from which it generates a word cloud image, and then insert that word cloud into the document. For the actual word cloud generation, we will take the help of a JavaScript library called wordcloud2.js (https://github.com/timdream/wordcloud2.js). So, let's get started with the interesting parts.

1. Fire up Visual Studio and go to File > New > Project.
2. Although almost all of the development is done in HTML5, CSS and JavaScript, the template for the project is located under the Visual C# section. So navigate through the drop-down hierarchy to Visual C# > Office/SharePoint > Office Add-in.
3. Select Office Add-in Project, name it 'Word Cloud Generator', click OK, and in the next screen, select Task Pane Add-in as the type. Now, if you navigate to the Solution Explorer, you will see two projects: 'Word Cloud Generator' and 'Word Cloud Generator Web'. The former contains the Manifest.xml file and the latter contains all our HTML5, CSS and JavaScript files.
4. Navigate to the Word Cloud Generator project and open
Figure 2: Screenshot of our word cloud add-in
the Manifest file in the editor. Select the Activation tab, and in the Application section, choose Word as the only option, as we are targeting only Word. Now, in the Required API sets section, add the ImageCoercion API set; it is required to insert images into the document.
5. Now navigate to the Word Cloud Generator Web project. In this article, we will only work with the standard project structure, except for downloading the wordcloud2.js file from GitHub and adding it to the Scripts folder.
6. Open the Home.html file located in the Add-in > Home directory and remove all the contents from inside the <body> tag. Then add wordcloud2.js inside the <head> tag as follows:

<script src="../../Scripts/wordcloud2.js"></script>
7. After this, we will create a header in the page with the application name, a