Chapter 1
1.1 Introduction
In this chapter we will explore the evolution of Linux® and popular operating systems. We will also discuss the considerations for choosing an operating system.
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
1.2 Linux Evolution and Popular Operating Systems
The definition of the word Linux depends on the context in which it is used. Linux means the kernel of the system, which is the central controller of everything that happens on the computer (more on this later). People who say their computer “runs Linux” usually refer to the kernel and suite of tools that come with it (called the distribution). If you have “Linux experience”, you are most likely talking about the programs themselves, though depending on the context, you might be talking about knowing how to fine-tune the kernel. Each of these components will be investigated so that you understand exactly what role each plays.

Further complicating things is the term UNIX. UNIX was originally an operating system developed at AT&T Bell Labs in the 1970s. It was modified and forked (that is, people modified it and those modifications served as the basis for other systems) such that at the present time there are many different variants of UNIX. However, UNIX is now both a trademark and a specification, owned by an industry consortium called the Open Group. Only software that has been certified by the Open Group may call itself UNIX.
Despite adopting all the requirements of the UNIX specification, Linux has not been certified, so Linux really isn’t UNIX! It’s just… UNIX-like.
1.2.1 Role of the Kernel
The kernel of the operating system is like an air traffic controller at an airport. The kernel dictates which program gets which pieces of memory, it starts and kills programs, and it handles displaying text on a monitor. When an application needs to write to disk, it must ask the operating system to do it. If two applications ask for the same resource, the kernel decides who gets it, and in some cases, kills off one of the applications in order to save the rest of the system.

The kernel also handles switching of applications. A computer will have a small number of CPUs and a finite amount of memory. The kernel takes care of unloading one task and loading a new task if there are more tasks than CPUs. When the current task has run a sufficient amount of time, the kernel pauses the task so that another may run. This is called pre-emptive multitasking. Multitasking means that the computer is doing several tasks at once, and pre-emptive means that the kernel is deciding when to switch focus between tasks. With the tasks rapidly switching, it appears that the computer is doing many things at once.

Each application may think it has a large block of memory on the system, but it is the kernel that maintains this illusion, remapping smaller blocks of memory, sharing blocks of memory with other applications, or even swapping out to disk blocks that haven’t been touched.

When the computer starts up it loads a small piece of code called a boot loader. The boot loader’s job is to load the kernel and get it started. If you are more familiar with operating systems such as Microsoft Windows or Apple’s OS X, you probably never see the boot loader, but in the UNIX world it’s usually visible so that you can tweak the way your computer boots.
The boot loader loads the Linux kernel, and then transfers control. Linux then continues with running the programs necessary to make the computer useful, such as connecting to the network or starting a web server.
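To make the idea of the kernel keeping several tasks in progress more concrete, here is a minimal shell sketch. It assumes a POSIX-style shell and a `date` command that accepts `+%s`; the task names a, b, and c are made up for illustration. Because the tasks mostly sleep, the sketch shows tasks overlapping in time rather than the kernel slicing up a busy CPU.

```shell
# Record the start time in seconds.
start=$(date +%s)

# Launch three one-second tasks in the background.
# The kernel tracks each one and lets them all make progress.
for name in a b c; do
    ( sleep 1; echo "task $name finished" ) &
done

# Wait until every background task has ended.
wait

end=$(date +%s)
elapsed=$(( end - start ))
echo "three one-second tasks took about $elapsed second(s) in total"
```

If the tasks ran strictly one after another, the total would be at least three seconds; running them side by side should bring it close to one.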
1.2.2 Applications
Like an air traffic controller, the kernel is not useful without something to control. If the kernel is the tower, the applications are the airplanes. Applications make requests to the kernel and receive resources, such as memory, CPU, and disk, in return. The kernel also abstracts the complicated details away from the application. The application doesn’t know if a block of disk is on a solid-state drive from manufacturer A, a spinning metal hard drive from manufacturer B, or even a network file share. Applications just follow the kernel’s Application Programming Interface (API) and in return don’t have to worry about the implementation details.

When we, as users, think of applications, we tend to think of word processors, web browsers, and email clients. The kernel doesn’t care if it is running something that’s user facing, a network service that talks to a remote computer, or an internal task. So, from this we get an abstraction called a process. A process is just one task that is loaded and tracked by the kernel. An application may even need multiple processes to function, so the kernel takes care of running the processes, starting and stopping them as requested, and handing out system resources.
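The life of a process described above can be sketched from the shell. This is an illustrative example, assuming a POSIX-style shell; `sleep 30` simply stands in for any long-running application, and the variable names are made up.

```shell
# Start a long-running task in the background; the kernel tracks it as a process.
sleep 30 &
pid=$!                      # process ID the kernel assigned to the task
echo "started process $pid"

# 'kill -0' sends no signal; it only asks whether the process still exists.
state="not running"
if kill -0 "$pid" 2>/dev/null; then
    state="running"
fi
echo "process $pid is $state"

# Ask the kernel to stop the process, then collect its exit status.
kill "$pid"
status=0
wait "$pid" 2>/dev/null || status=$?
echo "process $pid ended with status $status"
```

A non-zero status here indicates that the process was stopped by a signal rather than finishing on its own.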
1.2.3 Role of Open Source
Linux started out in 1991 as a hobby project by Linus Torvalds. He made the source freely available and others joined in to shape this fledgling operating system. His was not the first system to be developed by a group, but since it was a built-from-scratch project, early adopters had the ability to influence the project’s direction and to make sure mistakes from other UNIXes were not repeated.

Software projects take the form of source code, which is a human-readable set of computer instructions. The source code may be written in any of hundreds of different languages; Linux just happens to be written in C, which is a language that shares history with the original UNIX.

Source code is not understood directly by the computer, so it must be compiled into machine instructions by a compiler. The compiler gathers all of the source files and generates something that can be run on the computer, such as the Linux kernel.

Historically, most software has been issued under a closed-source license, meaning that you get the right to use the machine code, but cannot see the source code. Often the license specifically says that you will not attempt to reverse engineer the machine code back to source code to figure out what it does!

Open source takes a source-centric view of software. The open source philosophy is that you have a right to obtain the software, and to modify it for your own use. Linux adopted this philosophy to great success. People took the source, made changes, and shared them back with the rest of the group.

Alongside this was the GNU project (GNU’s Not UNIX). While GNU was building their own operating system, they were far more effective at building the tools that go along with a UNIX operating system, such as the compilers and user interfaces. The source was all freely available, so Linux was able to target their tools and provide a complete system. As such, most of the tools that are part of the Linux system come from these GNU tools.

There are many different variants on open source, and those will be examined in a later chapter. All agree that you should have access to the source code, but they differ in how you can, or in some cases, must, redistribute changes.
1.2.4 Linux Distributions
Take Linux and the GNU tools, add some more user-facing applications like an email client, and you have a full Linux system. People started bundling all this software into a distribution almost as soon as Linux became usable. The distribution takes care of setting up the storage, installing the kernel, and installing the rest of the software. The full-featured distributions also include tools to manage the system and a package manager to help you add and remove software after the installation is complete.

Like UNIX, there are many different flavors of distributions. These days, there are distributions that focus on running servers, desktops, or even industry-specific tools like electronics design or statistical computing. The major players in the market can be traced back to either Red Hat or Debian. The most visible difference is the package manager, though you will find other differences on everything from file locations to political philosophies.

Red Hat started out as a simple distribution that introduced the Red Hat Package Manager (RPM). The developer eventually formed a company around it, which tried to commercialize a Linux desktop for business. Over time, Red Hat started to focus more on the server applications such as web and file serving, and released Red Hat Enterprise Linux, which was a paid service on a long release cycle. The release cycle dictates how often software is upgraded. A business may value stability and want long release cycles, while a hobbyist or a startup may want the latest software and opt for a shorter release cycle. To satisfy the latter group, Red Hat sponsors the Fedora Project, which makes a personal desktop comprising the latest software, but still built on the same foundations as the enterprise version.

Because everything in Red Hat Enterprise Linux is open source, a project called CentOS came to be, which recompiled all the RHEL packages and gave them away for free. CentOS and others like it (such as Scientific Linux) are largely compatible with RHEL and integrate some newer software, but do not offer the paid support that Red Hat does.
Scientific Linux is an example of a specific-use distribution based on Red Hat. The project is a Fermilab-sponsored distribution designed to enable scientific computing. Among its many applications, Scientific Linux is used with particle accelerators including the Large Hadron Collider at CERN.

openSUSE originally derived from Slackware, yet incorporates many aspects of Red Hat. The original company was purchased by Novell in 2003, which was then purchased by the Attachmate Group in 2011. The Attachmate Group then merged with Micro Focus International. Through all of the mergers and acquisitions, SUSE has managed to continue and grow. While openSUSE is desktop based and available to the general public, SUSE Linux Enterprise contains proprietary code and is sold as a server product.
Debian is more of a community effort, and as such, also promotes the use of open source software and adherence to standards. Debian came up with its own package management system based on the .deb file format. While Red Hat leaves support for platforms other than Intel and AMD to derivative projects, Debian supports many of these platforms directly.

Ubuntu is the most popular Debian-derived distribution. It is the creation of Canonical, a company that was made to further the growth of Ubuntu and make money by providing support.

Linux Mint was started as a fork of Ubuntu Linux, while still relying upon the Ubuntu repositories. There are various versions, all free of cost, but some include proprietary codecs, which cannot be distributed without license restrictions in certain countries. Linux Mint has become one of the most popular desktop Linux distributions and rivals Ubuntu in that role.

We have discussed the distributions specifically mentioned in the Linux Essentials objectives. You should be aware that there are hundreds, if not thousands, more available. It is important to understand that while there are many different distributions of Linux, many of the programs and commands remain the same or are very similar.
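Since the package manager is the most visible difference between the Red Hat and Debian families, a rough way to tell which family a machine belongs to is to check which package tool is installed. The sketch below is only a heuristic: `detect_pkg_family` is a made-up helper name, and a system could have both tools or neither.

```shell
# Report which packaging family this system appears to belong to.
# detect_pkg_family is a hypothetical helper, not a standard command.
detect_pkg_family() {
    if command -v dpkg >/dev/null 2>&1; then
        echo "deb"        # Debian and its derivatives use .deb packages
    elif command -v rpm >/dev/null 2>&1; then
        echo "rpm"        # Red Hat and related distributions use RPM packages
    else
        echo "unknown"    # neither tool was found in PATH
    fi
}

family=$(detect_pkg_family)
echo "package family: $family"
```

Checking for the tool with `command -v` avoids running the package manager itself, so the sketch is safe to try on any system.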
1.2.4.1 What is a Command?
The simplest answer to the question, "What is a command?", is that a command is a software program that, when executed on the command line, performs an action on the computer.

When you consider a command using this definition, you are really considering what happens when you execute a command. When you type in a command, a process is run by the operating system that can read input, manipulate data and produce output. From this perspective, a command runs a process on the operating system, which then causes the computer to perform a job.

However, there is another way of looking at what a command is: look at its source. The source is where the command "comes from" and there are several different sources of commands within the shell of your CLI:
Commands built-in to the shell itself: A good example is the cd command as it is part of the bash shell. When a user types the cd command, the bash shell is already executing and knows how to interpret that command, requiring no additional programs to be started.

Commands that are stored in files that are searched by the shell: If you type an ls command, then the shell searches through the directories that are listed in the PATH variable to try to find a file named ls that it can execute. These commands can also be executed by typing the complete path to the command.

Aliases: An alias can override a built-in command, function, or a command that is found in a file. Aliases can be useful for creating new commands built from existing functions and commands.

Functions: Functions can also be built using existing commands to either create new commands, override commands built-in to the shell or commands stored in files.

Aliases and functions are normally loaded from the initialization files when the shell first starts; these files are discussed later in this section.
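The sources of commands listed above can be explored with the shell's own `type` command. The following is a small sketch, assuming a bash-like shell; `greet` is a made-up function name used only for illustration, and the exact wording that `type` prints differs between shells.

```shell
# 'type' reports how the shell would interpret a name.
type cd        # a command built in to the shell
type ls        # a command found by searching the directories in PATH

# A function defined in the current shell (greet is a hypothetical name).
greet() {
    echo "Hello, $1"
}
type greet     # reported as a function

# PATH is a colon-separated list of directories that the shell searches in order.
echo "$PATH"

# Typing the complete path to a command skips the PATH search.
ls_path=$(command -v ls)
"$ls_path" / > /dev/null && echo "ran $ls_path directly"
```

Aliases are left out of this sketch because many shells only expand them in interactive sessions; in an interactive bash session, `type` reports an alias in the same way.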
Consider This
While aliases will be covered in detail in a later section, this brief example may be helpful in understanding the concept of commands. An alias is essentially a nickname for another command or series of commands. For example, the cal 2014 command will display the calendar for the year 2014. Suppose you end up running this command often. Instead of executing the full command each time, you can create an alias called mycal and run the alias, as demonstrated in the following example:

sysadmin@localhost:~$ alias mycal="cal 2014"
sysadmin@localhost:~$ mycal
                               2014

       January                February                 March
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
          1  2  3  4                      1                      1
 5  6  7  8  9 10 11    2  3  4  5  6  7  8    2  3  4  5  6  7  8
12 13 14 15 16 17 18    9 10 11 12 13 14 15    9 10 11 12 13 14 15
19 20 21 22 23 24 25   16 17 18 19 20 21 22   16 17 18 19 20 21 22
26 27 28 29 30 31      23 24 25 26 27 28      23 24 25 26 27 28 29
                                              30 31
        April                   May                    June
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
       1  2  3  4  5                1  2  3    1  2  3  4  5  6  7
 6  7  8  9 10 11 12    4  5  6  7  8  9 10    8  9 10 11 12 13 14
13 14 15 16 17 18 19   11 12 13 14 15 16 17   15 16 17 18 19 20 21
20 21 22 23 24 25 26   18 19 20 21 22 23 24   22 23 24 25 26 27 28
27 28 29 30            25 26 27 28 29 30 31   29 30
        July                  August                September
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
       1  2  3  4  5                   1  2       1  2  3  4  5  6
1.2.5 Hardware Platforms
Linux started out as something that would only run on a computer like Linus’: a 386 with a specific hard drive controller. The range of support grew as people built support for other hardware. Eventually, Linux started supporting other chips, including hardware that was made to run competing operating systems!

The types of hardware grew from the humble Intel chip up to supercomputers. Later, smaller chips that support Linux were developed to fit in consumer devices, called embedded devices. Support for Linux became so ubiquitous that it is often easier to build hardware to support Linux and then use Linux as a springboard for your custom software than it is to build the custom hardware and software from scratch.

Eventually, cellular phones and tablets started running Linux. A company, later bought by Google, came up with the Android platform, which is a bundle of Linux and the software necessary to run a phone or tablet. This means that the effort to get a phone to market is significantly less, and companies can spend their time innovating on the user-facing software rather than reinventing the wheel each time. Android is now one of the market leaders in the space.

Aside from phones and tablets, Linux can be found in many consumer devices. Wireless routers often run Linux because it has a rich set of network features. The TiVo is a consumer digital video recorder built on Linux. Even though these devices have Linux at the core, the end users don’t have to know. The custom software interacts with the user and Linux provides the stable platform.
1.3 Choosing an Operating System
You have learned that Linux is a UNIX-like operating system, which means that it has not undergone formal certification and therefore can’t use the official UNIX trademark. There are many other alternatives; some are UNIX-like and some are certified as UNIX. There are also non-UNIX operating systems such as Microsoft Windows.

The most important question to ask when determining the configuration of a machine is “what will this machine do?” If you need to run specialized software that only runs on Oracle Solaris, then that’s what you’ll need. If you need to be able to read and write Microsoft Office documents, then you’ll either need Windows or something capable of running LibreOffice or OpenOffice.
1.3.1 Decision Points
The first thing you need to decide is the machine’s role. Will you be sitting at the console running productivity applications or web browsing? If so, you have a desktop. Will the machine be used as a Web server or otherwise sitting in a server rack somewhere? You’re looking at a server.

Servers usually sit in a rack and share a keyboard and monitor with many other computers, since console access is only used to set up and troubleshoot the server. The server will run in non-graphical mode, which frees up resources for the real purpose of the computer. A desktop will primarily run a GUI.
Next, determine the functions of the machine. Is there specific software it needs to run, or specific functions it needs to do? Do you need to be able to manage hundreds or thousands of these machines at the same time? What is the skill set of the team managing the computer and software?

You must also determine the lifetime and risk tolerance of the server. Operating systems and software upgrades come on a periodic basis, called the release cycle. Software vendors will only support older versions of software for a certain period of time before not offering any updates, which is called the maintenance cycle (or life cycle). For example, major Fedora Linux releases come out approximately every 6 months. Versions are considered End of Life (EOL) after 2 major versions plus one month, so you have between 7 and 13 months after installing Fedora before you need to upgrade. Contrast this with the commercial server variant, Red Hat Enterprise Linux, and you can go up to 13 years before needing to upgrade.

The maintenance and release cycles are important because in an enterprise server environment it is time consuming, and therefore rare, to do a major upgrade on a server. Instead, the server itself is replaced when there are major upgrades or replacements to the application that necessitate an operating system upgrade. Similarly, a slow release cycle is important because applications often target the current version of the operating system and you want to avoid the overhead of upgrading servers and operating systems constantly to keep up. There is a fair amount of work involved in upgrading a server, and the server role often has many customizations made that are difficult to port to a new server. This necessitates much more testing than if only the application were upgraded.

If you are doing software development or traditional desktop work, you often want the latest software. Newer software has improvements in both functionality and appearance, which contributes to more enjoyment from the use of the computer. A desktop often stores its work on a remote server, so the desktop can be wiped clean and the newer operating system put on with little interruption.

Individual software releases can be characterized as beta or stable. One of the great things about being an open source developer is that you can release your new software and quickly get feedback from users. If a software release is in a state where it has many new features that have not been rigorously tested, it is typically referred to as beta. After those features have been tested in the field, the software moves to a stable point. If you need the latest features, then you are looking for a distribution that has a quick release cycle and makes it easy to use beta software. On the server side, you want stable software unless those new features are necessary and you don’t mind running code that has not been thoroughly tested.

Another loosely related concept is backward compatibility. This refers to the ability of a later operating system to be compatible with software made for earlier versions. This is usually a concern if you need to upgrade your operating system, but aren’t in a position to upgrade your application software.

Of course, cost is always a factor. Linux itself might be free, but you may need to pay for support, depending on which options you choose. Microsoft has server license costs and may have additional support costs over the lifetime of the server. Your chosen operating system might only run on a particular selection of hardware, which further affects the cost.
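The Fedora figures quoted earlier in this section (between 7 and 13 months before an upgrade is needed) follow from simple arithmetic, which can be sketched in the shell. The variable names are made up, and the 6-month interval is the approximate value given in the text rather than an exact schedule.

```shell
release_interval=6   # approximate months between major releases (from the text)
grace=1              # extra month after the second following release

# A version reaches End of Life one month after the release two versions later.
supported_months=$(( 2 * release_interval + grace ))

# Best case: install on release day.
best_case=$supported_months
# Worst case: install the current version just before the next release ships.
worst_case=$(( supported_months - release_interval ))

echo "total support window: $supported_months months"
echo "time before a required upgrade: $worst_case to $best_case months"
```

Changing `release_interval` shows how strongly a faster or slower release cycle affects the time available before an upgrade.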
1.3.2 Microsoft Windows
The Microsoft world splits the operating systems according to the machine’s purpose: desktop or server. The Windows desktop edition has undergone various naming schemes, with the current version (as of this writing) being simply Windows 8. New versions of the desktop come out every 3-5 years and tend to be supported for many years. Backward compatibility is also a priority for Microsoft, even going so far as to bundle virtual machine technology so that users can run older software.

In the server realm, there is Windows Server, currently (at this writing) at version 2012 to denote the release date. The server runs a GUI but, largely as a competitive response to Linux, Microsoft has made amazing strides in command line scripting abilities through PowerShell. You can also make the server look like a desktop with the optional Desktop Experience package.
1.3.3 Apple OS X
Apple makes the OS X operating system, which has undergone UNIX certification. OS X is partially based on software from the FreeBSD project.

At the moment, OS X is primarily a desktop operating system, but there are optional packages that help with management of network services that allow many OS X desktops to collaborate, such as to share files or have a network login.

OS X on the desktop is usually a personal decision, as many find the system easier to use. The growing popularity of OS X has ensured healthy support from software vendors. OS X is also quite popular in the creative industries such as video production. This is one area where the applications drive the operating system decision, and therefore the hardware choice, since OS X runs on Apple hardware.
1.3.4 BSD
There are several open source BSD (Berkeley Software Distribution) projects, such as OpenBSD, FreeBSD, and NetBSD. These are alternatives to Linux in many respects as they use a large amount of common software. BSDs are typically implemented in the server role, though they can also run desktop environments such as GNOME and KDE for desktop roles.
1.3.5 Other Commercial UNIXes
Some of the more popular commercial UNIXes are:

Oracle Solaris
IBM AIX
HP-UX
Each of these runs on hardware from their respective creators. The hardware is usually large and powerful, offering such features as hot-swap CPU and memory, or integration with legacy mainframe systems also offered by the vendor. Unless the software requires the specific hardware or the needs of the application require some of the redundancy built into the hardware, most people tend to choose these options because they are already users of the company's products. For example, IBM AIX runs on a wide variety of IBM hardware and can share hardware with mainframes. Thus, you find AIX in companies that already have a large IBM footprint, or that make use of IBM software like WebSphere.
1.3.6 Linux
One aspect where Linux is much different than the alternatives is that after an administrator has chosen Linux they still have to choose a distribution. Recall from the earlier discussion of distributions that the distribution packages the Linux kernel, utilities, and management tools into an installable package and provides a way to install and update packages after the initial installation.

Some operating systems are available through only one vendor, such as OS X and Windows, with system support provided through the vendor. With Linux, there are multiple options, from commercial offerings for the server or desktop, to custom distributions made to turn an old computer into a network firewall.

Often application vendors will choose a subset of distributions to support. Different distributions have different versions of key libraries and it is difficult for a company to support all these different versions.

Governments and large enterprises may also limit their choices to distributions that offer commercial support. This is common in larger companies where paying for another tier of support is better than risking extensive outages.

Various distributions also have release cycles, sometimes as often as every six months. While upgrades are not required, each version can only be supported for a reasonable length of time. Therefore, some Linux releases are considered to have long term support (LTS) of 5 years or more while others will only be supported for two years or less.

Some distributions differentiate between stable, testing, and unstable releases. The difference is that unstable releases trade reliability for features. When features have been integrated into the system for a long time, and many of the bugs and issues have been addressed, the software moves through testing into the stable release. The Debian distribution warns users about the pitfalls of using the “sid” release with the following warning:
‘"sid" is subject to massive changes and in-place library updates. This can result in a very "unstable" system which contains packages that cannot be installed due to missing libraries, dependencies that cannot be fulfilled etc. Use it at your own risk!’
Other releases depend on Beta distributions. For instance, the Fedora distribution releases Beta or pre-releases of its software ahead of the full release to minimize bugs. Fedora is often considered the community-oriented Beta release of Red Hat. Features are added and changed in the Fedora release before finding their way into the enterprise-ready Red Hat distribution.
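Because choosing Linux also means choosing a distribution and a release, it helps to be able to check which one a machine is running. The sketch below assumes the system provides an /etc/os-release file, which many current distributions do but which is not guaranteed; `show_distro` is a made-up helper name.

```shell
# Print the distribution name and version recorded in an os-release style file.
# show_distro is a hypothetical helper; the default path is an assumption.
show_distro() {
    file=${1:-/etc/os-release}
    if [ -r "$file" ]; then
        # The file holds shell-style KEY=value lines. Read it in a subshell
        # so that its variables do not leak into the current shell.
        ( . "$file" && echo "${NAME:-unknown} ${VERSION_ID:-}" )
    else
        echo "unknown"
    fi
}

show_distro
```

The optional argument lets the same function read a copy of the file from another machine, for example `show_distro ./os-release`.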
1.3.7 Android
Android, sponsored by Google, is the world’s most popular Linux distribution. It is fundamentally different from its counterparts. Linux is a kernel, and many of the commands that will be covered in this course are actually part of the GNU (GNU's Not Unix) package. That is why some people insist on using the term GNU/Linux instead of Linux alone.

Android uses the Dalvik virtual machine with Linux, providing a robust platform for mobile devices such as phones and tablets. However, lacking the traditional packages that are often distributed with Linux (such as GNU and Xorg), Android is generally incompatible with desktop Linux distributions.

This incompatibility means that a Red Hat or Ubuntu user cannot download software from the Google Play store. Likewise, a terminal emulator in Android lacks many of the commands of its Linux counterparts. It is possible, however, to use BusyBox with Android to enable most commands to work.
Chapter 2
2.1 Introduction
In this chapter we will become familiar with several open source applications and tools. We will also discuss open source software and licensing.
2.2 Major Open Source Applications
The Linux kernel can run a wide variety of software across many hardware platforms. A computer can act as a server, which means it primarily handles data on others’ behalf, or can act as a desktop, which means a user will be interacting with it directly. The machine can run software or it can be used as a development machine in the process of creating software. You can even run multiple roles as there is no distinction to Linux about the role of the machine; it’s merely a matter of configuring which applications run.
One advantage of this is that you can simulate almost all aspects of a production environment, from development, to testing, to verification on scaled down hardware, which saves costs and time. As someone learning Linux, you can run the same server applications on your desktop or inexpensive virtual server that are run on a large Internet Service Provider. Of course, you will not be able to handle the volume a large provider would, as they will have much more expensive hardware. But you can simulate almost any configuration without needing powerful hardware or server licensing.
Linux software generally falls into one of three categories:
Server software – software that has no direct interaction with the monitor and keyboard of the machine it runs on. Its purpose is to serve information to other computers, called clients. Sometimes server software may not talk to other computers but will just sit there and "crunch" data.
Desktop software – a web browser, text editor, music player, or other software that you interact with. In many cases, such as a web browser, the software is talking to a server on the other end and interpreting the data for you. Here, the desktop software is the client.
Tools – a loose category of software that exists to make it easier to manage your system. You might have a tool that helps you configure your display, or something that provides a Linux shell, or even more sophisticated tools that convert source code to something that the computer can execute.
Additionally, we will consider mobile applications, mostly for the benefit of the LPI exam. A mobile application is a lot like a desktop application but it runs on a phone or tablet instead of a desktop computer.
Any task you want to do in Linux can likely be accommodated by any number of applications. There are many web browsers, many web servers, and many text editors (the benefits of each are the subject of many UNIX holy wars). This is no different than the closed source world. However, a benefit of open source is that if someone doesn’t like the way their web server works, they can start building their own. One thing you will learn as you progress with Linux is how to evaluate software. Sometimes you’ll go with the leader of the pack, sometimes you’ll want to look at the bleeding edge.
2.2.1 Server Applications
Linux excels at running server applications because of its reliability and efficiency. When considering server software, the most important question is “what service am I running?” If you want to serve web pages, you will need web server software, not a mail server!
One of the early uses of Linux was for web servers. A web server hosts content for web pages, which are viewed by a web browser using the Hypertext Transfer Protocol (HTTP) or its encrypted flavor, HTTPS. The web page itself can be static, which means that when the web browser requests the page the web server just sends the file as it appears on disk. The server can also serve dynamic content, meaning that the request is sent by the web server to an application, which generates the content. WordPress is one popular example. Users can develop content through their browser in the WordPress application and the software turns it into a fully functional website. Each time you do online shopping, you are looking at a dynamic site.
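The difference between static and dynamic content can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the file name index.html, and the /cart path are hypothetical, no real web server such as Apache is involved, and the static function does no checking of the requested path.

```python
from datetime import datetime, timezone
from pathlib import Path
import tempfile


def serve_static(document_root: Path, url_path: str) -> bytes:
    """Static content: return the file exactly as it is stored on disk."""
    return (document_root / url_path.lstrip("/")).read_bytes()


def serve_dynamic(url_path: str, now: datetime) -> bytes:
    """Dynamic content: an application builds the response for each request."""
    body = f"<html><body>You asked for {url_path} at {now.isoformat()}</body></html>"
    return body.encode("utf-8")


# Demonstration with a throwaway document root (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_bytes(b"<html><body>Hello</body></html>")
    static_reply = serve_static(root, "/index.html")

# The dynamic reply depends on the request and on when it was made.
dynamic_reply = serve_dynamic("/cart", datetime(2024, 1, 1, tzinfo=timezone.utc))

print(static_reply)
print(dynamic_reply)
```

The static reply is always the bytes on disk, while the dynamic reply is assembled by code every time, which is what an application like WordPress does on a much larger scale.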
Apache is the dominant web server in use today. Apache was originally a standalone project but the group has since formed the Apache Software Foundation and maintains over a hundred open source software projects. Another web server is nginx, which is based out of Russia. It focuses on performance by making use of more modern UNIX kernels and only does a subset of what Apache can do. Over 65% of websites are powered by either nginx or Apache.
Email has always been a popular use for Linux servers. When discussing email servers it is always helpful to look at the three different roles required to get email between people:
Mail Transfer Agent (MTA) – figures out which server needs to receive the email and uses the Simple Mail Transfer Protocol (SMTP) to move the email to that server. It is not unusual for an email to take several “hops” to get to its final destination, since an organization might have several MTAs.
Mail Delivery Agent (MDA, also called the Local Delivery Agent) – takes care of storing the email in the user’s mailbox. Usually invoked from the final MTA in the chain.
POP/IMAP server – The Post Office Protocol and Internet Message Access Protocol are two communication protocols that let an email client running on your computer talk to a remote server to pick up the email.
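To make the MDA role above a little more concrete, here is a minimal Python sketch that stores a message in a mailbox file using the standard library mailbox module. The addresses, the file name alice.mbox, and the helper names are hypothetical, and the choice of the mbox format is an assumption made for the example. A real MDA would also lock the mailbox and handle errors.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Create the kind of message an MTA would hand over for delivery."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def deliver_locally(mbox_path: Path, msg: EmailMessage) -> None:
    """MDA role: append the message to the user's mailbox file."""
    box = mailbox.mbox(str(mbox_path))
    try:
        box.add(msg)
        box.flush()
    finally:
        box.close()


def list_subjects(mbox_path: Path) -> list:
    """Roughly what a mail client asks for: the subjects waiting in the mailbox."""
    box = mailbox.mbox(str(mbox_path))
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()


with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp) / "alice.mbox"  # hypothetical mailbox file
    deliver_locally(
        inbox,
        build_message("bob@example.com", "alice@example.com", "Lunch?", "Noon works."),
    )
    subjects = list_subjects(inbox)

print(subjects)
```

In a real deployment the reading step would happen over POP or IMAP rather than by opening the file directly.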
Sometimes a piece of software will implement multiple components. In the closed source world, Microsoft Exchange implements all the components, so there is no option to make individual selections. In the open source world there are many options. Some POP/IMAP servers implement their own mail database format for performance, so will also include the MDA if the custom database is desired. People using standard file formats (such as all the emails in one text file) can choose any MDA.
The most well known MTA is sendmail. Postfix is another popular one and aims to be simpler and more secure than sendmail. If you’re using standard file formats for storing emails, your MTA can also deliver mail. Alternatively, you can use something like procmail, which lets you define custom filters to process mail as it is delivered.
Dovecot is a popular POP/IMAP server owing to its ease of use and low maintenance. Cyrus IMAP is another option.
For file sharing, Samba is the clear winner. Samba allows a Linux machine to look like a Windows machine so that it can share files and participate in a Windows domain. Samba implements the server components, such as making files available for sharing and certain Windows server roles, and also the client end so that a Linux machine may consume a Windows file share. If you have Apple machines on your network, the Netatalk project lets your Linux machine behave as an Apple file server.
The native file sharing protocol for UNIX is called the Network File System (NFS). NFS is usually part of the kernel, which means that a remote file system can be mounted just like a regular disk, making file access transparent to other applications.
As your computer network gets larger, you will need to implement some kind of directory. The oldest directory is called the Domain Name System (DNS) and is used to convert a name like http://www.linux.com to an IP address like 192.168.100.100, which is a unique identifier of that computer on the Internet.
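The name-to-address conversion described above can be tried from Python with the standard socket module. This is a small sketch, not a DNS client: it asks the system resolver, which may answer from a local hosts file instead of a DNS server, and the helper name resolve is hypothetical. The name localhost is used so the example does not depend on Internet access.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the distinct IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the address as text.
    return sorted({info[4][0] for info in infos})


addresses = resolve("localhost")
print(addresses)
```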
DNS also holds global information such as the address of the MTA for a given domain name. An organization may want to run their own DNS server to host their public facing names, and also to serve as an internal directory of services. The Internet Systems Consortium (ISC) maintains the most popular DNS server, called BIND; the process that runs the service is called named.
The DNS is largely focused on computer names and IP addresses and is not easily searchable. Other directories have sprung up to store other information such as user accounts and security roles. The Lightweight Directory Access Protocol (LDAP) is the most common directory protocol, and it also underlies Microsoft’s Active Directory. In LDAP, an object is stored in a tree, and the position of that object on the tree can be used to derive information about the object in addition to what’s stored with the object itself. For example, a Linux administrator may be stored in a branch of the tree called “IT department”, which is under a branch called “Operations”. Thus one can find all the technical staff by searching under the IT department branch. OpenLDAP is the dominant player here.
One final piece of network infrastructure is called the Dynamic Host Configuration Protocol (DHCP). When a computer boots up, it needs an IP address for the local network so it can be uniquely identified. DHCP’s job is to listen for requests and to assign a free address from the DHCP pool. The ISC also maintains the ISC DHCP server, which is the most common player here.
A database stores information and also allows for easy retrieval and querying. The most popular databases here are MySQL and PostgreSQL. You might enter raw sales figures into the database and then use a language called Structured Query Language (SQL) to aggregate sales by product and date in order to produce a report.
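The sales report described above might look like the following sketch. It uses SQLite through the Python standard library purely because it needs no server; that is a substitution for the MySQL and PostgreSQL servers named in the text. The table name, column names, and figures are invented for the example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sold_on TEXT, product TEXT, amount REAL)")

# Raw sales figures (made-up data).
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-01", "widget", 10.0),
        ("2024-01-01", "widget", 5.0),
        ("2024-01-01", "gadget", 7.5),
        ("2024-01-02", "widget", 2.5),
    ],
)

# Aggregate sales by date and product to produce the report.
report = conn.execute(
    "SELECT sold_on, product, SUM(amount) FROM sales "
    "GROUP BY sold_on, product ORDER BY sold_on, product"
).fetchall()
conn.close()

for row in report:
    print(row)
```

The GROUP BY clause is what turns four raw rows into one total per date and product.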
2.2.2 Desktop Applications
The Linux ecosystem has a wide variety of desktop applications. You can find games, productivity applications, creative tools, and more. This section is a mere survey of what’s out there, focusing on what the LPI deems most important.
Before looking at individual applications, it is helpful to look at the desktop environment. A Linux desktop runs a system called X Window, also known as X11. The Linux X11 server is X.org, which provides a way for software to operate in a graphical mode and accept input from a keyboard and a mouse. Windows and icons are handled by another piece of software called the window manager or desktop environment. A window manager is a simpler version of a desktop environment as it only provides the code to draw menus and manage the application windows on the screen. A desktop environment layers in features like login windows, sessions, a file manager, and other utilities. In summary, a text-only Linux workstation becomes a graphical desktop with the addition of X Window and either a desktop environment or a window manager.
Window managers include Compiz, FVWM, and Enlightenment, though there are many more. Desktop environments are primarily KDE and GNOME, both of which have their own window managers. Both KDE and GNOME are mature projects with an incredible number of utilities built against them, and the choice is often a matter of personal preference.
The basic productivity applications, such as a word processor, spreadsheet, and presentation package, are very important. Collectively they’re known as an office suite, largely due to Microsoft Office, which is the dominant player in the market. OpenOffice (sometimes called OpenOffice.org) and LibreOffice each offer a full office suite, including a drawing tool, that strives for compatibility with Microsoft Office both in terms of features and file formats. These two projects are also a great example of how politics influence open source.
In 1999 Sun Microsystems acquired a relatively obscure German company that was making an office suite for Linux called StarOffice. Soon after that, Sun rebranded it as OpenOffice and released it under an open source license. To further complicate things, StarOffice remained a proprietary product that drew from OpenOffice. In 2010 Sun was acquired by Oracle, who later turned the project over to the Apache Foundation. Oracle has had a poor history of supporting open source projects that it acquires, so shortly after the acquisition by Oracle the project was forked to become LibreOffice. At that point there were two groups of people developing the same piece of software. Most of the momentum went to the LibreOffice project, which is why it is included by default in many Linux distributions.
For browsing the web, the two main contenders are Firefox and Google Chrome. Both are open source web browsers (Chrome is built on the open source Chromium project) that are fast, feature rich, and have excellent support for web developers. These two packages are a good example of how diversity is good for open source: improvements to one spur the other team to try to best them. As a result, the Internet has two excellent browsers that push the limits of what can be done on the web and work across a variety of platforms.
The Mozilla project has also come out with Thunderbird, a full featured desktop email client. Thunderbird connects to a POP or IMAP server, displays email locally, and sends email through an external SMTP server. Other notable email clients are Evolution and KMail, which are the GNOME and KDE projects’ email clients. Standardization through POP and IMAP and local email formats means that it’s easy to switch between email clients without losing data. Web based email is another option.
For the creative types, there are Blender, GIMP, and Audacity, which handle 3D movie creation, 2D image manipulation, and audio editing respectively. They have had various degrees of success in professional markets. Blender, for example, is used for everything from independent films to Hollywood movies.
2.2.3 Console Tools
The history of the development of UNIX shows considerable overlap between the skills of software development and systems administration. The tools that let you manage the system have features of computer languages such as loops, and some computer languages are used extensively in automating systems administration tasks. Thus, one should consider these skills complementary.
At the basic level, you interact with a Linux system through a shell no matter if you are connecting to the system remotely or from an attached keyboard. The shell’s job is to accept commands, such as file manipulations and starting applications, and to pass those to the Linux kernel for execution. Here, we show a typical interaction with the Linux shell:
sysadmin@localhost:~$ ls -l /tmp/*.gz
-rw-r--r-- 1 sean root 246841 Mar 5 2013 /tmp/fdboot.img.gz
sysadmin@localhost:~$ rm /tmp/fdboot.img.gz
The user is given a prompt, which typically ends in a dollar sign $ to indicate an unprivileged account. Anything before the dollar sign, in this case sysadmin@localhost:~, is configurable text that provides extra information to the user. In the example above, sysadmin is the name of the current user, localhost is the name of the server, and ~ is the current directory (in UNIX, the tilde symbol is a short form for the user’s home directory). We will look at Linux commands in more detail in further chapters, but to finish the explanation, the user first lists files with the ls command, receives some information about the matching file, and then removes that file with the rm command.
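For readers who already know some Python, the two shell commands above have rough equivalents in the standard library. The sketch below works in a temporary directory with a hypothetical file name rather than touching /tmp, and it reports only the size and path instead of the full ls -l listing.

```python
import glob
import os
import tempfile


def list_matching(pattern: str) -> list:
    """Rough analogue of `ls -l PATTERN`: return (size_in_bytes, path) pairs."""
    return [(os.stat(path).st_size, path) for path in sorted(glob.glob(pattern))]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.img.gz")  # hypothetical file name
    with open(target, "wb") as handle:
        handle.write(b"\x00" * 16)

    before = list_matching(os.path.join(tmp, "*.gz"))  # like ls -l .../*.gz
    os.remove(target)                                  # like rm
    after = list_matching(os.path.join(tmp, "*.gz"))

print(before)
print(after)
```

In both cases the program, whether the shell or Python, ends up asking the kernel to do the actual work on the file system.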
The Linux shell provides a rich language for iterating over files and customizing the environment, all without leaving the shell. For example, it is possible to write a single command line that finds files with contents matching a certain pattern, extracts useful information from the file, then copies the new information to a new file.
Linux offers a variety of shells to choose from, mostly differing in how and what can be customized, and the syntax of the built-in scripting language. The two main families are the Bourne shell and the C shell. The Bourne shell was named after the creator and the C shell was named because the syntax borrows heavily from the C language. As both these shells were invented in the 1970s there are more modern versions, the Bourne Again Shell (Bash) and the tcsh (tee-cee-shell). Bash is the default shell on most systems, though you can almost be certain that tcsh is available if that is your preference.
Other shells, such as the Korn shell (ksh) and zsh, combine favorite features from both families. The choice of shells is mostly a personal one. If you can become comfortable with Bash then you can operate effectively on most Linux systems. After that you can branch out and try new shells to see if they help your productivity.
Even more divisive than the selection of shells is the choice of text editors. A text editor is used at the console to edit configuration files. The two main camps are vi (or the more modern vim) and emacs. Both are remarkably powerful tools to edit text files; they differ in the format of the commands and how you write plugins for them. Plugins could be anything from syntax highlighting of software projects to integrated calendars.
Both vim and emacs are complex and have a steep learning curve. This is not helpful if all you need is simple editing of a small text file.
Therefore pico and nano are available on most systems (the latter being a derivative of the former) and provide very basic text editing. Even if you choose not to use vi you should strive to gain some basic familiarity because the basic vi is on every Linux system. If you are restoring a broken Linux system by running in the distribution’s recovery mode you are certain to have vi available.
If you have a Linux system you will need to add, remove, and update software. At one point this meant downloading the source code, setting it up, building it, and copying files on each system. Thankfully, distributions created packages, which are compressed copies of the application. A package manager takes care of keeping track of which files belong to which package and even downloading updates from a remote server called a repository. On Debian systems the tools include dpkg, apt-get, and apt-cache. On Red Hat derived systems, you use rpm and yum. We will look more at packages later.
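The bookkeeping a package manager does, keeping track of which files belong to which package, can be pictured with a toy Python class. This is a sketch of the idea only: the class, its methods, and the package named hello are hypothetical and do not reflect how dpkg or rpm store their data.

```python
class PackageDatabase:
    """Toy record of which installed files belong to which package."""

    def __init__(self):
        self._files_by_package = {}

    def install(self, package, files):
        """Remember the files that this package placed on the system."""
        self._files_by_package[package] = set(files)

    def owner_of(self, path):
        """Return the package that owns path, or None if no package does."""
        for package, files in self._files_by_package.items():
            if path in files:
                return package
        return None

    def remove(self, package):
        """Forget the package and return the files that should be deleted."""
        return sorted(self._files_by_package.pop(package, set()))


db = PackageDatabase()
db.install("hello", ["/usr/bin/hello", "/usr/share/doc/hello/README"])

owner = db.owner_of("/usr/bin/hello")
removed = db.remove("hello")

print(owner)
print(removed)
```

Because every file is recorded against a package, removing the package later can clean up exactly the files it installed.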
2.2.4 Development Tools
It should come as no surprise that, as software built on contributions from programmers, Linux has excellent support for software development. The shells are built to be programmable and there are powerful editors included on every system. There are also many development tools available, and many modern languages treat Linux as a first class citizen.
Computer languages provide a way for a programmer to enter instructions in a more human readable format, and for those instructions to eventually become translated into something the computer understands. Languages fall into one of two camps: interpreted or compiled. An interpreted language translates the written code into computer code as the program runs, and a compiled language is translated all at once.
Linux itself was written in a compiled language called C. C’s main benefit is that the language itself maps closely to the generated machine code so that a skilled programmer can write code that is small and efficient. When computer memory was measured in kilobytes, this was very important. Even with large memory sizes today, C is still helpful for writing code that must run fast, such as an operating system.
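The idea of translating code while the program runs can be observed in Python, which is one interpreted language. The sketch below takes a line of source text, translates it, and runs it in separate steps. The variable names and values are made up, and the detail that this Python implementation translates to an intermediate bytecode first is specific to Python rather than something true of every interpreted language.

```python
import dis

source = "total = price * quantity"  # hypothetical one-line program

# Step 1: translate the source text into a code object.
code = compile(source, "<example>", "exec")

# Step 2: run the translated code with some input values.
namespace = {"price": 3, "quantity": 4}
exec(code, namespace)
print(namespace["total"])

# Optional: list the intermediate instructions the interpreter executes.
dis.dis(code)
```

With a compiled language such as C, the translation step happens once, ahead of time, and only the resulting machine code is shipped and run.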
C has been extended over the years. There is C++, which adds object support to C (a different style of programming), and Objective-C, which took another direction and is in heavy use in Apple products.
The Java language takes a different spin on the compiled approach. Instead of compiling to machine code, Java first imagines a hypothetical CPU called the Java Virtual Machine (JVM) and compiles all the code to that. Each host computer then runs JVM software to translate the JVM instructions (called bytecode) into native instructions.
The extra translation with Java might make you think it would be slow. However, the JVM is fairly simple so it can be implemented quickly and reliably on anything from a powerful computer to a low power device that connects to a television. A compiled Java file can also be run on any computer implementing the JVM!
Another benefit of compiling to an intermediate target is that the JVM can provide services to the application that normally wouldn’t be available on a CPU. Allocating memory to a program is a complex problem, but that’s built into the JVM. This also means that JVM makers can focus their improvements on the JVM as a whole, so any progress they make is instantly available to applications.
Interpreted languages, on the other hand, are translated to machine code as they execute. The extra computer power spent doing this can often be recouped by the increased productivity the programmer gains by not having to stop working to compile. Interpreted languages also tend to offer more features than compiled languages, meaning that often less code is needed. The language interpreter itself is usually written in another language such as C, and sometimes even Java! In the latter case, the interpreted language runs on the JVM, whose instructions are in turn translated at runtime into actual machine code.
Perl is an interpreted language. Perl was originally developed to perform text manipulation.
Over the years, it gained favor with systems administrators and still continues to be improved and used in everything from automation to building web applications.
PHP is a language that was originally built to create dynamic web pages. A PHP file is read by a web server such as Apache. Special tags in the file indicate that parts of the code should be interpreted as instructions. The web server pulls all the different parts of the file together and sends it to the web browser. PHP’s main advantages are that it is easy to learn and available on almost any system. Because of this, many popular projects are built on PHP. Notable examples include WordPress (blogging), cacti (for monitoring), and even parts of Facebook.
Ruby is another language that was influenced by Perl and Shell, along with many other languages. It makes complex programming tasks relatively easy, and with the inclusion of the Ruby on Rails framework, is a popular choice for building complex web applications. Ruby is also the language that powers many of the leading automation tools like Chef and Puppet, which make managing a large number of Linux systems much easier.
Python is another scripting language that is in common use. Much like Ruby it makes complex tasks easier and has a framework called Django that makes building web applications very easy. Python has excellent statistical processing abilities and is a favorite in academia. A language is just a tool that makes it easier to tell the computer what you want it to do. A library bundles common tasks into a distinct package that can be used by the developer. ImageMagick is one such library that lets programmers manipulate images in code. ImageMagick also ships with some command line tools that enable you to process images from a shell and take advantage of the scripting capabilities there. OpenSSL is a cryptographic library that is used in everything from web servers to the command line. It provides a standard interface so that you can add cryptography into your Perl script, for example. At a much lower level is the C library. This provides a basic set of functions for reading and writing to files and displays, which is used by applications and other languages alike.
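As a small illustration of what a library gives a programmer, the sketch below computes a SHA-256 fingerprint with Python's standard hashlib module. On many Python builds hashlib is backed by OpenSSL, but that is an assumption about the build and not something the example relies on. The function name fingerprint is hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


digest = fingerprint(b"hello")
print(digest)
```

The programmer calls one function and never has to implement the hashing algorithm, which is exactly the convenience a library is meant to provide.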
2.3 Understanding Open Source Software and Licensing
When we talk about buying software there are three distinct components:
Ownership – Who owns the intellectual property behind the software?
Money transfer – How does money change hands, if at all?
Licensing – What do you get? What can you do with the software? Can you use it on only one computer? Can you give it to someone else?
In most cases, the ownership of the software remains with the person or company that created it. Users are only being granted a license to use the software. This is a matter of copyright law. The money transfer depends on the business model of the creator. It’s the licensing that really differentiates open source software from closed source software.
Two contrasting examples will get things started. With Microsoft Windows, the Microsoft Corporation owns the intellectual property. The license itself, the End User License Agreement (EULA), is a custom legal document that you must click through, indicating your acceptance, in order to install the software. Microsoft keeps the source code and distributes only binary copies through authorized channels. For most consumer products you are allowed to install the software on one computer and are not allowed to make copies of the disk other than for a backup. You are not allowed to reverse engineer the software. You pay for one copy of the software, which gets you minor updates but not major upgrades.
Linux is owned by Linus Torvalds. He has placed the code under a license called the GNU General Public License version 2 (GPLv2). This license, among other things, says that the source code must be made available to anyone who asks and that you are allowed to make any changes you want. One caveat to this is that if you make changes and distribute them, you must put your changes under the same license so that others can benefit. GPLv2 also says that you are not allowed to charge for distributing the source code other than your actual costs of doing so (such as copying it to removable media).
In general, when you create something, you also get the right to decide how it is used and distributed. Free and Open Source Software (FOSS) refers to software where this right has been given up and you are allowed to view the source code and redistribute it. Linus Torvalds has done that with Linux: even though he created Linux he can’t tell you that you can’t use it on your computer because he has given up that right through the GPLv2 license.
Software licensing is a political issue and it should come as no surprise that there are many different opinions. Organizations have come up with their own licenses that embody their particular views, so it is easier to choose an existing license than to come up with your own. For example, universities like the Massachusetts Institute of Technology (MIT) and University of California have come up with licenses, as have projects like the Apache Foundation. In addition, groups like the Free Software Foundation have created their own licenses to further their agenda.
2.3.1 The Free Software Foundation and the Open Source Initiative
Two groups can be considered the most influential forces in the world of open source: The Free Software Foundation (FSF) and the Open Source Initiative (OSI).
The Free Software Foundation was founded in 1985 by Richard Stallman (RMS). The goal of the FSF is to promote Free Software. Free Software does not refer to the price, but to the freedom to share, study, and modify the underlying source code. It is the view of the FSF that proprietary software (software distributed under a closed source license) is bad. FSF also advocates that software licenses should enforce the openness of modifications. It is their view that if you modify Free Software, you should be required to share your changes. This specific philosophy is called copyleft.
The FSF also advocates against software patents and acts as a watchdog for standards organizations, speaking out when a proposed standard might violate the Free Software principles by including items like Digital Rights Management (DRM) that could restrict what you could do with the service.
The FSF have developed their own set of licenses, such as the GPLv2 and GPLv3, and the Lesser GPL licenses versions 2 and 3 (LGPLv2 & LGPLv3). The lesser licenses are much like the regular licenses except they have provisions for linking against non-Free Software. For example, under GPLv2 you can’t redistribute software that uses a closed source library (such as a hardware driver) but the lesser variant allows this.
The changes between version 2 and 3 are largely focused on using Free Software on a closed hardware device, a practice that has been dubbed Tivoization. TiVo is a company that builds a television digital video recorder on their own hardware and used Linux as the base for their software. While TiVo released the source code to their version of Linux as required under GPLv2, the hardware would not run any modified binaries.
In the eyes of the FSF this went against the spirit of the GPLv2 so they added a specific clause to version 3 of the license. Linus Torvalds agrees with TiVo on this matter and has chosen to stay with GPLv2.
The Open Source Initiative was founded in 1998 by Bruce Perens and Eric Raymond (ESR). They believed that Free Software was too politically charged and that less extreme licenses were necessary, particularly around the copyleft aspects of FSF licenses. OSI believes that not only should the source be freely available, but also that no restrictions should be placed on the use of the software no matter what the intended use. Unlike the FSF, the OSI does not have its own set of licenses. Instead, the OSI has a set of principles and adds other licenses to that list if they meet those principles, called Open Source licenses. Software that conforms to an Open Source license is therefore Open Source Software.
Some of the Open Source licenses are the BSD family of licenses, which are much simpler than GPL. They merely state that you may redistribute the source and binaries as long as you maintain copyright notices and don’t imply that the original creator endorses your version. In other words “do what you want with this software, just don’t say you wrote it.” The MIT license has much the same spirit, just with different wording.
FSF licenses, such as GPLv2, are also Open Source licenses. However, many Open Source licenses such as BSD and MIT do not contain the copyleft provisions that the FSF advocates. These licenses are called permissive free software licenses because they are permissive in how you can redistribute the software. You can take BSD licensed software and include it in a closed software product as long as you give proper attribution.
2.3.2 More Terms for the Same Thing
Rather than dwell over the finer points of Open Source vs. Free Software, the community has started referring to it all as Free and Open Source software (FOSS). The English word “free” can mean “free as in lunch” (as in no cost) or “free as in speech” (as in no restrictions). This ambiguity has led to the inclusion of the word libre to refer to the latter definition. Thus, we end up with Free/Libre/Open Source Software (FLOSS).
While these terms are convenient, they hide the differences between the two schools of thought. At the very least, when you’re using FOSS, you know you don’t have to pay for it and you can redistribute it as you wish.
2.3.3 Other Licensing Schemes
FOSS licenses are mostly related to software. People have placed works such as drawings and plans under FOSS licenses but this was not the intent.
When software has been placed in the public domain, the author has relinquished all rights, including the copyright on the work. In some countries, this is the default when the work is done by a government agency. In some countries, copyrighted work becomes public domain after the author has died and a lengthy waiting period has elapsed.
The Creative Commons (CC) organization has created the Creative Commons Licenses, which try to address the intentions behind FOSS licenses for non-software entities. CC licenses can also be used to restrict commercial use if that is the desire of the copyright holder. The CC licenses are:
Attribution (CC BY) – much like the BSD license, you can use CC BY content for any use but must credit the copyright holder.
Attribution-ShareAlike (CC BY-SA) – a copyleft version of the Attribution license. Derived works must be shared under the same license, much like in the Free Software ideals.
Attribution-NoDerivs (CC BY-ND) – you may redistribute the content under the same conditions as CC BY but may not change it.
Attribution-NonCommercial (CC BY-NC) – just like CC BY, but you may not use it for commercial purposes.
Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) – builds on the CC BY-NC license but requires that your changes be shared under the same license.
Attribution-NonCommercial-NoDerivs (CC BY-NC-ND) – you are sharing the content to be used for non-commercial purposes, but people may not change the content.
No Rights Reserved (CC0) – this is the Creative Commons version of public domain.
The licenses above can all be summarized as ShareAlike or no restrictions, and whether or not commercial use or derivations are allowed.
2.3.4 Open Source Business Models If you are giving your software away for free, how can you make money off of it? The simplest way to make money is to sell support or warranty around the software. You may make money by installing the software for people, helping people when they have problems, or fixing bugs for money. You are effectively a consultant. You can also charge for a service or subscription that enhances the software. The Open Source MythTV digital video recorder project is an excellent example. The software is free, but you can pay to hook it up to a TV listing service to know what time particular television shows are on. You can package hardware or add extra closed source software to sell alongside the free software. Appliances and embedded systems that use Linux can be developed and sold. Many consumer firewalls and entertainment devices follow this model. You can also develop open source software as part of your job. If you create a tool to make your life easier at your regular job you may be able to convince your employer to let you open source it. It may be a situation where you were working on the software while getting paid but licensing as open source would allow other people with the same problem to be helped and even contribute. In the 1990’s, Gerald Combs was working at an Internet service provider and started writing his own network analysis tool because similar tools at the time were very expensive. Over 600 people have now contributed to the project, called Wireshark. It is now often considered better than commercial offerings and has led to a company being formed around Gerald to support Wireshark and to sell products and support that make it more useful. This company was later bought by a large network vendor who supports its development.
Other companies get such immense value out of open source software that they find it worth their while to hire people to work on the software full time. The search engine Google has hired the creator of the Python computer language, and even Linus Torvalds is hired by the Linux Foundation to work on Linux. The American telephone company AT&T gets such value out of the Ruby and Rails projects for their Yellow Pages property that they have an employee who does nothing but work for those projects. One final way that people make money indirectly through open source is that it is an open way to judge one’s skills. It is one thing to say you performed certain tasks at your job, but showing off your creation and sharing it with the world lets potential employers see the quality of your work. Similarly, companies have found that open sourcing non critical parts of their internal software attracts the interest of higher caliber people.
Chapter 3
3.1 Introduction
Before you can become an effective Linux systems administrator, you must be able to use Linux as a desktop and have proficiency with basic Information and Communication Technology (ICT) skills. Not only will it help you when dealing with users, immersing yourself in Linux will help to improve your skills more quickly. Furthermore, the life of a systems administrator is more than just server work – there’s email and documentation to do!
3.2 Graphical vs. Non-Graphical Mode
Linux can be used in one of two ways: graphically and non-graphically. In graphical mode your applications live in windows that you can resize and move around. You have menus and tools to help you find what you’re looking for. This is where you’ll use a web browser, your graphics editing tools, and your email. Here we see an example of the graphical desktop, with a menu bar of popular applications to the left and a LibreOffice document being edited with a web browser in the background.
In graphical mode, you can have several shells open, which is very helpful when you are performing tasks on multiple remote computers. You even log in with your username and password through a graphical interface. An example of a graphical login is shown in the figure below.
After logging in, you are taken to the desktop where you can load applications. Non-graphical mode starts off with a text-based login, shown below. You are simply prompted for your username and after that, your password. If the login is successful, you are taken straight to a shell.
In non-graphical mode, there are no windows to move around. Even though you have text editors, web browsers, and email clients, they’re text only. This is how UNIX got its start before graphical environments were the norm. Most servers will be running in this mode too, since people don’t log into them directly, which makes a graphical interface a waste of resources. Here is an example of the screen you might see after logging in.
You can see the original prompt to login at the top with the newer text added below. During login, you might see some messages, called the message of the day (MOTD), which is an opportunity for the systems administrator to pass information to the users. Following the MOTD is the command prompt. In the example above, the user has entered the w command, which shows who is logged in. As new commands are entered and processed, the window scrolls up and older text is lost across the top. The terminal itself is responsible for keeping any history, such as to allow the user to scroll up and see previously entered commands. As far as Linux is concerned, what is on the screen is all that there is. There’s nothing to move around.
3.3 Command Line
The command line is a simple text input that lets you enter anything from one-word commands to complicated scripts. If you log in through text mode, you’re immediately at the console. If you log in graphically, then you’ll need to launch a graphical shell, which is just a text console with a window around it so that you can resize and move it around.
Each Linux desktop is different, so you will want to look around your menus for an option called either terminal or x-term. Both of those are graphical shells, differing mostly in appearances rather than functionality. If you have a search tool such as Ubuntu One, you can look for terminal, as shown here.
3.4 Virtualization and Cloud Computing
Linux is a multiuser operating system, which means that many different users can work on the same system simultaneously and for the most part can’t do things to harm other users. However, this does have limitations – users can hog disk space or take up too much memory or CPU resources and make the system slow for everyone. Sharing the system in multiuser mode also requires that everyone run as unprivileged users, so letting each user run their own web server is very difficult.
Virtualization is the process where one physical computer, called the host, runs multiple copies of an operating system, each called a guest. The host runs software called the hypervisor that switches control between the various guests just like the Linux kernel does for individual processes. Virtualization works because servers spend most of their time idling and don’t need physical resources such as a monitor and keyboard. You can now take a powerful CPU and spread it around multiple virtual machines and maintain more equitable sharing between the guests than is possible on a bare metal Linux system. The main limitation is usually memory, and with advances in hypervisor technology and CPUs it is possible to put more virtual machines on one host than ever.
In a virtualized environment one host can run dozens of guest operating systems, and with support from the CPU itself, the guests don’t even know they are running on a virtual machine. Each guest gets its own virtual CPU, RAM, and disk, and communicates with the network on its own. It is not even necessary to run the same operating system on all the guests, which further reduces the number of physical servers needed. Virtualization offers a way for an enterprise to lower power usage and reduce datacenter space over an equivalent fleet of physical servers. Guests are now just software configurations, so it is easy to spin up a new machine for testing and destroy it when its usefulness has passed.
If it is possible to run multiple instances of an operating system on one physical machine and connect to it over the network, then the location of the machine doesn’t really matter. Cloud computing takes this approach and allows you to have a virtual machine in a remote datacenter that you don’t own, and only pay for the resources you use. Cloud computing vendors can take advantage of economies of scale to offer computing resources at prices better than what it would cost to procure your own hardware, space, and cooling.
Virtual servers are only one facet of cloud computing. You can also get file storage, databases, or even software. The key in most of these products is that you pay for what you use, such as a certain amount per gigabyte of data per month, rather than buying the hardware and software then hosting it yourself. Some situations are more suitable for the cloud than others. Security and performance concerns are usually the first items to come up, followed by cost and functionality. Linux plays a pivotal role in cloud computing. Most virtual servers are based on some kind of Linux kernel and Linux is often used to host the applications behind cloud computing services.
3.5 Using Linux For Work
The basic tools used in most offices are:
Word processor
Spreadsheet
Presentation package
Web browser
OpenOffice, or the more active LibreOffice, takes care of the first three roles. A word processor is used to edit documents, such as reports and memos. Spreadsheets are useful for working with numbers, such as to summarize sales data and making future predictions. A presentation package is used to create slides with features such as text, graphics and embedded video. Slides may be printed or displayed on a screen or projector to share with an audience. Shown below is the spreadsheet and the document editor of LibreOffice. Note how the spreadsheet, LibreOffice Calc, is not limited to rows and columns of numbers. The numbers can be the source of a graph, and formulas can be written to calculate values based on information, such as pulling together interest rates and loan amounts to help compare different borrowing options. Using LibreOffice Writer, a document can contain text, graphics, data tables, and much more. You can link documents and spreadsheets together, for example, so that you can summarize data in written form and know that any changes to the spreadsheet will be reflected in the document.
LibreOffice can also work with other file formats, such as Microsoft Office or Adobe Portable Document Format (PDF) files. Additionally, through the use of extensions, LibreOffice can be made to integrate with Wiki software to give you a powerful intranet solution. Linux is a first class citizen for the Firefox and Google Chrome browsers. As such, you can expect to have the latest software available for your platform and timely access to bug fixes and new features. Some plugins, such as Adobe Flash, may not always work correctly since those rely on another company with different priorities.
3.6 Keeping Your Linux Computer Safe
Linux doesn’t care if you are on the keyboard of a computer or connecting over the Internet, so you’ll want to take some basic precautions to make sure your data is safe and secure.
The easiest thing you can do is to use a good, unique password everywhere you go, especially on your local machine. A good password is at least 10 characters long and contains a mixture of numbers, letters (both upper and lower case) and special symbols. Use a package like KeePassX to generate passwords, and then you only need to have a login password to your machine and a password to open up your KeePassX file. After that, make a point of checking for updates periodically. Here, we show the Ubuntu software update configuration, which is available from the Settings menu.
At the top, you can see that the system is configured to check for updates on a daily basis. If there are security related updates, then you will be prompted immediately to install them. Otherwise, you will get the updates batched up for running every week. At the bottom of the screen is the dialog that comes up when there are updates. All you have to do is click on Install Now and you will be updated! Finally, you will want to protect your computer from accepting incoming connections. A firewall is a device that filters network traffic, and Linux has one built in. If you are using Ubuntu, then gufw is a graphical interface to Ubuntu’s “uncomplicated firewall”.
By simply changing the status to “on” you will block out all traffic coming into your computer, unless you initiated it. You can selectively allow things in, by clicking on the plus sign.
Under the hood, you are using iptables, which is the built in firewall system. Instead of entering complicated iptables commands you use a GUI. While this GUI lets you build an effective policy for a desktop, it barely scratches the surface of what iptables can do.
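For readers who prefer the command line, the same kind of desktop policy can be sketched with the ufw command that gufw sits on top of. This is only an illustration under stated assumptions: the helper names run and apply_desktop_policy are made up for this example, the SSH exception on port 22 is an arbitrary choice, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Sketch only: prints the ufw commands by default instead of running them.
# Set DRY_RUN=0 to apply them for real (requires sudo rights and ufw installed).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# apply_desktop_policy is a hypothetical helper name, not part of ufw.
apply_desktop_policy() {
    run sudo ufw default deny incoming    # block connections you did not initiate
    run sudo ufw default allow outgoing   # let your own traffic out
    run sudo ufw allow 22/tcp             # example exception: inbound SSH (an assumption)
    run sudo ufw enable                   # equivalent to switching the status to "on"
    run sudo ufw status verbose           # review the resulting rules
}

apply_desktop_policy
```

Printing the commands first makes it easy to review the policy before anything on the machine changes.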
3.7 Protecting Yourself
As you browse the web, you leave a digital footprint. Much of this information goes ignored, some of it is gathered to collect statistics for advertising, and some can be used for malicious purposes. As a general rule, you should not trust sites you interact with. Use separate passwords on each website so that if that website is hacked, the password can’t be used to gain access to other sites. Using KeePassX, mentioned earlier, is the easiest way of doing this. Also, limit the information you give to sites to only what is needed. While giving your mother’s maiden name and birthdate might help unlock your social network login if you lose your password, the same information can be used to impersonate you to your bank. Cookies are the main mechanism that websites use to track you. Sometimes this tracking is good, such as to keep track of what is in your shopping cart or to keep you logged in when you return to the site. As you browse the web, a web server can send back the cookie, which is a small piece of text, along with the web page. Your browser stores that and sends it back with every request to the same site. You do not send cookies for example.com to sites at example.org.
However, many sites have embedded scripts that come from third parties, such as a banner advertisement or analytics pixel. If both example.com and example.org have a tracking pixel, such as one from an advertiser, then that same cookie will be sent when browsing both sites. The advertiser then knows that you have visited both example.com and example.org. With a broad enough reach, such as social network “Like” buttons and such, a website can gain an understanding of which websites you frequent and figure out your interests and demographics.
There are various strategies for dealing with this. One is to ignore it. The other is to limit the tracking pixels you accept, either by blocking them entirely or clearing them out periodically. The cookie related settings for Firefox are shown in the figure below. At the top, you will see that the user has opted to have Firefox tell the site not to track. This is a voluntary tag sent in the request that some sites will honor. Below that, the browser is told to never remember third party cookies and to remove regular cookies (such as from the site you are browsing) after Firefox is closed. Tweaking privacy settings can make you more anonymous on the Internet, but it can also cause problems with some sites that depend on third party cookies. If this happens, you might have to explicitly permit some cookies to be saved.
Here you are also given the option to forget search history or to not track it at all. With search history removed, there will be no record on your local computer of which sites you visited. If you are very concerned about being anonymous on the Internet, you can download and use the Tor Browser. Tor is short for “The Onion Router”, which is a network of publicly run servers that bounce your traffic around to hide the origin. The browser that comes with the package is a stripped down version that doesn’t even run scripts, so some sites may not work correctly. However, it is the best way of concealing your identity if you wish to do so.
Chapter 4
4.1 Introduction
If you are like most people, you are probably most familiar with using a Graphical User Interface (GUI) to control your computer. Introduced to the masses by Apple on the Macintosh computer and popularized by Microsoft, a GUI provides an easy, discoverable way to manage your system. Without a GUI, some tools for graphics and video would not be practical. Prior to the popularity of the GUI, the Command Line Interface (CLI) was the preferred way to control a computer. The CLI relies solely on keyboard input. Everything you want the computer to do is relayed by typing commands rather than clicking on icons. If you have never used a CLI, at first it may prove challenging because it requires memorizing commands and their options. However, a CLI provides more precise control, greater speed and the ability to easily automate tasks through scripting (see sidebar). Although Linux does have many GUI environments, you will be able to control Linux much more effectively by using the Command Line Interface.
4.2 Command Line Interface (CLI)
The Command Line Interface (CLI) is a text-based interface to the computer, where the user types in a command and the computer then executes it. The CLI environment is provided by an application on the computer known as a terminal. The terminal accepts what the user types and passes it to a shell. The shell interprets what the user has typed into instructions that can be executed by the operating system. If output is produced by the command, then this text is displayed in the terminal. If problems with the command are encountered, then an error message is displayed.
4.3 Accessing a Terminal
There are many ways to access a terminal window. Some systems will boot directly to a terminal. This is often the case with servers, as a Graphical User Interface (GUI) can be resource intensive and may not be needed to perform server-based operations. A good example of a server that doesn't necessarily require a GUI is a web server. Web servers need to run as quickly as possible and a GUI would just slow the system down. On systems that boot to a GUI, there are commonly two ways to access a terminal, a GUI-based terminal and a virtual terminal:
A GUI terminal is a program within the GUI environment that emulates a terminal window. GUI terminals can be accessed through the menu system. For example, on a CentOS machine, you could click on Applications on the menu bar, then System Tools > and, finally, Terminal:
A virtual terminal can be run at the same time as a GUI, but requires the user to log in via the virtual terminal before they can execute commands (as they would before accessing the GUI interface). Most systems have multiple virtual terminals that can be accessed by pressing a combination of keys, for example: Ctrl-Alt-F1
Note: On virtual machines, virtual terminals may not be available.
4.3.1 Prompt A terminal window displays a prompt; the prompt appears when no commands are being run and when all command output has been printed to the screen. The prompt is designed to tell the user to enter a command. The structure of the prompt may vary between distributions, but will typically contain information about the user and the system. Below is a common prompt structure:
sysadmin@localhost:~$
The previous prompt provides the name of the user that is logged in (sysadmin), the name of the system (localhost) and the current directory (~). The ~ symbol is used as shorthand for the user's home directory (typically the home directory for the user is under the /home directory and named after the user account name, for example: /home/sysadmin).
4.3.2 Shell
A shell is the interpreter that translates commands entered by a user into actions to be performed by the operating system. The Linux environment provides many different types of shells, some of which have been around for many years. The most commonly used shell for Linux distributions is called the BASH shell. It is a shell that provides many advanced features, such as command history, which allows you to easily re-execute previously executed commands. The BASH shell also has other popular features:
Scripting: The ability to place commands in a file and execute the file, resulting in all of the commands being executed. This feature also has some programming features, such as conditional statements and the ability to create functions (AKA, subroutines).
Aliases: The ability to create short "nicknames" for longer commands.
Variables: Variables are used to store information for the BASH shell. These variables can be used to modify how commands and features work as well as provide vital system information.
Note: The previous list is just a short summary of some of the many features provided by the BASH shell.
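The scripting features listed above can be sketched in a few lines. This is a minimal illustration rather than a complete script: the variable name greeting and the function describe_path are invented for this example. Aliases are left out because they are mainly intended for interactive use.

```shell
#!/bin/bash
# Variable: store a value for later use (the name "greeting" is arbitrary).
greeting="Hello"

# Function (subroutine): a named group of commands.
# describe_path is a hypothetical helper, not a standard command.
describe_path() {
    # Conditional statement: choose the output based on a test.
    if [ -d "$1" ]; then
        echo "$1 is a directory"
    elif [ -e "$1" ]; then
        echo "$1 exists but is not a directory"
    else
        echo "$1 does not exist"
    fi
}

echo "$greeting from a script"
describe_path /etc
describe_path /no/such/path
```

Saving these lines in a file and running it with bash executes every command in order, which is all that "scripting" means at this level.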
4.3.3 Formatting commands
Many commands can be used by themselves with no further input. Some commands require additional input to run properly. This additional input comes in two forms: options and arguments. The typical format for a command is as follows:
command [options] [arguments]
Options are used to modify the core behavior of a command while arguments are used to provide additional information (such as a filename or a username). Each option and argument is normally separated by a space, although options can often be combined together. Keep in mind that Linux is case sensitive. Commands, options, arguments, variables and filenames must be entered exactly as shown. The ls command will provide useful examples. By itself, the ls command will list the files and directories contained in your current working directory:
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$
The ls command will be covered in complete detail in a later chapter. The purpose of introducing this command now is to demonstrate how arguments and options work. At this point you shouldn't worry about what the output of the command is, but rather focus on understanding what an argument and an option are. An argument can also be passed to the ls command to specify which directory to list the contents of. For example, the command ls /etc/ppp will list the contents of the /etc/ppp directory instead of the current directory:
sysadmin@localhost:~$ ls /etc/ppp
ip-down.d  ip-up.d
sysadmin@localhost:~$
Since the ls command will accept multiple arguments, you can list the contents of multiple directories at once by typing the ls /etc/ppp /etc/ssh command:
sysadmin@localhost:~$ ls /etc/ppp /etc/ssh
/etc/ppp:
ip-down.d  ip-up.d

/etc/ssh:
moduli            ssh_host_dsa_key.pub    ssh_host_rsa_key      sshd_config
ssh_config        ssh_host_ecdsa_key      ssh_host_rsa_key.pub
ssh_host_dsa_key  ssh_host_ecdsa_key.pub  ssh_import_id
sysadmin@localhost:~$
4.3.4 Working with Options
Options can be used with commands to expand or modify the way a command behaves. Options are often single letters; however, they sometimes will be "words" as well. Typically, older commands use single letters while newer commands use complete words for options. Single-letter options are preceded by a single dash -. Full-word options are preceded by two dashes --. For example, you can use the -l option with the ls command to display more information about the files that are listed. The ls -l command will list the files contained within the current directory and provide additional information, such as the permissions, the size of the file and other information:
sysadmin@localhost:~$ ls -l
total 0
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Desktop
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Documents
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Downloads
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Music
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Pictures
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Public
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Templates
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Videos
sysadmin@localhost:~$
In most cases, options can be used in conjunction with other options. For example, the ls -l -h or ls -lh command will list files with details, but will display the file sizes in human-readable format instead of the default value (bytes):
sysadmin@localhost:~$ ls -l /usr/bin/perl
-rwxr-xr-x 2 root root 10376 Feb  4  2014 /usr/bin/perl
sysadmin@localhost:~$ ls -lh /usr/bin/perl
-rwxr-xr-x 2 root root 11K Feb  4  2014 /usr/bin/perl
sysadmin@localhost:~$
Note that the previous example also demonstrated how you can combine single letter options: -lh. The order of the combined options isn't important. The -h option also has a full-word form: --human-readable. Options can often be used with an argument. In fact, some options require their own arguments. You can use options and arguments with the ls command to list the contents of another directory by executing the ls -l /etc/ppp command:
sysadmin@localhost:~$ ls -l /etc/ppp
total 0
drwxr-xr-x 1 root root 10 Jan 29  2015 ip-down.d
drwxr-xr-x 1 root root 10 Jan 29  2015 ip-up.d
sysadmin@localhost:~$
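The claim that separate, combined, reordered and full-word options are interchangeable can be checked with a short sketch. It assumes the GNU version of ls (the one shipped with most Linux distributions), since --human-readable is a GNU long option, and it uses /etc/passwd only because that file exists on nearly every system.

```shell
#!/bin/bash
# Ask for the same listing in four different spellings and compare the results.
target=/etc/passwd                       # any existing file would do

a=$(ls -l -h "$target")                  # options given separately
b=$(ls -lh "$target")                    # options combined
c=$(ls -hl "$target")                    # combined, in the other order
d=$(ls -l --human-readable "$target")    # full-word form of -h (GNU ls)

if [ "$a" = "$b" ] && [ "$b" = "$c" ] && [ "$c" = "$d" ]; then
    echo "All four forms produce the same output:"
    echo "$a"
else
    echo "The outputs differ (this ls may not support --human-readable)"
fi
```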
4.4 Command history
When you execute a command in a terminal, the command is stored in a "history list". This is designed to make it easy for you to execute the same command later since you won't need to retype the entire command. To view the history list of a terminal, use the history command:
sysadmin@localhost:~$ date
Sun Nov  1 00:40:28 UTC 2015
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ cal 5 2015
      May 2015
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
sysadmin@localhost:~$ history
    1  date
    2  ls
    3  cal 5 2015
    4  history
sysadmin@localhost:~$
Pressing the Up Arrow ↑ key will display the previous command on your prompt line. You can press it repeatedly to move back through the history of commands you have run. Pressing the Enter key will run the displayed command again. When you find the command that you want to execute, you can use the Left arrow ← and Right arrow → keys to position the cursor for editing. Other useful keys for editing include the Home, End, Backspace and Delete keys. If you see a command you wish to run in the list that the history command generates, you can execute this command by typing an exclamation point and then the number next to the command, for example: !3
sysadmin@localhost:~$ history
    1  date
    2  ls
    3  cal 5 2015
    4  history
sysadmin@localhost:~$ !3
cal 5 2015
      May 2015
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
sysadmin@localhost:~$
Some additional history examples:
Example      Meaning
history 5    Show the last five commands from the history list
!!           Execute the last command again
!-5          Execute the fifth command from the bottom of the history list
!ls          Execute the most recent ls command
4.5 Introducing BASH shell variables
A BASH shell variable is a feature that allows you or the shell to store data. This data can be used to provide critical system information or to change the behavior of how the BASH shell (or other commands) work. Variables are given names and stored temporarily in memory. When you close a terminal window or shell, all of the variables are lost. However, the system automatically recreates many of these variables when a new shell is opened. To display the value of a variable, you can use the echo command. The echo command is used to display output in the terminal; in the example below, the command will display the value of the HISTSIZE variable:
sysadmin@localhost:~$ echo $HISTSIZE
1000
sysadmin@localhost:~$
The HISTSIZE variable defines how many previous commands to store in the history list. To display the value of the variable, use a dollar sign ($) character before the variable name. To modify the value of the variable, you don't use the $ character:
sysadmin@localhost:~$ HISTSIZE=500
sysadmin@localhost:~$ echo $HISTSIZE
500
sysadmin@localhost:~$
There are many shell variables available for the BASH shell, as well as variables that affect different Linux commands. A discussion of all shell variables is beyond the scope of this chapter; however, more shell variables will be covered as this course progresses.
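The rule about when to write the $ character can be summarized in a short sketch. The variable name course_name is invented for this illustration. The quoting and brace forms shown are standard shell syntax that this chapter has not introduced yet.

```shell
#!/bin/bash
# Assign: no "$" on the name and no spaces around "=".
# (Writing  course_name = "..."  would instead try to run a command called course_name.)
course_name="Linux Essentials"      # hypothetical variable name

# Read: put "$" in front of the name. Double quotes keep the space intact.
echo "$course_name"

# Braces mark where the name ends when other text follows immediately.
echo "${course_name}_notes"

# Reassign: again no "$" on the left-hand side.
course_name="Linux Basics"
echo "$course_name"
```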
4.6 PATH variable
One of the most important BASH shell variables to understand is the PATH variable. The term path refers to a list that defines which directories the shell will look in for commands. If you type in a command and receive a "command not found" error, it is because the BASH shell was unable to locate a command by that name in any of the directories included in the path. The following command displays the path of the current shell:
sysadmin@localhost:~$ echo $PATH
/home/sysadmin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
sysadmin@localhost:~$
Based on the preceding output, when you attempt to execute a command, the shell will first look for the command in the /home/sysadmin/bin directory. If the command is found in that directory, then it is executed. If it isn't found, then the shell will look in the next directory, /usr/local/sbin, and so on through the list. If the command is not found in any directory listed in the PATH variable, then you will receive a command not found error:
sysadmin@localhost:~$ zed
-bash: zed: command not found
sysadmin@localhost:~$
If custom software is installed on your system, you may need to modify the PATH to make it easier to execute these commands. For example, the following will add the /usr/bin/custom directory to the PATH variable:
sysadmin@localhost:~$ PATH=/usr/bin/custom:$PATH
sysadmin@localhost:~$ echo $PATH
/usr/bin/custom:/home/sysadmin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
sysadmin@localhost:~$
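The effect of adding a directory to PATH can be observed with a small experiment. The sketch below uses a temporary directory and an invented command name, hello-demo, so it does not depend on any custom software actually being installed. The exit status 127 is the value the shell reports for "command not found".

```shell
#!/bin/bash
# Create a throwaway directory holding one hypothetical command, "hello-demo".
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo"\n' > "$demo_dir/hello-demo"
chmod +x "$demo_dir/hello-demo"

# Before: the directory is not in PATH, so the shell cannot find the command.
hello-demo 2>/dev/null
before_status=$?                 # 127 means "command not found"

# Prepend the directory, in the same way as PATH=/usr/bin/custom:$PATH.
PATH=$demo_dir:$PATH

# After: the same name now resolves to the file in the new directory.
after_output=$(hello-demo)

echo "before: exit status $before_status"
echo "after:  $after_output"

rm -rf "$demo_dir"               # clean up the temporary directory
```

A change made this way lasts only for the current shell, like any other variable assignment.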
4.7 export Command
There are two types of variables used in the BASH shell, local and environment. Environment variables, such as PATH and HOME, are used by BASH when interpreting commands and performing tasks. Local variables are often associated with user based tasks and are lowercase by convention. To create a local variable, simply type:
sysadmin@localhost:~$ variable1='Something'
To view the contents of the variable, refer to it with a leading $ sign:
sysadmin@localhost:~$ echo $variable1
Something
To view environment variables, use the env command (searching through the output using grep, as shown here, will be discussed in later chapters). In this case, the search for variable1 in the environment variables results in no output:
sysadmin@localhost:~$ env | grep variable1
sysadmin@localhost:~$
After exporting variable1, it is now an environment variable. Notice that this time, it is found in the search through the environment variables:
sysadmin@localhost:~$ export variable1
sysadmin@localhost:~$ env | grep variable1
variable1=Something
The export command can also be used to make an environment variable upon its creation:
sysadmin@localhost:~$ export variable2='Else'
sysadmin@localhost:~$ env | grep variable2
variable2=Else
To change the value of an environment variable, omit the $ from the name being assigned; the $ is still used to read existing values on the right-hand side:
sysadmin@localhost:~$ variable1=$variable1' '$variable2
sysadmin@localhost:~$ echo $variable1
Something Else
Exported variables can be removed using the unset command. Give the variable's name without a leading $, otherwise the shell substitutes the variable's value before unset runs:
sysadmin@localhost:~$ unset variable2
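The practical difference between a local and an environment variable is whether programs started from the shell can see it. The following sketch makes that visible by starting a child shell with sh -c. The names local_note and shared_note are invented for this illustration.

```shell
#!/bin/bash
local_note='Something'           # local: visible only in this shell
export shared_note='Else'        # environment: copied into child processes

# sh -c starts a child process. The single quotes delay variable expansion
# until the child runs, so the child reports what it can actually see.
child_view=$(sh -c 'echo "local_note=[$local_note] shared_note=[$shared_note]"')
echo "$child_view"

# unset takes the variable's name, with no leading "$".
unset shared_note
after_unset=$(sh -c 'echo "shared_note=[$shared_note]"')
echo "$after_unset"
```

The child prints an empty value for local_note because only exported variables are passed down to it.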
4.8 which Command There may be situations where different versions of the same command are installed on a system or where commands are accessible to some user s and not others. If a command does not behave as expected or if a command is not accessible that should be, it can be beneficial to know where the shell is finding the command or which version it is using. It would be tedious to have to manually look in each dire ctory that is listed in the PATH variable. Instead, you can use the which command to display the full path to the command in question: sysadmin@localhost:~$ which date
/bin/date sysadmin@localhost:~$ which cal
/usr/bin/cal sysadmin@localhost:~$
The which command searches for the location of a command by searching the PATH variable.
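The PATH search described above can be sketched as a loop over the colon-separated directories. This is a simplified illustration and not how which is actually implemented: the function name find_in_path is hypothetical, and the sketch ignores aliases, shell builtins and empty PATH entries.

```shell
# find_in_path is a hypothetical helper written for this illustration.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:                                   # split PATH on colons
    for dir in $PATH; do
        # Accept the first regular file with execute permission.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                                # not found in any directory
}

find_in_path date || echo "date not found in PATH"
```

Because the loop stops at the first match, a directory listed earlier in PATH takes priority over a later one, which is why the order of the directories matters.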
4.9 type Command The type command can be used to determine information about various commands. Some commands originate from a specific file: sysadmin@localhost:~$ type which
which is hashed (/usr/bin/which)
This output would be similar to the output of the which command (as discussed in the previous section, which displays the full path of the command): sysadmin@localhost:~$ which which
/usr/bin/which
The type command can also identify commands that are built into the bash (or other) shell: sysadmin@localhost:~$ type echo
echo is a shell builtin
In this case, the output is significantly different from the output of the which command: sysadmin@localhost:~$ which echo
/bin/echo
Using the -a option, the type command can also reveal the path of another command: sysadmin@localhost:~$ type -a echo
echo is a shell builtin echo is /bin/echo
The type command can also identify aliases to other commands: sysadmin@localhost:~$ type ll
ll is aliased to `ls -alF' sysadmin@localhost:~$ type ls
ls is aliased to `ls --color=auto'
The output of these commands indicates that ll is an alias for ls -alF, and even ls is an alias for ls --color=auto. Again, the output is significantly different from the which command:
sysadmin@localhost:~$ which ll
sysadmin@localhost:~$ which ls
/bin/ls
The type command supports other options, and can look up multiple commands simultaneously. To display only a single word describing the echo, ll, and which commands, use the -t option: sysadmin@localhost:~$ type -t echo ll which
builtin alias file
4.10 Aliases An alias can be used to map longer commands to shorter key sequences. When the shell sees an alias being executed, it substitutes the longer sequence before proceeding to interpret commands. For example, the command ls -l is commonly aliased to l or ll. Because these smaller commands are easier to type, it becomes faster to run the ls -l command line. You can determine what aliases are set on your shell with the alias command: sysadmin@localhost:~$ alias
alias egrep='egrep --color=auto' alias fgrep='fgrep --color=auto' alias grep='grep --color=auto' alias l='ls -CF' alias la='ls -A' alias ll='ls -alF' alias ls='ls --color=auto'
The aliases that you see from the previous examples were created by initialization files. These files are designed to make the process of creating aliases automatic and will be discussed in more detail in a later chapter. New aliases can be created by typing alias name=command where name is the name you want to give the alias and command is the command you want to have executed when you run the alias. For example, you can create an alias so that lh displays a long listing of files, sorted by size with a "human friendly" size with the alias lh='ls -Shl' command. Typing lh should now result in the same output as typing the ls -Shl command: sysadmin@localhost:~$ alias lh='ls -Shl' sysadmin@localhost:~$ lh /etc/ppp
total 0
drwxr-xr-x 1 root root 10 Jan 29 2015 ip-down.d
drwxr-xr-x 1 root root 10 Jan 29 2015 ip-up.d
Aliases created this way will only persist while the shell is open. Once the shell is closed, the new aliases that you created will be lost. Additionally, each shell has its own aliases, so if you create an alias in one shell and then open another shell, you won't see the alias in the new shell.
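The fact that each shell keeps its own aliases can be checked with a short experiment. This is a minimal sketch, assuming a POSIX-style sh is available in the PATH; it only asks each shell whether the lh alias is defined and does not run the alias itself.

```shell
alias lh='ls -Shl'           # define the alias in the current shell

# The current shell knows the alias...
if alias lh >/dev/null 2>&1; then parent_has=yes; else parent_has=no; fi

# ...but a newly started shell does not inherit it.
if sh -c 'alias lh' >/dev/null 2>&1; then child_has=yes; else child_has=no; fi

echo "current shell: $parent_has, new shell: $child_has"
```

Unlike exported variables, aliases are not passed on to new shells, which is why initialization files are used to recreate them each time a shell starts.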
4.11 Globbing Glob characters are often referred to as "wild cards". These are symbols that have special meaning to the shell. Unlike commands that the shell will run, or options and arguments that the shell will pass to commands, glob characters are interpreted by the shell itself before it attempts to run any command. This means that glob characters can be used with any command. Globs are powerful because they allow you to specify patterns that match filenames in a directory, so instead of manipulating a single file at a time, you can easily execute commands that will affect many files. For instance, by using glob characters it is possible to manipulate all files with a certain extension or with a particular filename length. Keep in mind that these globs can be used with any command because it is the shell, not the command, that expands the globs into matching filenames. The examples provided in this chapter will use the echo command for demonstration.
4.11.1 Asterisk (*) The asterisk character is used to represent zero or more of any character in a filename. For example, suppose you want to display all of the files in the /etc directory that begin with the letter t: sysadmin@localhost:~$ echo /etc/t*
/etc/terminfo /etc/timezone sysadmin@localhost:~$
The pattern t* means "match any file that begins with the character t and has zero or more of any character after the t". You can use the asterisk character at any place within the filename pattern. For example, the following will match any filename in the /etc directory that ends with .d: sysadmin@localhost:~$ echo /etc/*.d
/etc/apparmor.d /etc/bash_completion.d /etc/cron.d /etc/depmod.d /etc/fstab.d /etc/init.d /etc/insserv.conf.d /etc/ld.so.conf.d /etc/logrotate.d /etc/modprobe.d /etc/pam.d /etc/profile.d /etc/rc0.d /etc/rc1.d /etc/rc2.d /etc/rc3.d /etc/rc4.d /etc/rc5.d /etc/rc6.d /etc/rcS.d /etc/rsyslog.d /etc/sudoers.d /etc/sysctl.d /etc/update-motd.d
In the next example, all of the files in the /etc directory that begin with the letter r and end with .conf will be displayed: sysadmin@localhost:~$ echo /etc/r*.conf
/etc/resolv.conf /etc/rsyslog.conf
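To see that the shell, and not the command, expands the asterisk, it helps to count the arguments a command actually receives. The sketch below works in a scratch directory so that it does not depend on the contents of /etc; the helper name count_args and the sample filenames are assumptions made for this illustration, and the C locale is set only to make the sort order predictable.

```shell
LC_ALL=C; export LC_ALL            # assumption: C locale for predictable sorting
demo_dir=$(mktemp -d)              # scratch directory standing in for /etc
touch "$demo_dir/terminfo" "$demo_dir/timezone" \
      "$demo_dir/resolv.conf" "$demo_dir/rsyslog.conf" "$demo_dir/cron.d"

# count_args is a hypothetical helper: it reports how many arguments it got.
count_args() { echo "$#"; }

t_count=$(count_args "$demo_dir"/t*)          # the shell turns t* into two names
conf_list=$(cd "$demo_dir" && echo r*.conf)   # begins with r, ends with .conf
echo "$t_count"
echo "$conf_list"
```

The helper never sees the pattern t*; it receives the two matching filenames as separate arguments, exactly as echo does in the examples above.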
4.11.2 Question Mark (?) The question mark represents any one character. Each question mark character matches exactly one character, no more and no less.
Suppose you want to display all of the files in the /etc directory that begin with the letter t and have exactly 7 characters after the t character: sysadmin@localhost:~$ echo /etc/t???????
/etc/terminfo /etc/timezone sysadmin@localhost:~$
Glob characters can be used together to find even more complex patterns. The echo /etc/*???????????????????? command will print only files in the /etc directory with twenty or more characters in the filename: sysadmin@localhost:~$ echo /etc/*????????????????????
/etc/bindresvport.blacklist /etc/ca-certificates.conf sysadmin@localhost:~$
The asterisk and question mark could also be used together to look for files with three-letter extensions by running the echo /etc/*.??? command: sysadmin@localhost:~$ echo /etc/*.???
/etc/blkid.tab /etc/issue.net sysadmin@localhost:~$
4.11.3 Brackets [] Brackets are used to match a single character from a set of possible characters listed between the brackets. For example, echo /etc/[gu]* will print any file that begins with either a g or u character and contains zero or more additional characters: sysadmin@localhost:~$ echo /etc/[gu]*
/etc/gai.conf /etc/groff /etc/group /etc/group- /etc/gshadow /etc/gshadow- /etc/ucf.conf /etc/udev /etc/ufw /etc/update-motd.d /etc/updatedb.conf sysadmin@localhost:~$
Brackets can also be used to represent a range of characters. For example, the echo /etc/[a-d]* command will print all files that begin with any letter between and including a and d: sysadmin@localhost:~$ echo /etc/[a-d]*
/etc/adduser.conf /etc/adjtime /etc/alternatives /etc/apparmor.d /etc/apt /etc/bash.bashrc /etc/bash_completion.d /etc/bind /etc/bindresvport.blacklist /etc/blkid.conf /etc/blkid.tab /etc/ca-certificates /etc/ca-certificates.conf /etc/calendar /etc/cron.d /etc/cron.daily /etc/cron.hourly /etc/cron.monthly /etc/cron.weekly /etc/crontab /etc/dbus-1 /etc/debconf.conf /etc/debian_version /etc/default /etc/deluser.conf /etc/depmod.d /etc/dpkg sysadmin@localhost:~$
The echo /etc/*[0-9]* command would display any file whose name contains at least one digit: sysadmin@localhost:~$ echo /etc/*[0-9]*
/etc/dbus-1 /etc/iproute2 /etc/mke2fs.conf /etc/python2.7 /etc/rc0.d /etc/rc1.d /etc/rc2.d /etc/rc3.d /etc/rc4.d /etc/rc5.d /etc/rc6.d sysadmin@localhost:~$
The range is based on the ASCII text table. This table defines a list of characters, arranging them in a specific standard order. If you provide an invalid order, no match will be made: sysadmin@localhost:~$ echo /etc/*[9-0]*
/etc/*[9-0]* sysadmin@localhost:~$
4.11.4 Exclamation Point (!) The exclamation point is used in conjunction with the square brackets to negate a range. For example, the command echo [!DP]* will display any file that does not begin with a D or P.
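The bracket patterns from this section, including negation, can be tried safely in a scratch directory. This is a minimal sketch: the sample filenames are assumptions chosen to resemble the ones used in this chapter, and the C locale is set so that the results are sorted in ASCII order.

```shell
LC_ALL=C; export LC_ALL            # assumption: C locale for predictable sorting
demo_dir=$(mktemp -d)
( cd "$demo_dir" && touch Desktop Documents Pictures Music Videos rc0.d rc1.d )

list_match=$(cd "$demo_dir" && echo [DP]*)     # begins with D or P
negated=$(cd "$demo_dir" && echo [!DP]*)       # begins with anything except D or P
digits=$(cd "$demo_dir" && echo *[0-9]*)       # contains at least one digit
echo "$list_match"
echo "$negated"
echo "$digits"
```

Every name in the directory appears in exactly one of the first two results, since [!DP] matches precisely the first characters that [DP] rejects.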
4.12 Quoting There are three types of quotes that have special significance to the Bash shell: double quotes ", single quotes ', and back quotes `. Each set of quotes indicates to the shell that it should treat the text within the quotes differently than it would normally be treated.
4.12.1 Double Quotes Double quotes will stop the shell from interpreting some metacharacters, including glob characters. Within double quotes an asterisk is just an asterisk, a question mark is just a question mark, and so on. This means that when you use the second echo command below, the BASH shell doesn't convert the glob pattern into filenames that match the pattern: sysadmin@localhost:~$ echo /etc/[DP]*
/etc/DIR_COLORS /etc/DIR_COLORS.256color /etc/DIR_COLORS.lightbgcolor /etc/PackageKit sysadmin@localhost:~$ echo "/etc/[DP]*"
/etc/[DP]* sysadmin@localhost:~$
This is useful when you want to display something on the screen that is normally a special character to the shell: sysadmin@localhost:~$ echo "The glob characters are *, ? and [ ]"
The glob characters are *, ? and [ ] sysadmin@localhost:~$
Double quotes still allow for command substitution (discussed later in this chapter), variable substitution and permit some other shell metacharacters that haven't been discussed yet. For example, in the following demonstration, you will notice that the value of the PATH variable is displayed: sysadmin@localhost:~$ echo "The path is $PATH"
The path is /usr/bin/custom:/home/sysadmin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games sysadmin@localhost:~$
4.12.2 Single Quotes Single quotes prevent the shell from doing any interpreting of special characters. This includes globs, variables, command substitution and other metacharacters that have not been discussed yet. For example, if you want the $ character to simply mean a $, rather than it acting as an indicator to the shell to look for the value of a variable, you could execute the second command displayed below. In the first command, the shell treats $1 as a variable, which is empty here, so only 00 is printed: sysadmin@localhost:~$ echo The car costs $100
The car costs 00 sysadmin@localhost:~$ echo 'The car costs $100'
The car costs $100 sysadmin@localhost:~$
4.12.3 Backslash Character (\) You can use an alternative technique to essentially single quote a single character. For example, suppose you want to print the following: "The service costs $100 and the path is $PATH". If you place this in double quotes, $1 and $PATH are considered variables. If you place this in single quotes, $1 and $PATH are not variables. But what if you want to have $PATH treated as a variable and $1 not? If you place a backslash \ character in front of another character, the shell treats that character as a "single quoted" character. The third command below demonstrates using the \ character while the other two demonstrate how the variables would be treated within double and single quotes:
sysadmin@localhost:~$ echo "The service costs $100 and the path is $PATH"
The service costs 00 and the path is /usr/bin/custom:/home/sysadmin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
sysadmin@localhost:~$ echo 'The service costs $100 and the path is $PATH'
The service costs $100 and the path is $PATH
sysadmin@localhost:~$ echo The service costs \$100 and the path is $PATH
The service costs $100 and the path is /usr/bin/custom:/home/sysadmin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
sysadmin@localhost:~$
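The three quoting techniques covered so far can be compared side by side by storing the results in variables. This is a minimal sketch; the variable item is hypothetical and stands in for any shell variable such as PATH.

```shell
item='service'                        # hypothetical variable for this sketch

double="The $item costs \$100"        # $item expands; \$ stays a literal $
single='The $item costs $100'         # nothing expands inside single quotes
glob_text="Glob characters: * ? [ ]"  # double quotes stop glob expansion

echo "$double"
echo "$single"
echo "$glob_text"
```

The backslash is what makes the double-quoted version useful here: it protects one $ character while leaving the other free to introduce a variable.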
4.12.4 Back Quotes Back quotes are used to specify a command within a command, a process called command substitution. This allows for very powerful and sophisticated use of commands. While it may sound confusing, an example should make things more clear. To begin, note the output of the date command: sysadmin@localhost:~$ date
Mon Nov 2 03:35:50 UTC 2015
Now note the output of the echo Today is date command line: sysadmin@localhost:~$ echo Today is date
Today is date sysadmin@localhost:~$
In the previous command the word date is treated as regular text and the shell simply passes date to the echo command. But, you probably want to execute the date command and have the output of that command sent to the echo command. To accomplish this, you would run the echo Today is `date` command line: sysadmin@localhost:~$ echo Today is `date`
Today is Mon Nov 2 03:40:04 UTC 2015 sysadmin@localhost:~$
4.13 Control Statements Control statements allow you to use multiple commands at once or run additional commands, depending on the success of a previous command. Typically these control statements are used within scripts, but they can also be used on the command line.
4.13.1 Semicolon The semicolon can be used to run multiple commands, one after the other. Each command runs independently and consecutively; no matter the result of the first command, the second will run once the first has completed, then the third and so on. For example, if you want to print the months of January, February and March of 2015, you can execute cal 1 2015; cal 2 2015; cal 3 2015 on the command line: sysadmin@localhost:~$ cal 1 2015; cal 2 2015; cal 3 2015
    January 2015
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

   February 2015
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28

     March 2015
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31
4.13.2 Double Ampersand (&&) The double ampersand && acts as a logical "and": if the first command is successful, then the second command (to the right of the &&) will also run. If the first command fails, then the second command will not run. To better understand how this works, consider first the concept of failure and success for commands. Commands succeed when they work properly and fail when something goes wrong. For example, consider the ls /etc/xml command line. The command will succeed if the /etc/xml directory is accessible and fail if it isn't. In the following example, the first command will succeed because the /etc/xml directory exists and is accessible, while the second command will fail because there is no /etc/junk directory: sysadmin@localhost:~$ ls /etc/xml
catalog
catalog.old
xml-core.xml
xml-core.xml.old
sysadmin@localhost:~$ ls /etc/junk
ls: cannot access /etc/junk: No such file or directory sysadmin@localhost:~$
The way that you would use the success or failure of the ls command in conjunction with && would be to execute a command line like the following:
sysadmin@localhost:~$ ls /etc/xml && echo success
catalog
catalog.old
xml-core.xml
xml-core.xml.old
success sysadmin@localhost:~$ ls /etc/junk && echo success
ls: cannot access /etc/junk: No such file or directory sysadmin@localhost:~$
In the first example above, the echo command executed because the ls command succeeded. In the second example, the echo command wasn't executed because the ls command failed.
4.13.3 Double Pipe The double pipe || is a logical "or". It works in a similar way to &&; depending on the result of the first command, the second command will either run or be skipped. With the double pipe, if the first command runs successfully, the second command is skipped; if the first command fails, then the second command will be run. In other words, you are essentially telling the shell, "Either run this first command or the second one". In the following example, the echo command will only execute if the ls command fails: sysadmin@localhost:~$ ls /etc/xml || echo failed
catalog
catalog.old
xml-core.xml
xml-core.xml.old
sysadmin@localhost:~$ ls /etc/junk || echo failed
ls: cannot access /etc/junk: No such file or directory failed sysadmin@localhost:~$
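The && and || statements are often combined on one command line so that exactly one of two messages is printed. The following is a minimal sketch under stated assumptions: the function name report is hypothetical, a scratch directory replaces /etc/xml so that the example does not depend on the system, and the output of ls is discarded so only the message remains.

```shell
demo_dir=$(mktemp -d)      # a directory that is known to exist

# report is a hypothetical helper that combines both operators.
report() {
    ls "$1" >/dev/null 2>&1 && echo "success" || echo "failed"
}

exists=$(report "$demo_dir")          # ls succeeds, so the && branch runs
missing=$(report "$demo_dir/junk")    # ls fails, so the || branch runs
echo "$exists"
echo "$missing"
```

One caveat with this pattern is that the command after || also runs if the command after && fails. That is harmless here because echo is very unlikely to fail, but it matters when the middle command can fail.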
Lab 4 4.1 Introduction This is Lab 4: Command Line Basics. By performing this lab, students will learn how to use basic features of the shell. In this lab, you will perform the following tasks:
Explore Bash features
Use shell variables
Understand how to use globbing
Be able to make use of quoting
4.2 Files and Directories In this task, we will access the Command Line Interface (CLI) for Linux to explore how to execute basic commands and what affects how they can be executed. Most users are probably more familiar with how commands are executed using a Graphical User Interface (GUI). So, this task will likely present some new concepts to you, if you have not previously worked with a CLI. To use a CLI, you will need to type the command that you want to run. The window where you will type your command is known as a terminal emulator application. Inside the Terminal window, the system displays a prompt followed by a blinking cursor: sysadmin@localhost:~$
Remember: You may need to press Enter in the window to display the prompt. The prompt tells you the user you are logged in as (sysadmin), the host or computer you are using (localhost), and your current directory (~, which represents your home directory). When you type a command it will appear at the text cursor. You can use keys such as the home, end, backspace, and arrow keys for editing the command you are typing. Once you have typed the command correctly, press Enter to execute it.
4.2.1 Step 1 The following command will display the same information that you see in the first part of the prompt. Make sure that you have selected (clicked on) the Terminal window first and then type the following command followed by the Enter key:
whoami
Your output should be similar to the following: sysadmin@localhost:~$ whoami
sysadmin sysadmin@localhost:~$
The output of the whoami command, sysadmin , displays the user name of the current user. Although in this case your username is displayed in the prompt, this command could be used to obtain this information in a situation when the prompt did not contain this information.
4.2.2 Step 2 The next command displays information about the current system. To be able to see the name of the kernel you are using, type the following command into the terminal: uname
Your output should be similar to the following: sysadmin@localhost:~$ uname
Linux
Many commands that are executed produce text output like this. You can change what output is produced by a command by using options after the name of the command. Options for a command can be specified in several ways. Traditionally in UNIX, options were expressed by a hyphen followed by another character, for example: -n. In Linux, options can sometimes also be given by two hyphen characters followed by a word, or hyphenated word, for example: --nodename. Execute the uname command twice more in the terminal, once with the option -n and again with the option --nodename. This will display the network node hostname, also found in the prompt.
uname -n
uname --nodename
Your output should be similar to the following: sysadmin@localhost:~$ uname -n
localhost sysadmin@localhost:~$ uname --nodename
localhost
4.2.3 Step 3 The pwd command is used to display your current "location" or current "working" directory. Type the following command to display the working directory:
pwd
Your output should be similar to the following: sysadmin@localhost:~$ pwd
/home/sysadmin sysadmin@localhost:~$
The current directory in the example above is /home/sysadmin. This is also referred to as your home directory, a special place where you have control of files and other users normally have no access. By default, this directory is named the same as your username and is located underneath the /home directory. As you can see from the output of the command, /home/sysadmin, Linux uses the forward slash / to separate directories to make what is called a path. The initial forward slash represents the top level directory, known as the root directory. More information regarding files, directories and paths will be presented in later labs. The tilde ~ character that you see in your prompt is also indicating what the current directory is. This character is a "shortcut" way to represent your home. Consider This: pwd stands for "print working directory". While it doesn't actually "print" in modern versions, older UNIX machines didn't have monitors and output of commands went to a printer, hence the funny name of pwd .
4.3 Shell Variables Shell variables are used to store data in Linux. This data is used by the shell itself as well as programs and users. The focus of this section is to learn how to display the values of shell variables.
4.3.1 Step 1 The echo command can be used to print text, the value of a variable and show how the shell environment expands metacharacters (more on metacharacters later in this lab). Type the following command to have it output literal text: echo Hello Student
Your output should be similar to the following: sysadmin@localhost:~$ echo Hello Student
Hello Student sysadmin@localhost:~$
4.3.2 Step 2 Type the following command to display the value of the PATH variable: echo $PATH
Your output should be similar to the following: sysadmin@localhost:~$ echo $PATH
/home/sysadmin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games sysadmin@localhost:~$
The PATH variable is displayed by placing a $ character in front of the name of the variable. This variable is used to find the location of commands. Each of the directories listed above is searched when you run a command. For example, if you try to run the date command, the shell will first look for the command in the /home/sysadmin/bin directory and then in the /usr/local/sbin directory and so on. Once the date command is found, the shell "runs it".
4.3.3 Step 3 Use the which command to determine if there is an executable file named date that is located in a directory listed in the PATH value: which date
Your output should be similar to the following: sysadmin@localhost:~$ which date
/bin/date sysadmin@localhost:~$
The output of the which command tells you that when you execute the date command, the system will run the command /bin/date . The which command makes use of the PATH variable to determine the location of the date command.
4.4 Globbing The use of glob characters in Linux is similar to what many operating systems refer to as "wildcard" characters. Using glob characters, you match filenames using patterns. Glob characters are a shell feature, not something that is particular to any specific command. As a result, you can use glob characters with any Linux command. When glob characters are used, the shell will "expand" the entire pattern to match all files in the specified directory that match the pattern. For demonstration purposes, we will use the echo command to display this expansion process.
4.4.1 Step 1 Use the following echo command to display all filenames in the current directory that match the glob pattern *: echo *
Your output should be similar to the following:
sysadmin@localhost:~$ echo *
Desktop Documents Downloads Music Pictures Public Templates Videos sysadmin@localhost:~$
The asterisk * matches "zero or more" characters in a file name. In the example above, this results in matching all filenames in the current directory. The echo command, in turn, displays the filenames that were matched.
4.4.2 Step 2 The following commands will display all the files in the current directory that start with the letter D, and the letter P: echo D* echo P*
Your output should be similar to the following: sysadmin@localhost:~$ echo D*
Desktop Documents Downloads sysadmin@localhost:~$ echo P*
Pictures Public sysadmin@localhost:~$
Think of the first example, D*, as "match all filenames in the current directory that begin with a capital D character and have zero or more of any other character after the D".
4.4.3 Step 3 The asterisk * can be used anywhere in the string. The following command will display all the files in your current directory that end in the letter s: echo *s
Your output should be similar to the following: sysadmin@localhost:~$ echo *s
Documents Downloads Pictures Templates Videos sysadmin@localhost:~$
4.4.4 Step 4 Notice that the asterisk can also appear multiple times or in the middle of several characters: echo D*n*s
Your output should be similar to the following:
sysadmin@localhost:~$ echo D*n*s
Documents Downloads sysadmin@localhost:~$
The next glob metacharacter that we will examine is the question mark ?. The question mark matches exactly one character. This single character can be any possible character. Like the asterisk it can be used anywhere in a string and can appear multiple times.
4.4.5 Step 5 Since each question mark matches one unknown character, typing six of them will match six character filenames. Type the following to display the filenames that are exactly six characters long: echo ??????
Your output should be similar to the following: sysadmin@localhost:~$ echo ??????
Public Videos sysadmin@localhost:~$
Important: Each ? character must match exactly one character in a filename, no more and no less than one character.
4.4.6 Step 6 Using the question mark with other characters will limit the matches. Type the following to display the file names that start with the letter D and are exactly nine characters long: echo D????????
Your output should be similar to the following: sysadmin@localhost:~$ echo D????????
Documents Downloads sysadmin@localhost:~$
4.4.7 Step 7 Wildcards or glob characters can be combined together. The following command will display file names that are at least six characters long and end in the letter s. echo ?????*s
Your output should be similar to the following: sysadmin@localhost:~$ echo ?????*s
Documents Downloads Pictures Templates Videos sysadmin@localhost:~$
Think of the pattern ?????*s to mean "match filenames that begin with any five characters, then have zero or more of any characters and then end with an s character".
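The patterns from the steps above can be repeated in a scratch directory that imitates the home directory used in this lab. This is a minimal sketch: the directory names are copied from the lab output, the variable names are hypothetical, and the C locale is an assumption made so the results are sorted predictably.

```shell
LC_ALL=C; export LC_ALL                 # assumption: C locale for predictable sorting
demo_home=$(mktemp -d)                  # stands in for the home directory
( cd "$demo_home" && mkdir Desktop Documents Downloads Music Pictures Public Templates Videos )

six=$(cd "$demo_home" && echo ??????)         # exactly six characters
d_nine=$(cd "$demo_home" && echo D????????)   # D plus exactly eight more
ends_s=$(cd "$demo_home" && echo ?????*s)     # at least six characters, ending in s
mixed=$(cd "$demo_home" && echo D*n*s)        # asterisk used more than once
echo "$six"
echo "$d_nine"
echo "$ends_s"
echo "$mixed"
```

Working in a scratch directory makes it easy to add or rename entries and observe how each pattern reacts, without touching real files.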
4.4.8 Step 8 The next glob is similar to the question mark glob to specify one character. This glob uses a pair of square brackets [ ] to specify which one character will be allowed. The allowed characters can be specified as a range, a list, or by what is known as a character class. The allowed characters can also be negated with an exclamation point !. In the first example, the first character of the file name can be either a D or a P. In the second example, the first character can be any character except a D or P: echo [DP]* echo [!DP]*
Your output should be similar to the following: sysadmin@localhost:~$ echo [DP]*
Desktop Documents Downloads Pictures Public sysadmin@localhost:~$ echo [!DP]*
Music Templates Videos sysadmin@localhost:~$
4.4.9 Step 9 In these next examples, a range of characters will be specified. In the first example, the first character of the file name can be any character starting at D and ending at P. In the second example, this range of characters is negated, meaning any single character will match as long as it is not between the letters D and P:
echo [D-P]*
echo [!D-P]*
Your output should be similar to the following: sysadmin@localhost:~$ echo [D-P]*
Desktop Documents Downloads Music Pictures Public sysadmin@localhost:~$ echo [!D-P]*
Templates Videos sysadmin@localhost:~$
You may be asking yourself "who decides what letters come between D and P ?". In this case the answer is fairly obvious (E, F, G, H, I, J, K, L, M, N and O), but what if the range was [1-A]? The ASCII text table is used to determine the range of characters. You can view this table by searching for it on the Internet or typing the following command: ascii So, what characters does the glob [1-A] match? According to the ASCII text table: 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, @ and A.
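The [1-A] range can be checked directly by creating files with single-character names. This is a minimal sketch under stated assumptions: it sets the C locale so that ranges follow the ASCII table (other locale settings may order characters differently), and the single-character filenames are chosen only for this experiment.

```shell
LC_ALL=C; export LC_ALL        # assumption: C locale, so ranges follow ASCII order
demo_dir=$(mktemp -d)
( cd "$demo_dir" && touch 0 1 9 : @ A B )   # one file per test character

range_match=$(cd "$demo_dir" && echo [1-A])
echo "$range_match"
```

The files named 0 and B fall just outside the range on either side, so they should be the only ones missing from the output.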
4.5 Quoting There are three types of quotes used by the Bash shell: single quotes ('), double quotes (") and back quotes (`). These quotes have special features in the Bash shell as described below. To understand single and double quotes, consider that there are times that you don't want the shell to treat some characters as "special". For example, as you learned earlier in this lab, the * character is used as a wildcard. What if you wanted the * character to just mean an asterisk? Single quotes prevent the shell from "interpreting" or expanding all special characters. Often single quotes are used to protect a string from being changed by the shell, so that the string can be interpreted by a command as a parameter to affect the way the command is executed. Double quotes stop the expansion of glob characters like the asterisk (*), question mark (?), and square brackets ([ ]). Double quotes do allow for both variable expansion and command substitution (see back quotes) to take place. Back quotes cause "command substitution" which allows for a command to be executed within the line of another command. When using quotes, they must be entered in pairs or else the shell will not consider the command complete. While single quotes are useful for blocking the shell from interpreting one or more characters, the shell also provides a way to block the interpretation of just a single character called "escaping" the character. To "escape" the special meaning of a shell metacharacter, the backslash \ character is used as a prefix to that one character.
4.5.1 Step 1 Execute the following command to use back quotes ` to execute the date command within the line of the echo command: echo Today is `date`
Your output should be similar to the following: sysadmin@localhost:~$ echo Today is `date`
Today is Tue Jan 19 15:48:57 UTC 2016 sysadmin@localhost:~$
4.5.2 Step 2
You can also place $( before the command and ) after the command to accomplish command substitution: echo Today is $(date)
Your output should be similar to the following: sysadmin@localhost:~$ echo Today is $(date)
Today is Tue Jan 19 15:51:09 UTC 2016 sysadmin@localhost:~$
Why two different methods that accomplish the same thing? Backquotes look very similar to single quotes, making it harder to "see" what a command is supposed to do. Originally shells used backquotes; the $(command) format was introduced later (it originated in the Korn shell and is now part of the POSIX shell standard) to make the statement more visually clear and easier to nest.
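A practical difference between the two forms shows up when one command substitution is placed inside another. The sketch below uses only echo so that the result is predictable; the variable names are hypothetical.

```shell
# With $( ), the inner substitution is written exactly like the outer one.
with_dollar=$(echo outer $(echo inner))

# With backquotes, the inner pair must be escaped with backslashes.
with_backquotes=`echo outer \`echo inner\``

echo "$with_dollar"
echo "$with_backquotes"
```

Both lines should print the same text; the $( ) version is simply easier to read and to extend with further nesting.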
4.5.3 Step 3 If you don't want the backquotes to be used to execute a command, place single quotes around them. Execute the following: echo This is the command '`date`'
Your output should be similar to the following: sysadmin@localhost:~$ echo This is the command '`date`'
This is the command `date` sysadmin@localhost:~$
4.5.4 Step 4 Note that you could also place a backslash character in front of each backquote character. Execute the following: echo This is the command \`date\`
Your output should be similar to the following: sysadmin@localhost:~$ echo This is the command \`date\`
This is the command `date` sysadmin@localhost:~$
4.5.5 Step 5 Double quote characters don't have any effect on backquote characters. The shell will still use them as command substitution. Execute the following to see a demonstration: echo This is the command "`date`"
Your output should be similar to the following: sysadmin@localhost:~$ echo This is the command "`date`"
This is the command Tue Jan 19 16:05:41 UTC 2016 sysadmin@localhost:~$
4.5.6 Step 6 Double quote characters will have an effect on wildcard characters, disabling their special meaning. Execute the following: echo D* echo "D*"
Your output should be similar to the following: sysadmin@localhost:~$ echo D*
Desktop Documents Downloads sysadmin@localhost:~$ echo "D*"
D* sysadmin@localhost:~$
Important: Quoting may seem trivial and weird at the moment, but as you gain more experience working in the command shell, you will discover that having a good understanding of how different quotes work is critical to using the shell.
4.6 Control Statements Typically, you type a single command and you execute it when you press Enter. The Bash shell offers three different statements that can be used to separate multiple commands typed together.
The simplest separator is the semicolon (;). Using the semicolon between multiple commands allows for them to be executed one right after another, sequentially from left to right.
The && characters create a logical and statement. Commands separated by && are conditionally executed. If the command on the left of the && is successful, then the command to the right of the && will also be executed. If the command to the left of the && fails, then the command to the right of the && is not executed.
The || characters create a logical or statement, which also causes conditional execution. When commands are separated by ||, the command to the right executes only if the command to the left fails. If the command to the left of the || succeeds, then the command to the right of the || will not execute.
To see how these control statements work, you will be using two special executables: true and false. The true executable always succeeds when it executes, whereas the false executable always fails. While this may not provide you with realistic examples of how && and || work, it does provide a means to demonstrate how they work without having to introduce new commands.
4.6.1 Step 1 Execute the following three commands together separated by semicolons: echo Hello; echo Linux; echo Student
As you can see the output shows all three commands executed sequentially: sysadmin@localhost:~$ echo Hello; echo Linux; echo Student
Hello Linux Student sysadmin@localhost:~$
4.6.2 Step 2 Now, put three commands together separated by semicolons, where the first command executes with a failure result: false; echo Not; echo Conditional
Your output should be similar to the following: sysadmin@localhost:~$ false; echo Not; echo Conditional
Not Conditional sysadmin@localhost:~$
Note that in the previous example, all three commands still executed even though the first one failed. While you can't see from the output of the false command, it did execute. However, when commands are separated by the ; character, they are completely independent of each other.
4.6.3 Step 3 Next, use "logical and" to separate the commands: echo Start && echo Going && echo Gone
Your output should be similar to the following: sysadmin@localhost:~$ echo Start && echo Going && echo Gone
Start Going Gone sysadmin@localhost:~$
Because each echo statement executes correctly, a return value of success is provided, allowing the next statement to also be executed.
4.6.4 Step 4 Use "logical and" with a command that fails as shown below: echo Success && false && echo Bye
Your output should be similar to the following: sysadmin@localhost:~$ echo Success && false && echo Bye
Success sysadmin@localhost:~$
The first echo command succeeds and we see its output. The false command executes with a failure result, so the last echo statement is not executed.
4.6.5 Step 5 The "or" characters separating the following commands demonstrate how a failure before the "or" causes the command after it to execute, whereas a successful first command causes the second command not to execute: false || echo Fail Or true || echo Nothing to see here
Your output should be similar to the following: sysadmin@localhost:~$ false || echo Fail Or
Fail Or sysadmin@localhost:~$ true || echo Nothing to see here sysadmin@localhost:~$
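The behavior of && and || seen in these steps can be summarized in a few lines. The following is a small sketch, assuming only the true and false executables introduced above; the variable name trace is made up for the example.

```shell
trace=""
true  && trace="${trace}A"   # left side succeeded, so A is appended
false && trace="${trace}B"   # left side failed, so B is skipped
false || trace="${trace}C"   # left side failed, so C is appended
true  || trace="${trace}D"   # left side succeeded, so D is skipped
echo "$trace"                # prints AC
```

Only the letters A and C are recorded, which matches the rule that && continues after success and || continues after failure.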
4.7 Shell History The Bash shell maintains a history of the commands that you type. Previous commands can be easily accessed in this history in several ways. The first and easiest way to recall a previous command is to use the up arrow key. Each press of the up arrow key goes backwards one command through history. If you accidentally go back too far, then the down arrow key will go forwards through the history of commands. When you find the command that you want to execute, you can use the left arrow and right arrow keys to position the cursor for editing. Other useful keys for editing include the Home, End, Backspace and Delete keys. Another way to use your command history is to execute the history command to view a numbered history list. The number listed to the left of the command can be used to execute the command again. The history command also has a number of options and arguments which can manipulate which commands will be stored or displayed.
4.7.1 Step 1 Execute a few commands and then execute the history command: date clear echo Hi history
Remember: The date command will print the time and date on the system. The clear command clears the screen. Your output should be similar to the following: sysadmin@localhost:~$ echo Hi
Hi sysadmin@localhost:~$ history
1 date 2 clear 3 echo Hi 4 history sysadmin@localhost:~$
Your command numbers will likely differ from those provided above. This is because you likely have executed a different number of commands.
4.7.2 Step 2 To view a limited number of commands, the history command can take a number as a parameter to display exactly that many recent entries. Type the following command to display the last five commands from your history: history 5
Your output should be similar to the following: sysadmin@localhost:~$ history 5
  185  false || Fail Or
  186  false || echo Fail Or
  187  true || echo Nothing to see here
  188  history
  189  history 5
sysadmin@localhost:~$
4.7.3 Step 3
To execute a command again, type the exclamation point and the history list number. For example, to execute the 94th command in your history list, you would execute the following: !94
4.7.4 Step 4 Next, experiment with accessing your history using the up arrow and down arrow keys. Keep pressing the up arrow key until you find a command you want to execute. If necessary, use other keys to edit the command and then press Enter to execute the command.
Chapter 5 5.1 Introduction If you ask users what feature of the Linux Operating System they most enjoy, many will answer "the power provided by the command line environment". This is because there are literally thousands of commands available with many options, making them powerful tools. However, with this power comes complexity. Complexity, in turn, can create confusion. As a result, knowing how to find help while working in Linux is an essential skill for any user. Referring to help allows you to be reminded of how a command works, as well as being an information resource when learning new commands.
5.2 man Pages As previously mentioned, UNIX was the operating system from which the Linux foundation was built. The developers of UNIX created help documents called man pages (man stands for manual). Man pages are used to describe the features of commands. They will provide you with a basic description of the purpose of the command, as well as provide details regarding the options of the command.
5.2.1 Viewing man pages To view a man page for a command, execute man command in a terminal window. For example, the command man cal will display the man page for the cal command:
CAL(1)                   BSD General Commands Manual                  CAL(1)
NAME
cal, ncal -- displays a calendar and the date of Easter
SYNOPSIS
cal [-3hjy] [-A number] [-B number] [[month] year]
cal [-3hj] [-A number] [-B number] -m month [year]
ncal [-3bhjJpwySM] [-A number] [-B number] [-s country_code] [[month] year]
ncal [-3bhJeoSM] [-A number] [-B number] [year]
ncal [-CN] [-H yyyy-mm-dd] [-d yyyy-mm]
DESCRIPTION
The cal utility displays a simple calendar in traditional format and ncal offers an alternative layout, more options and the date of Easter. The new format is a little cramped but it makes a year fit on a 25x80 terminal. If arguments are not specified, the current month is displayed.
The options are as follows:
-h      Turns off highlighting of today.
Manual page cal(1) line 1 (press h for help or q to quit)
5.2.2 Controlling the man Page Display The man command uses a "pager" to display documents. Normally this pager is the less command, but on some distributions it may be the more command. Both are very similar in how they perform and will be discussed in more detail in a later chapter. If you want to view the various movement commands that are available, you can type the letter h while viewing a man page. This will display a help page: Note: If you are working on a Linux distribution that uses the more command as a pager, your output will be different than the example shown here. SUMMARY OF LESS COMMANDS
Commands marked with * may be preceded by a number, N. Notes in parentheses indicate the behavior if N is given. A key preceded by a caret indicates the Ctrl key; thus ^K is ctrl-K.
  h  H                 Display this help.
  q  :q  Q  :Q  ZZ     Exit.
----------------------------------------------------------------------
                           MOVING
  e  ^E  j  ^N  CR  *  Forward  one line   (or N lines).
  y  ^Y  k  ^K  ^P  *  Backward one line   (or N lines).
  f  ^F  ^V  SPACE  *  Forward  one window (or N lines).
  b  ^B  ESC-v      *  Backward one window (or N lines).
  z                 *  Forward  one window (and set window to N).
  w                 *  Backward one window (and set window to N).
  ESC-SPACE         *  Forward  one window, but don't stop at end-of-file.
  d  ^D             *  Forward  one half-window (and set half-window to N).
  u  ^U             *  Backward one half-window (and set half-window to N).
  ESC-)  RightArrow *  Left  one half screen width (or N positions).
HELP -- Press RETURN for more, or q when done
If your distribution uses the less command, you might be a bit overwhelmed by the large number of "commands" that are available. The following table provides a summary of the more useful commands:
Command              Function
Return (or Enter)    Go down one line
Space                Go down one page
/term                Search for term
n                    Find next search item
1G                   Go to beginning
G                    Go to end
h                    Display help
q                    Quit man page
5.2.3 Sections of the man Page Man pages are broken into sections. Each section is designed to provide specific information about a command. While there are common sections that you will see in most man pages, some developers also create sections that you will only see in a specific man page. The following table describes some of the more common sections that you will find in man pages:
Section name     Purpose
NAME             Provides the name of the command and a very brief description.
SYNOPSIS         Provides examples of how the command is executed. See below for more information.
DESCRIPTION      Provides a more detailed description of the command.
OPTIONS          Lists the options for the command as well as a description of how they are used. Often this information will be found in the DESCRIPTION section and not in a separate OPTIONS section.
FILES            Lists the files that are associated with the command as well as a description of how they are used. These files may be used to configure the command's more advanced features. Often this information will be found in the DESCRIPTION section and not in a separate FILES section.
AUTHOR           The name of the person who created the man page and (sometimes) how to contact the person.
REPORTING BUGS   Provides details on how to report problems with the command.
COPYRIGHT        Provides basic copyright information.
SEE ALSO         Provides you with an idea of where you can find additional information. This also will often include other commands that are related to this command.
5.2.4 man Page SYNOPSIS Section The SYNOPSIS section of a man page can be difficult to understand, but is very important because it provides a concise example of how to use the command. For example, consider the SYNOPSIS of the man page for the cal command: SYNOPSIS cal [-3hjy] [-A number] [-B number] [[[day] month] year]
The square brackets [ ] are used to indicate that this feature is not required to run the command. For example, [-3hjy] means you can use the options -3, -h, -j or -y, but none of these options are required for the cal command to function properly. The second set of square brackets in the cal SYNOPSIS ( [[[day] month] year] ) demonstrates another feature; it means that you can specify a year by itself, but if you specify a month you must also specify a year. Additionally, if you specify a day then you also need to specify a month and a year. Another component of the SYNOPSIS that might cause some confusion can be seen in the SYNOPSIS of the date command: SYNOPSIS date [OPTION]... [+FORMAT] date [-u|--utc|--universal] [MMDDhhmm[[CC]YY][.ss]]
In this SYNOPSIS there are two syntaxes for the date command. The first one is used to display the date on the system while the second is used to set the date. The ellipsis (...) following [OPTION] indicates that one or more of the items before it may be used.
Additionally, the [-u|--utc|--universal] notation means that you can use either the -u option, the --utc option or the --universal option. Typically this means that all three options really do the same thing, but sometimes this format (use of the | character) is used to indicate that the options can't be used in combination, like a logical "or".
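To connect the first date syntax to real command lines, the sketch below shows three invocations that the SYNOPSIS permits. It is only an illustration: the +%Y format string and the variable names are choices made for this example, and only the short -u option is used because the long forms may not be available on every system.

```shell
# date [OPTION]... [+FORMAT]
date                      # every bracketed part omitted
year=$(date +%Y)          # only the optional +FORMAT argument
utc_year=$(date -u +%Y)   # one OPTION (-u) followed by +FORMAT
echo "local year: $year, UTC year: $utc_year"
```

Each line is valid because everything inside square brackets may be left out, and the ellipsis allows zero, one or several options before the format.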
5.2.5 Searching Within a man Page In order to search a man page for a term, press the / and type the term followed by the Enter key. The program will search from the current location down towards the bottom of the page to try to locate and highlight the term. If the term is not found, or you have reached the end of the matches, then the program will report Pattern not found (press Return) . If a match is found and you want to move to the next match of the term, press n. To return to a previous match of the term, press N.
5.2.6 man Pages Categorized by Sections Until now, we have been displaying man pages for commands. However, sometimes configuration files also have man pages. Configuration files (sometimes called system files) contain information that is used to store information about the Operating System or services. Additionally, there are several different types of commands (user commands, system commands, and administration commands) as well as other features that require documentation, such as libraries and Kernel components. As a result, there are thousands of man pages on a typical Linux distribution. To organize all of these man pages, the pages are categorized by sections, much like each individual man page is broken into sections. Consider This: By default there are nine default sections of man pages:
1. Executable programs or shell commands
2. System calls (functions provided by the kernel)
3. Library calls (functions within program libraries)
4. Special files (usually found in /dev)
5. File formats and conventions, e.g. /etc/passwd
6. Games
7. Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
8. System administration commands (usually only for root)
9. Kernel routines [Non standard]
When you use the man command, it searches each of these sections in order until it finds the first "match". For example, if you execute the command man cal, the first section (Executable programs or shell commands) is searched for a man page called cal. If not found, then the second section is searched. If no man page is found after searching all sections, you will receive an error message: sysadmin@localhost:~$ man zed
No manual entry for zed sysadmin@localhost:~$
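The "first match wins" search order can be pictured with a small shell function. This is a deliberately simplified sketch and not how the man command is actually implemented: the function name first_section is hypothetical, and it assumes pages are stored as NAME.N files in manN directories under /usr/share/man, ignoring the configured search path and any translated pages.

```shell
# first_section NAME [ROOT]
# Print the number of the first section (1-9) under ROOT that holds a page
# for NAME; return 1 if no section has one. ROOT defaults to /usr/share/man.
first_section() {
    name=$1
    root=${2:-/usr/share/man}
    for n in 1 2 3 4 5 6 7 8 9; do
        for page in "$root/man$n/$name.$n"*; do
            if [ -e "$page" ]; then
                echo "$n"
                return 0
            fi
        done
    done
    return 1
}

first_section passwd || echo "No manual entry for passwd"
```

Because the loop starts at section 1 and stops at the first hit, a name that exists in both section 1 and section 5 is reported as section 1.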
5.2.6.1 Determining Which Section To determine which section a specific man page belongs to, look at the numeric value on the first line of the output of the man page. For example, if you execute the command man cal, you will see that the cal command belongs to the first section of man pages:
CAL(1)                   BSD General Commands Manual                  CAL(1)
5.2.6.2 Specifying a Section In some cases you will need to specify the section in order to display the correct man page. This is necessary because sometimes there will be man pages with the same name in different sections. For example, there is a command called passwd that allows you to change your password. There is also a file called passwd that stores account information. Both the command and the file have a man page. The passwd command is a "user" command, so the command man passwd will display the man page for the passwd command by default:
PASSWD(1)                      User Commands                      PASSWD(1)
To specify a different section, provide the number of the section as the first argument of the man command. For example, the command man 5 passwd will look for the passwd man page just in section 5:
PASSWD(5)               File Formats and Conversions               PASSWD(5)
5.2.6.3 Searching Sections
Sometimes it isn't clear what section a man page is stored in. In cases like this, you can search for a man page by name. The -f option to the man command will display man pages that match, or partially match, a specific name and provide a brief description of each man page: sysadmin@localhost:~$ man -f passwd
passwd (5)           - the password file
passwd (1)           - change user password
passwd (1ssl)        - compute password hashes
sysadmin@localhost:~$
Note that on most Linux distributions, the whatis command does the same thing as man -f. On those distributions, both will produce the same output.
5.2.7 Searching man Pages by Keyword Unfortunately, you won't always remember the exact name of the man page that you want to view. In these cases you can search for man pages that match a keyword by using the -k option to the man command. For example, what if you knew you wanted a man page that displays how to change your password, but you didn't remember the exact name? You could run the command man -k passwd:
sysadmin@localhost:~$ man -k passwd
chgpasswd (8)        - update group passwords in batch mode
chpasswd (8)         - update passwords in batch mode
fgetpwent_r (3)      - get passwd file entry reentrantly
getpwent_r (3)       - get passwd file entry reentrantly
gpasswd (1)          - administer /etc/group and /etc/gshadow
pam_localuser (8)    - require users to be listed in /etc/passwd
passwd (1)           - change user password
passwd (1ssl)        - compute password hashes
passwd (5)           - the password file
passwd2des (3)       - RFS password encryption
update-passwd (8)    - safely update /etc/passwd, /etc/shadow and /etc/group
sysadmin@localhost:~$
When you use this option, you may end up with a large amount of output. The preceding command, for example, provided over 60 results. Recall that there are thousands of man pages, so when you search for a keyword, be as specific as possible. Using a generic word, such as "the", could result in hundreds or even thousands of results. Note that on most Linux distributions, the apropos command does the same thing as man -k. On those distributions, both will produce the same output.
5.3 info Command Man pages are great sources of information, but they do tend to have a few disadvantages. One example of a disadvantage is that each man page is a separate document, not related to any other man page. While some man pages have a SEE ALSO section that may refer to other man pages, they really tend to be unrelated sources of documentation. The info command also provides documentation on operating system commands and features. The goal of this command is slightly different from man pages: to provide a documentation resource that provides a logical organizational structure, making reading documentation easier. Within info documents, information is broken down into categories that work much like a table of contents that you would find in a book. Hyperlinks are provided to pages with information on individual topics for a specific command or feature. In fact, all of the documentation is merged into a single "book" in which you can go to the top level of documentation and view the table of contents representing all of the documentation available. Another advantage of info over man pages is that the writing style of info documents is typically more conducive to learning a topic. Consider man pages to be more of a reference resource and info documents to be more of a learning guide.
5.3.1 Displaying Info Documentation for a Command To display the info documentation for a command, execute info command (replace command with the name of the command that you are seeking information about). For example, the following demonstrates the output of the command info ls:
File: coreutils.info,  Node: ls invocation,  Next: dir invocation,  Up: Directory listing
10.1 `ls': List directory contents
==================================
The `ls' program lists information about files (of any type, including directories). Options and file arguments can be intermixed arbitrarily, as usual.
For non-option command-line arguments that are directories, by default `ls' lists the contents of directories, not recursively, and omitting files with names beginning with `.'. For other non-option arguments, by default `ls' lists just the file name. If no non-option argument is specified, `ls' operates on the current directory, acting as if it had been invoked with a single argument of `.'.
By default, the output is sorted alphabetically, according to the locale settings in effect.(1) If standard output is a terminal, the output is in columns (sorted vertically) and control characters are output as question marks; otherwise, the output is listed one per line and control characters are output as-is.
--zz-Info: (coreutils.info.gz)ls invocation, 58 lines --Top-----------
Welcome to Info version 5.2. Type h for help, m for menu item.
Notice that the first line provides some information that tells you where you are in the info documentation. This documentation is broken up into nodes and in the example above you are currently in the ls invocation node. If you went to the next node (like going to the next chapter in a book), you would be in the dir invocation node. If you went up one level you would be in the Directory listing node.
5.3.2 Moving Around While Viewing an info Document Like the man command, you can get a listing of movement commands by typing the letter h while reading the info documentation:
Basic Info command keys
l       Close this help window.
q       Quit Info altogether.
H       Invoke the Info tutorial.
Up      Move up one line.
Down    Move down one line.
DEL     Scroll backward one screenful.
SPC     Scroll forward one screenful.
Home    Go to the beginning of this node.
End     Go to the end of this node.
TAB     Skip to the next hypertext link.
RET     Follow the hypertext link under the cursor.
l       Go back to the last node seen in this window.
[       Go to the previous node in the document.
]       Go to the next node in the document.
p       Go to the previous node on this level.
n       Go to the next node on this level.
u       Go up one level.
-----Info: *Info Help*, 466 lines --Top--------------------------------
Note that if you want to close the help screen, you type the letter l. This brings you back to your document and allows you to continue reading. To quit entirely, you type the letter q. The following table provides a summary of useful commands:
Command        Function
Down arrow ↓   Go down one line
Space          Go down one page
s              Search for term
[              Go to previous node
]              Go to next node
u              Go up one level
TAB            Skip to next hyperlink
HOME           Go to beginning
END            Go to end
h              Display help
l              Quit help page
q              Quit info command
If you scroll through the document, you will eventually see the menu for the ls command:
* Menu:
* Which files are listed::
* What information is listed::
* Sorting the output::
* Details about version sort::
* General output formatting::
* Formatting file timestamps::
* Formatting the file names::
---------- Footnotes ----------
(1) If you use a non-POSIX locale (e.g., by setting `LC_ALL' to `en_US'), then `ls' may produce output that is sorted differently than you're accustomed to. In that case, set the `LC_ALL' environment variable to `C'.
--zz-Info: (coreutils.info.gz)ls invocation, 58 lines --Bot------------
The items under the menu are hyperlinks that can take you to nodes that describe more about the ls command. For example, if you placed your cursor on the line "* Sorting the output::" and pressed the Enter key, you would be taken to a node that describes sorting the output of the ls command:
File: coreutils.info,  Node: Sorting the output,  Next: Details about version sort,  Prev: What information is listed,  Up: ls invocation
10.1.3 Sorting the output
-------------------------
These options change the order in which `ls' sorts the information it outputs. By default, sorting is done by character code (e.g., ASCII order).
`-c'
`--time=ctime'
`--time=status'
     If the long listing format (e.g., `-l', `-o') is being used, print the status change time (the `ctime' in the inode) instead of the modification time. When explicitly sorting by time (`--sort=time' or `-t') or when not using a long listing format, sort according to the status change time.
`-f'
     Primarily, like `-U'--do not sort; list the files in whatever order they are stored in the directory. But also enable `-a' (lis
--zz-Info: (coreutils.info.gz)Sorting the output, 68 lines --Top-------
Note that by going into the node about sorting, you essentially went into a sub-node of the one in which you originally started. To go back to your previous node, you can use the u key. While u will take you to the start of the node one level up, you could also use the l key to return to the exact location where you were before entering the sorting node.
5.3.3 Exploring info Documentation Instead of using info documentation to look up information about a specific command or feature, consider exploring the capabilities of Linux by reading through the info documentation. If you execute the info command without any arguments, you are taken to the top level of the documentation. From there you can explore many features:
File: dir,      Node: Top       This is the top of the INFO tree
  This (the Directory node) gives a menu of major topics. Typing "q" exits, "?" lists all Info commands, "d" returns here, "h" gives a primer for first-timers, "mEmacs<Return>" visits the Emacs manual, etc.
  In Emacs, you can click mouse button 2 on a menu item or cross reference to select it.
* Menu:
Basics
* Common options: (coreutils)Common options.
* Coreutils: (coreutils).       Core GNU (file, text, shell) utilities.
* Date input formats: (coreutils)Date input formats.
* File permissions: (coreutils)File permissions.        Access modes.
* Finding files: (find).        Operating on files matching certain criteria.
C++ libraries
* autosprintf: (autosprintf).   Support for printf format strings in C+
-----Info: (dir)Top, 211 lines --Top-----------------------------------
Welcome to Info version 5.2. Type h for help, m for menu item.
5.4 Additional Sources of Help In many cases, you will find that either man pages or info documentation will provide you with the answers you need. However, in some cases, you may need to look in other locations.
5.4.1 Using the --help Option Many commands will provide you basic information, very similar to the SYNOPSIS found in man pages, when you apply the --help option to the command. This is useful to learn the basic usage of a command:
sysadmin@localhost:~$ ps --help
********* simple selection *********  ********* selection by list *********
-A all processes                      -C by command name
-N negate selection                   -G by real group ID (supports names)
-a all w/ tty except session leaders  -U by real user ID (supports names)
-d all except session leaders         -g by session OR by effective group name
-e all processes                      -p by process ID
T  all processes on this terminal     -s processes in the sessions given
a  all w/ tty, including other users  -t by tty
g  OBSOLETE -- DO NOT USE             -u by effective user ID (supports names)
r  only running processes             U  processes for specified users
x  processes w/o controlling ttys     t  by tty
*********** output format **********  *********** long options ***********
-o,o user-defined  -f full            --Group --User --pid --cols --ppid
-j,j job control   s  signal          --group --user --sid --rows --info
-O,O preloaded -o  v  virtual memory  --cumulative --format --deselect
-l,l long          u  user-oriented   --sort --tty --forest --version
-F   extra full    X  registers       --heading --no-headi
                    ********* misc options *********
-V,V  show version      L  list format codes  f  ASCII art forest
-m,m,-L,-T,H  threads   S  children in sum    -y change -l format
-M,Z  security data     c  true command name  -c scheduling class
-w,w  wide output       n  numeric WCHAN,UID  -H process hierarchy
sysadmin@localhost:~$
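Because --help output can be long, it is often convenient to look at just the first few lines. The function below is one possible sketch of that habit; the name show_usage and the ten-line limit are arbitrary choices for this example, and it assumes the command accepts --help, which many but not all commands do.

```shell
# show_usage COMMAND
# Print the first ten lines of COMMAND's --help text.
# Error output is included because some commands print their usage there.
show_usage() {
    "$1" --help 2>&1 | head -n 10
}

show_usage ps
```

If a command does not recognize --help, the output will typically be an error message, in which case the man page or info documentation is the next place to look.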
5.4.2 Additional System Documentation On most systems, there is a directory where additional documentation is found. This will often be a place where vendors who create additional (third party) software can store documentation files. Typically, this will be a place where system administrators will go to learn how to set up more complex software services. However, sometimes regular users will also find this documentation to be useful. These documentation files are often called "readme" files, since the files typically have names such as README or readme.txt. The location of these files can vary depending on the distribution that you are using. Typical locations include /usr/share/doc and /usr/doc.
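One way to see which readme files are present is to search the documentation directory by name. The following is a rough sketch under the assumption that the files live under /usr/share/doc (one of the typical locations named above); the function name list_readmes is invented for this example.

```shell
# list_readmes [DIR]
# List regular files whose names start with "readme" in any mix of upper
# and lower case. DIR defaults to /usr/share/doc.
list_readmes() {
    find "${1:-/usr/share/doc}" -type f -name '[Rr][Ee][Aa][Dd][Mm][Ee]*' 2>/dev/null
}

list_readmes | head -n 5
```

On a distribution that uses /usr/doc instead, the directory can be passed as an argument, for example list_readmes /usr/doc.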
5.5 Finding Commands and Documentation Recall that the whatis command (or man -f) will tell you which section a man page is stored in. If you use this command often enough, you will likely come across an unusual output, such as the following:
sysadmin@localhost:~$ whatis ls
ls (1)               - list directory contents
ls (1p)              - list directory contents
sysadmin@localhost:~$
Based on this output, there are two commands that list directory contents. The simple answer to why there are two ls commands is that UNIX had two main variants, which resulted in some commands being developed "in parallel". This resulted in some commands behaving differently on different variants of UNIX. Many modern distributions of Linux include commands from both UNIX variants. This does, however, pose a bit of a problem: when you run the ls command, which command is executed? The focus of the next few sections will be to answer this question as well as to provide you with the tools to find where these files reside on the system.
5.5.1 Where Are These Commands Located? To search for the location of a command or the man pages for a command, use the whereis command. This command searches for commands, source files and man pages in specific locations where these files are typically stored: sysadmin@localhost:~$ whereis ls
ls: /bin/ls /usr/share/man/man1p/ls.1.gz /usr/share/man/man1/ls.1.gz sysadmin@localhost:~$
Man pages are normally easy to distinguish from commands because they are usually compressed with a command called gzip, resulting in a filename that ends in .gz. Interestingly, there are two man pages listed, but only one command (/bin/ls). This is because the ls command can be used with the options/features that are described by either man page. So, when you are learning what you can do with the ls command, you can explore both man pages. Fortunately, this is more of an exception, as most commands only have one man page.
5.5.2 Find Any File or Directory The whereis command is designed to specifically find commands and man pages. While this is useful, there are times where you want to find a file or directory, not just files that are commands or man pages. To find any file or directory, you can use the locate command. This command will search a database of all files and directories that were on the system when the database was created. Typically, the command to generate this database is run nightly.
sysadmin@localhost:~$ locate gshadow
/etc/gshadow
/etc/gshadow-
/usr/include/gshadow.h
/usr/share/man/cs/man5/gshadow.5.gz
/usr/share/man/da/man5/gshadow.5.gz
/usr/share/man/de/man5/gshadow.5.gz
/usr/share/man/fr/man5/gshadow.5.gz
/usr/share/man/it/man5/gshadow.5.gz
/usr/share/man/man5/gshadow.5.gz
/usr/share/man/ru/man5/gshadow.5.gz
/usr/share/man/sv/man5/gshadow.5.gz
/usr/share/man/zh_CN/man5/gshadow.5.gz
sysadmin@localhost:~$
Any files that you created today will not normally be searchable with the locate command. If you have access to the system as the root user (the system administrator account), you can manually update the locate database by running the updatedb command. Regular users cannot update the database file. Also note that when you use the locate command as a regular user, your output may be limited due to file permissions. Essentially, if you don't have access to a file or directory on the filesystem due to permissions, the locate command won't return those names. This is a security feature designed to keep users from "exploring" the filesystem by using the locate database. The root user can search for any file in the locate database.
5.5.3 Count the Number of Files The output of the locate command can be quite large. When you search for a filename, such as passwd, the locate command will produce every file that contains the string passwd, not just files named passwd. In many cases, you may want to start by listing how many files will match. You can do this by using the -c option to the locate command: sysadmin@localhost:~$ locate -c passwd
97 sysadmin@localhost:~$
5.5.4 Limiting the Output You can limit the output produced by the locate command by using the -b option. This option will only include listings that have the search term in the basename of the filename. The basename is the portion of the filename not including the directory names. sysadmin@localhost:~$ locate -c -b passwd
83 sysadmin@localhost:~$
As you can see from the previous output, there will still be many results when you use the -b option. To limit the output even further, place a \ character in front of the search term. This character limits the output to filenames that exactly match the term:
sysadmin@localhost:~$ locate -b "\passwd"
/etc/passwd /etc/cron.daily/passwd /etc/pam.d/passwd /usr/bin/passwd /usr/share/doc/passwd /usr/share/lintian/overrides/passwd sysadmin@localhost:~$
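To make the idea of a basename concrete, the following minimal sketch uses the standard basename and dirname utilities on one of the paths listed above. These utilities are not part of the locate command. They are used here only to show which portion of a path the -b option compares against.

```shell
path=/etc/cron.daily/passwd

# basename keeps only the final component of the path;
# dirname keeps everything in front of it.
base=$(basename "$path")
dir=$(dirname "$path")

echo "basename:  $base"
echo "directory: $dir"
```

Here the basename is passwd and the directory portion is /etc/cron.daily, which is why this path is still listed when the search is limited to the basename.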
Lab 5
5.1 Introduction
This is Lab 5: Getting Help. By performing this lab, students will learn how to get help on commands and find files. In this lab, you will perform the following tasks:
Use several help systems to get help for commands.
Learn how to locate commands.
5.2 Getting Help
In this task, you will explore how to get help. This will be very useful when you find yourself stuck or when you can't remember how a command works. In addition to Internet searches, the Linux Operating System provides a variety of techniques to learn more about a given command or feature. Knowing these different techniques will allow you to more easily and quickly find the answer you need.
5.2.1 Step 1 Execute commands in the bash shell by typing the command and then pressing the Enter key. For example, type the following command to display today's date:
date
The output should be similar to the following: sysadmin@localhost:~$ date
Tue Jan 19 17:27:20 UTC 2016
sysadmin@localhost:~$
5.2.2 Step 2 To learn more about commands, access the manual page for the command with the man command. For example, execute the following command to learn more about the date command: man date sysadmin@localhost:~$ man date
Your output should be similar to the following:
DATE(1)                          User Commands                         DATE(1)

NAME
       date - print or set the system date and time

SYNOPSIS
       date [OPTION]... [+FORMAT]
       date [-u|--utc|--universal] [MMDDhhmm[[CC]YY][.ss]]

DESCRIPTION
       Display the current time in the given FORMAT, or set the system date.

       -d, --date=STRING
              display time described by STRING, not `now'

       -f, --file=DATEFILE
              like --date once for each line of DATEFILE

       -r, --reference=FILE
              display the last modification time of FILE

       -R, --rfc-2822
              output date and time in RFC 2822 format.  Example: Mon, 07 Aug
 Manual page date(1) line 1 (press h for help or q to quit)
Note: Documents that are displayed with the man command are called "Man Pages".
If the man command can find the manual page for the argument provided, then that manual page will be displayed using a command called less. The following table describes useful keys that can be used with the less command to control the output of the display:
Key                              Purpose
H or h                           Display the help
Q or q                           Quit the help or manual page
Spacebar or f or PageDown        Move a screen forward
b or PageUp                      Move a screen backward
Enter or down arrow              Move down one line
Up arrow                         Move up one line
/ followed by text to search     Start searching forward
? followed by text to search     Start searching backward
n                                Move to next text that matches search
N                                Move to previous matching text
5.2.3 Step 3
Type the letter h to see a list of movement commands. After reading the movement commands, type the letter q to get back to the document.
                   SUMMARY OF LESS COMMANDS

      Commands marked with * may be preceded by a number, N.
      Notes in parentheses indicate the behavior if N is given.
      A key preceded by a caret indicates the Ctrl key; thus ^K is ctrl-K.

  h  H                 Display this help.
  q  :q  Q  :Q  ZZ     Exit.
 -----------------------------------------------------------------------

                           MOVING

  e  ^E  j  ^N  CR  *  Forward  one line   (or N lines).
  y  ^Y  k  ^K  ^P  *  Backward one line   (or N lines).
  f  ^F  ^V  SPACE  *  Forward  one window (or N lines).
  b  ^B  ESC-v      *  Backward one window (or N lines).
  z                 *  Forward  one window (and set window to N).
  w                 *  Backward one window (and set window to N).
  ESC-SPACE         *  Forward  one window, but don't stop at end-of-file.
  d  ^D             *  Forward  one half-window (and set half-window to N)
  u  ^U             *  Backward one half-window (and set half-window to N)
  ESC-)  RightArrow *  Left  one half screen width (or N positions).
HELP -- Press RETURN for more, or q when done
Note that the man pages might be a bit of a mystery to you now, but as you learn more about Linux, you will find they are a very valuable resource.
5.2.4 Step 4
Searches are not case sensitive and do not "wrap" around from the bottom to top, or vice versa. Start a forward search for the word "file" by typing:
/file
Note that what you are typing will appear at the bottom left portion of the screen.
       -r, --reference=FILE
              display the last modification time of FILE

       -R, --rfc-2822
              output date and time in RFC 2822 format.  Example: Mon, 07 Aug
/file
5.2.5 Step 5
Notice that the text matching the search is highlighted. You can move forward to the next match by pressing n. Also try moving backwards through the matches by pressing N:
       -f, --file=DATEFILE
              like --date once for each line of DATEFILE

       -r, --reference=FILE
              display the last modification time of FILE

       -R, --rfc-2822
              output date and time in RFC 2822 format.  Example: Mon, 07 Aug
              2006 12:34:56 -0600

       --rfc-3339=TIMESPEC
              output date and time in RFC 3339 format.  TIMESPEC=`date',
              `seconds', or `ns' for date and time to the indicated precision.
              Date and time components are separated by a single space:
              2006-08-07 12:34:56-06:00

       -s, --set=STRING
              set time described by STRING

       -u, --utc, --universal
              print or set Coordinated Universal Time

       --help display this help and exit
 Manual page date(1) line 18/204 24% (press h for help or q to quit)
5.2.6 Step 6 Use the movement commands previously described (such as using the spacebar to move down one screen) to read the man page for the date command. When you are finished reading, type q to exit the man page.
5.2.7 Step 7
In some cases you may not remember the exact name of the command. In these cases you can use the -k option to the man command and provide a keyword argument. For example, execute the following command to display a summary of all man pages that have the keyword "password" in the description:
man -k password
sysadmin@localhost:~$ man -k password
chage (1)            - change user password expiry information
chgpasswd (8)        - update group passwords in batch mode
chpasswd (8)         - update passwords in batch mode
cpgr (8)             - copy with locking the given file to the password or gr...
cppw (8)             - copy with locking the given file to the password or gr...
expiry (1)           - check and enforce password expiration policy
login.defs (5)       - shadow password suite configuration
pam_pwhistory (8)    - PAM module to remember last passwords
pam_unix (8)         - Module for traditional password authentication
passwd (1)           - change user password
passwd (1ssl)        - compute password hashes
passwd (5)           - the password file
pwck (8)             - verify integrity of password files
pwconv (8)           - convert to and from shadow passwords and groups
shadow (5)           - shadowed password file
shadowconfig (8)     - toggle shadow passwords on and off
unix_chkpwd (8)      - Helper binary that verifies the password of the curren...
unix_update (8)      - Helper binary that updates the password of a given user
vipw (8)             - edit the password, group, shadow-password or shadow-gr...
sysadmin@localhost:~$
The -k option to the man command will often produce a huge amount of output. You will learn a technique in a later lab to either limit this output or allow you to easily scroll through the data. For now, just use the scrollbar on the right hand side of the terminal window to move the display up and down as needed.
5.2.8 Step 8
Note that the apropos command is another way of viewing man page summaries with a keyword. Type the following command:
apropos password
sysadmin@localhost:~$ apropos password
chage (1)            - change user password expiry information
chgpasswd (8)        - update group passwords in batch mode
chpasswd (8)         - update passwords in batch mode
cpgr (8)             - copy with locking the given file to the password or gr...
cppw (8)             - copy with locking the given file to the password or gr...
expiry (1)           - check and enforce password expiration policy
login.defs (5)       - shadow password suite configuration
pam_pwhistory (8)    - PAM module to remember last passwords
pam_unix (8)         - Module for traditional password authentication
passwd (1)           - change user password
passwd (1ssl)        - compute password hashes
passwd (5)           - the password file
pwck (8)             - verify integrity of password files
pwconv (8)           - convert to and from shadow passwords and groups
shadow (5)           - shadowed password file
shadowconfig (8)     - toggle shadow passwords on and off
unix_chkpwd (8)      - Helper binary that verifies the password of the curren...
unix_update (8)      - Helper binary that updates the password of a given user
vipw (8)             - edit the password, group, shadow-password or shadow-gr...
sysadmin@localhost:~$
Note: There is no difference between man -k and the apropos command.
5.2.9 Step 9
There are often multiple man pages with the same name. For example, the previous command showed three pages for passwd. Execute the following command to view the man pages for the word passwd:
man -f passwd
sysadmin@localhost:~$ man -f passwd
passwd (5)           - the password file
passwd (1)           - change user password
passwd (1ssl)        - compute password hashes
sysadmin@localhost:~$
The fact that there are different man pages for the same "name" is confusing for many beginning Linux users. Man pages are not just for Linux commands, but also for system files and other "features" of the Operating System. Additionally, there will sometimes be two commands with the same name, as in the example provided above. The different man pages are distinguished by "sections". By default there are nine sections of man pages:
1. Executable programs or shell commands
2. System calls (functions provided by the kernel)
3. Library calls (functions within program libraries)
4. Special files (usually found in /dev)
5. File formats and conventions, e.g. /etc/passwd
6. Games
7. Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
8. System administration commands (usually only for root)
9. Kernel routines [Non standard]
When you type a command such as man passwd, the sections are searched in a predefined order, starting with section 1, and the first match found is displayed. The man -f passwd command that you previously executed shows that there is a section 1 man page for passwd: passwd (1). As a result, that is the one that is displayed by default.
5.2.10 Step 10
To display a man page for a different section, provide the section number as the first argument to the man command. For example, execute the following command:
man 5 passwd
PASSWD(5)                 File Formats and Conversions                PASSWD(5)

NAME
       passwd - the password file

DESCRIPTION
       /etc/passwd contains one line for each user account, with seven fields
       delimited by colons (":"). These fields are:

       o   login name
       o   optional encrypted password
       o   numerical user ID
       o   numerical group ID
       o   user name or comment field
       o   user home directory
       o   optional user command interpreter
 Manual page passwd(5) line 1 (press h for help or q to quit)
5.2.11 Step 11
Instead of using man -f to display all man page sections for a name, you can also use the whatis command:
whatis passwd
sysadmin@localhost:~$ whatis passwd
passwd (5)           - the password file
passwd (1)           - change user password
passwd (1ssl)        - compute password hashes
sysadmin@localhost:~$
Note: There is no difference between man -f and the whatis command.
5.2.12 Step 12
Almost all system features (commands, system files, etc.) have man pages. Some of these features also have a more advanced feature called info pages. For example, execute the following command:
info date
File: coreutils.info,  Node: date invocation,  Next: arch invocation,  Up: System context

21.1 `date': Print or set system date and time
==============================================

Synopses:

     date [OPTION]... [+FORMAT]
     date [-u|--utc|--universal] [ MMDDhhmm[[CC]YY][.ss] ]

   Invoking `date' with no FORMAT argument is equivalent to invoking it
with a default format that depends on the `LC_TIME' locale category.
In the default C locale, this format is `'+%a %b %e %H:%M:%S %Z %Y'',
so the output looks like `Thu Mar  3 13:47:51 PST 2005'.

   Normally, `date' uses the time zone rules indicated by the `TZ'
environment variable, or the system default rules if `TZ' is not set.
*Note Specifying the Time Zone with `TZ': (libc)TZ Variable.

   If given an argument that starts with a `+', `date' prints the
current date and time (or the date and time specified by the `--date'
--zz-Info: (coreutils.info.gz)date invocation, 41 lines --Top-----------------
Welcome to Info version 4.13. Type h for help, m for menu item.
Many beginning Linux users find info pages to be easier to read. They are often written more like "lessons" while man pages are written purely as documentation.
5.2.13 Step 13 While viewing the info page from the previous step, type the letter h to see a list of movement commands. Note that they are different from the movement commands used in man pages. After reading the movement commands, type the letter l (lowercase L) to return to viewing the document.
5.2.14 Step 14 Use the movement commands to read the info page for the date command. When you are done, put your cursor anywhere on the line that reads *Examples of date:: and then press the Enter key. A new document will be displayed that shows examples of date.
5.2.15 Step 15
Type the l key to return to the previous screen. When you are finished reading, type q to exit the info page.
5.2.16 Step 16 Another way of getting help is by using the --help option to a command. Most commands allow you to pass an argument of --help to view basic command usage: date --help sysadmin@localhost:~$ date --help
Usage: date [OPTION]... [+FORMAT]
  or:  date [-u|--utc|--universal] [MMDDhhmm[[CC]YY][.ss]]
Display the current time in the given FORMAT, or set the system date.
-d, --date=STRING
display time described by STRING, not `now'
-f, --file=DATEFILE
like --date once for each line of DATEFILE
-r, --reference=FILE
display the last modification time of FILE
-R, --rfc-2822
output date and time in RFC 2822 format. Example: Mon, 07 Aug 2006 12:34:56 -0600
--rfc-3339=TIMESPEC
output date and time in RFC 3339 format. TIMESPEC=`date', `seconds', or `ns' for date and time to the indicated precision. Date and time components are separated by a single space: 2006-08-07 12:34:56-06:00
-s, --set=STRING
set time described by STRING
-u, --utc, --universal
print or set Coordinated Universal Time
--help
display this help and exit
--version
output version information and exit
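The options listed in the help output can be combined with a +FORMAT argument. The following is a minimal sketch, assuming the GNU version of date that the help text above describes. The format strings are illustrative choices and do not come from the lab. The fixed timestamp is the one used in the help text examples.

```shell
# -u works in UTC, -d parses the given STRING instead of "now",
# and the +FORMAT argument controls how the result is printed.
stamp=$(date -u -d '2006-08-07 12:34:56' '+%Y-%m-%d %H:%M:%S')
echo "$stamp"

# The same idea applied to the current time (output changes on every run).
date -u '+%A, %d %B %Y'
```

The first command prints 2006-08-07 12:34:56, because the string given to -d is both read and displayed in UTC.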
5.2.17 Step 17
Some system features also have more detailed help documents located in the /usr/share/doc directory structure. Execute the following command to view the contents of this directory:
ls /usr/share/doc
sysadmin@localhost:~$ ls /usr/share/doc
adduser             libdrm2                libx11-data
apt                 libedit2               libxau6
ascii               libelf1                libxcb1
base-files          libffi6                libxdmcp6
base-passwd         libgcc1                libxext6
bash                libgcrypt11            libxml2
bind9               libgdbm3               libxmuu1
bind9-host          libgeoip1              locales
bind9utils          libgettextpo0          login
bsdmainutils        libglib2.0-0           logrotate
bsdutils            libgnutls26            lsb-base
busybox-initramfs   libgomp1               makedev
bzip2               libgpg-error0          man-db
ca-certificates     libgpm2                mawk
coreutils           libgssapi-krb5-2       mc
cpio                libgssapi3-heimdal     mc-data
cron                libhcrypto4-heimdal    mime-support
curl                libheimbase1-heimdal   mlocate
dash                libheimntlm0-heimdal   module-init-tools
Note that in almost all cases, the man pages and info pages will provide you with the information that you need. However, if you need more in-depth information (something that system administrators sometimes need), then you may find this information in the files located in the /usr/share/doc directory.
5.3 Finding Files In this task, we will explore how to search for a file on the system. This is useful to know in situations when you can't find a file on the system, either one that you created or one that was created by someone else.
5.3.1 Step 1 An easy way to search for a file is to use the locate command. For example, you can find the location of the crontab file by executing the following command: locate crontab sysadmin@localhost:~$ locate crontab
/etc/crontab /usr/bin/crontab /usr/share/doc/cron/examples/crontab2english.pl /usr/share/man/man1/crontab.1.gz
/usr/share/man/man5/crontab.5.gz sysadmin@localhost:~$
5.3.2 Step 2 Note that the output from the previous example includes files that have crontab as part of their name. To find files that are just named crontab, use the following command: locate -b "\crontab" sysadmin@localhost:~$ locate -b "\crontab"
/etc/crontab /usr/bin/crontab sysadmin@localhost:~$
Note: The locate command makes use of a database that is traditionally updated once per day (normally in the middle of the night). This database contains a list of all files that were on the system when the database was last updated. As a result, any files that you created today will not normally be searchable with the locate command. If you have access to the system as the root user (the system administrator account), you can manually update this database by running the updatedb command. Regular users cannot update the database file.
Another possible solution to searching for "newer" files is to make use of the find command. This command searches the live filesystem, rather than a static database. The find command isn't part of the Linux Essentials objectives for this lab, so it is only mentioned here. Execute man find if you want to explore this command on your own or wait for the lab that explores the find command.
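As a brief preview of the difference described in the note, the following sketch runs find against a scratch directory. The directory layout and file names are made up for illustration, and only the widely available -name test is used. Files created a moment earlier are found immediately because find walks the live directory tree instead of reading a database.

```shell
# Scratch directory standing in for a real filesystem (hypothetical names).
search_root=$(mktemp -d)
mkdir -p "$search_root/etc" "$search_root/usr/bin"
touch "$search_root/etc/crontab" \
      "$search_root/usr/bin/crontab" \
      "$search_root/etc/crontab.bak"

# -name compares against the basename only, so crontab.bak is not listed.
find "$search_root" -name crontab

# Count the matches, similar in spirit to locate -c.
match_count=$(find "$search_root" -name crontab | wc -l)
echo "matches: $match_count"

rm -r "$search_root"
```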
5.3.3 Step 3
You may just want to find where a command (or its man pages) is located. This can be accomplished with the whereis command:
whereis passwd
sysadmin@localhost:~$ whereis passwd
passwd: /usr/bin/passwd /etc/passwd /usr/share/man/man1/passwd.1.gz /usr/share/man/man1/passwd.1ssl.gz /usr/share/man/man5/passwd.5.gz
sysadmin@localhost:~$
The whereis command only searches for commands and man pages, not just any file.
Recall from earlier that there is more than one passwd man page on the system. This is why you see multiple file names and man pages (the files that end in .gz are man pages) when you execute the previous command.
Chapter 6
6.1 Introduction
When working in a Linux Operating System, you will need to know how to manipulate files and directories. Some Linux distributions have GUI-based applications that allow you to manage files, but it is important to know how to perform these operations via the command line. The command line features a rich collection of commands that allow you to manage files. In this chapter you will learn how to list files in a directory as well as how to copy, move and delete files. The core concepts taught in this chapter will be expanded in later chapters as more file manipulation commands are covered, such as how to view files, compress files and set file permissions.
6.2 Understanding Files and Directories
Files are used to store data such as text, graphics and programs. Directories (AKA, "folders") are used to provide a hierarchical organization structure. This structure is somewhat different than what you might be used to if you have previously worked on Microsoft Windows systems. On a Windows system, the top level of the directory structure is called My Computer. Each physical device (hard drive, DVD drive, USB thumb drive, network drive, etc.) shows up under My Computer, each assigned a drive letter, such as C: or D:. A visual representation of this structure:
The directory structures shown below are provided as examples only. These directories may not be present within the virtual machine environment of this course.
Like Windows, a Linux directory structure has a top level; however, it is not called My Computer, but rather the root directory, and it is symbolized by the / character. There are also no drives in Linux; each physical device is accessible under a directory, not a drive letter. A visual representation of a typical Linux directory structure:
This directory structure is called the filesystem by most Linux users.
To view the root filesystem, type ls /:
sysadmin@localhost:~$ ls /
bin   dev  home  lib    media  opt   root  sbin     selinux  sys  usr
boot  etc  init  lib64  mnt    proc  run   sbin???  srv      tmp  var
Notice that there are many descriptive directories including /boot, which contains files to boot the computer.
6.2.1 Directory Path
Using the graphic in the previous section as a point of reference, you will see that there is a directory named sound under a directory named etc, which is under the / directory. An easier way to say this is to refer to the path.
Consider This: The /etc directory originally stood for "et cetera" in early documentation from Bell Labs and used to contain files that did not belong elsewhere. In modern Linux distributions, the /etc directory typically holds static configuration files as defined by the Filesystem Hierarchy Standard (FHS).
A path allows you to specify the exact location of a directory. For the sound directory, the path would be /etc/sound. The first / character represents the root directory, while each other / character is used to separate the directory names. This sort of path is called an absolute path. With an absolute path, you always provide directions to a directory (or a file) starting from the top of the directory structure, the root directory. Later in this chapter, we will cover a different sort of path called a relative path.
Note: The directory structures shown below are provided as examples only. These directories may not be present within the virtual machine environment of this course.
The following graphic demonstrates three additional absolute paths:
6.2.2 Home Directory The term home directory often causes confusion to beginning Linux users. To begin with, on most Linux distributions there is a directory called home under the root directory: /home. Under this /home directory there will be a directory for each user on the system. The directory name will be the same as the name of the user, so a user named "bob" would have a home directory called /home/bob. Your home directory is a very important directory. To begin with, when you open a shell, you should automatically be placed in your home directory, as this is where you will do most of your work. Additionally, your home directory is one of the few directories where you have the full control to create and delete additional files and directories. Most other directories in a Linux filesystem are protected with file permissions, a topic that will be covered in detail in a later chapter.
On most Linux distributions, the only users who can access any files in your home directory are you and the administrator on the system (the root user). This can be changed by using file permissions. Your home directory even has a special symbol that you can use to represent it: ~. If your home directory is /home/sysadmin, you can just type ~ on the command line in place of /home/sysadmin. You can also refer to another user's home directory by using the notation ~user, where user is the name of the user account whose home directory you want to refer to. For example, ~bob would be the same as /home/bob. Here, we will change to the user's home directory:
sysadmin@localhost:~$ cd ~
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$
Note that a listing reveals subdirectories contained in the home directory. Changing directories requires attention to detail:
sysadmin@localhost:~$ cd downloads
-bash: cd: downloads: No such file or directory
sysadmin@localhost:~$
Why did the command above result in an error? That is because Linux environments are case sensitive. Changing into the Downloads directory requires the correct spelling, including the capital D:
sysadmin@localhost:~$ cd Downloads
sysadmin@localhost:~/Downloads$
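The ~ and ~user notations can be inspected without changing directories. The sketch below is only an illustration: it uses the root account in place of the hypothetical user bob, on the assumption that a root account exists on the system, and the resulting paths will differ from one system to another.

```shell
# ~ on its own expands to the current user's home directory.
my_home=~
echo "my home:     $my_home"

# ~user expands to the home directory of the named account.
root_home=~root
echo "root's home: $root_home"
```

If the named account does not exist, the shell leaves the text unchanged, which is a quick way to notice a misspelled user name.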
6.2.3 Current Directory Your current directory is the directory where you are currently working in a terminal. When you first open a terminal, the current directory should be your home directory, but this can change as you explore the filesystem and change to other directories. While you are in a command line environment, you can determine your current directory by using the pwd command: sysadmin@localhost:~$ pwd
/home/sysadmin sysadmin@localhost:~$
Additionally, most systems have the default user prompt display the current directory: [sysadmin@localhost ~]$
In the graphic above, the ~ character indicates your current directory. As mentioned previously, the ~ character represents your home directory. Normally the prompt only displays the name of the current directory, not the full path from the root directory down. In other words, if you were in the /usr/share/doc directory, your prompt will likely just provide you with the name doc for the current directory. If you want the full path, use the pwd command.
6.2.4 Changing Directories If you want to change to a different directory, use the cd (change directory) command. For example, the following command will change the current directory to a directory called /etc/sound/events: sysadmin@localhost:~$ cd /etc/sound/events sysadmin@localhost:/etc/sound/events$
Note that there is no output if the cd command is successful. This is one of those "no news is good news" type of things. If you try to change to a directory that does not exist, you will receive an error message: sysadmin@localhost:/etc/sound/events$ cd /etc/junk
-bash: cd: /etc/junk: No such file or directory sysadmin@localhost:/etc/sound/events$
If you want to return to your home directory, you can either type the cd command with no arguments or use the cd command with the ~ character as an argument:
sysadmin@localhost:/etc/sound/events$ cd
sysadmin@localhost:~$ pwd
/home/sysadmin
sysadmin@localhost:~$ cd /etc
sysadmin@localhost:/etc$ cd ~
sysadmin@localhost:~$ pwd
/home/sysadmin
sysadmin@localhost:~$
6.2.5 Absolute vs. Relative Pathnames Recall that a pathname is essentially a description of where a file or directory is located in the filesystem. You can also consider a pathname to be directions that tell the system where to find a file or directory. For example, the cd /etc/perl/Net command means "change to the Net directory, that you will find under the perl directory, that you will find under the etc directory, that you will find under the / directory". When you give a pathname that starts from the root directory, it is called an absolute path . In many cases, providing an absolute path makes sense. For example, if you are in your home directory and you want to go to the /etc/perl/Net directory, then providing an absolute path to the cd command makes sense: sysadmin@localhost:~$ cd /etc/perl/Net sysadmin@localhost:/etc/perl/Net$
However, what if you were in the /etc/perl directory and you wanted to go to the /etc/perl/Net directory? It would be tedious to type the complete path to get to a directory that is only one level below your current location. In a situation like this, you want to use a relative path:
sysadmin@localhost:/etc/perl$ cd Net
sysadmin@localhost:/etc/perl/Net$
A relative path provides directions using your current location as a point of reference. Recall that this is different from absolute paths, which always require you to use the root directory as a point of reference. There is a handy relative path technique that you can use to move up one level in the directory structure: the .. directory. Regardless of which directory you are in, .. always represents one directory higher than your current directory (with the exception of when you are in the / directory): sysadmin@localhost:/etc/perl/Net$ pwd
/etc/perl/Net sysadmin@localhost:/etc/perl/Net$ cd .. sysadmin@localhost:/etc/perl$ pwd
/etc/perl sysadmin@localhost:/etc/perl$
Sometimes using a relative pathname is a better choice than using an absolute pathname; however, this is not always the case. Consider if you were in the /etc/perl/Net directory and then you wanted to go to the /usr/share/doc directory. Using an absolute pathname, you would execute the cd /usr/share/doc command. Using relative pathnames, you would execute the cd ../../../usr/share/doc command:
sysadmin@localhost:/etc/perl/Net$ cd
sysadmin@localhost:~$ cd /etc/perl/Net
sysadmin@localhost:/etc/perl/Net$ cd ../../../usr/share/doc
sysadmin@localhost:/usr/share/doc$ pwd
/usr/share/doc
sysadmin@localhost:/usr/share/doc$
Note: Relative and absolute paths are not just for the cd command. Any time you specify a file or a directory you can use either relative or absolute paths.
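The equivalence of the two kinds of path can be checked in a scratch directory tree. The following sketch builds a small copy of the directory names used in this section under a temporary directory (the tree itself is hypothetical), then reaches the same directory once with a relative path and once with an absolute path.

```shell
start_dir=$PWD
tree=$(mktemp -d)
tree=$(cd "$tree" && pwd -P)      # physical path, so comparisons are stable
mkdir -p "$tree/etc/perl/Net" "$tree/usr/share/doc"

cd "$tree/etc/perl/Net"           # absolute path: starts at the root directory
cd ..                             # relative path: one level up
after_dotdot=$(pwd -P)

cd ./Net                          # relative path: back down from the current directory
cd ../../../usr/share/doc         # relative path: three levels up, then down again
via_relative=$(pwd -P)

cd "$tree/usr/share/doc"          # absolute path to the same directory
via_absolute=$(pwd -P)

echo "relative: $via_relative"
echo "absolute: $via_absolute"

cd "$start_dir"
rm -r "$tree"
```

Both echo lines print the same directory, which shows that the relative and absolute forms are two sets of directions to one location.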
While the double dot (..) is used to refer to the directory above the current directory, the single dot (.) is used to refer to the current directory. It would be pointless for an administrator to move to the current directory by typing cd . (although it actually works). It is more useful to refer to an item in the current directory by using the ./ notation. For instance: sysadmin@localhost:~$ pwd
/home/sysadmin sysadmin@localhost:~$ cd ./Downloads/ sysadmin@localhost:~/Downloads$ pwd
/home/sysadmin/Downloads sysadmin@localhost:~/Downloads$ cd .. sysadmin@localhost:~$ pwd
/home/sysadmin sysadmin@localhost:~$
Note: This use of the single dot (.) as a reference point is not to be confused with using it at the beginning of a filename. Read more about hidden files in Section 6.4.2.
6.3 Listing Files in a Directory Now that you are able to move from one directory to another, you will want to start displaying the contents of these directories. The ls command (ls is short for list) can be used to display the contents of a directory as well as detailed information about the files that are within a directory. By itself, the ls command will list the files in the current directory: sysadmin@localhost:~$ ls
Desktop
Documents
Downloads
Music
Pictures
Public
Templates
Videos sysadmin@localhost:~$
6.3.1 Listing Colors
There are many different types of files in Linux. As you learn more about Linux, you will discover many of these types. The following is a brief summary of some of the more common file types:
Type            Description
plain file      A file that isn't a special file type; also called a normal file
directory       A directory file (contains other files)
executable      A file that can be run like a program
symbolic link   A file that points to another file
On many Linux distributions, regular user accounts are modified so that the ls command displays file names, color-coded by file type. For example, directories may be displayed in blue, executable files may be displayed in green, and symbolic links may be displayed in cyan (light blue). This is not a normal behavior for the ls command, but rather something that happens when you use the --color option to the ls command. The reason why ls seems to automatically perform this coloring is that there is an alias for the ls command so it runs with the --color option:
sysadmin@localhost:~$ alias
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l='ls -CF'
alias la='ls -A'
alias ll='ls -alF'
alias ls='ls --color=auto'
sysadmin@localhost:~$
As you can see from the output above, when the ls command is executed, it really runs the command ls --color=auto.
In some cases, you might not want to see all of the colors (they can be a bit distracting sometimes). To avoid using the alias, place a backslash character \ in front of your command:
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ \ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$
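The following is a minimal sketch of two ways to run ls without its alias. The directory and file names are hypothetical, and the command builtin is not covered in this section; it is included only as an alternative that should behave the same way as the backslash for this purpose.

```shell
# Sketch only: demo_dir and the names inside it are made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/notes.txt"
mkdir "$demo_dir/reports"

# A leading backslash stops the shell from substituting an alias for ls.
plain_listing=$(\ls "$demo_dir")

# The command builtin also skips aliases (and shell functions).
builtin_listing=$(command ls "$demo_dir")

echo "$plain_listing"
rm -r "$demo_dir"
```

Aliases are normally expanded only in interactive shells, so the difference is visible at a prompt rather than inside a script.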
6.3.2 Listing Hidden Files
When you use the ls command to display the contents of a directory, not all files are shown automatically. The ls command doesn't display hidden files by default. A hidden file is any file (or directory) that begins with a dot . character. To display all files, including hidden files, use the -a option to the ls command:
sysadmin@localhost:~$ ls -a
.             .bashrc   .selected_editor  Downloads  Public
..            .cache    Desktop           Music      Templates
.bash_logout  .profile  Documents         Pictures   Videos
Why are files hidden in the first place? Most of the hidden files are customization files, designed to customize how Linux, your shell or programs work. For example, the .bashrc file in your home directory customizes features of the shell, such as creating or modifying variables and aliases. These customization files are not ones that you work with on a regular basis. There are also many of them, as you can see, and having them displayed would make it more difficult to find the files that you do regularly work with. So, the fact that they are hidden is to your benefit.
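As a small illustration of hidden files, the sketch below creates one hidden and one visible file in a scratch directory and lists them three ways. The file names are hypothetical; the -A option appears in the alias list earlier (la='ls -A') and lists hidden files without the . and .. entries.

```shell
# Sketch only: demo_dir and the file names are made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/.settings" "$demo_dir/report.txt"

visible=$(ls "$demo_dir")         # the hidden file is skipped
all=$(ls -a "$demo_dir")          # hidden files plus . and ..
almost_all=$(ls -A "$demo_dir")   # hidden files, but not . or ..

echo "$all"
rm -r "$demo_dir"
```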
6.3.3 Long Display Listing
There is information about each file, called metadata, that is sometimes helpful to display. This may include who owns a file, the size of a file and the last time the contents of a file were modified. You can display this information by using the -l option to the ls command:
sysadmin@localhost:~$ ls -l
total 0
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Desktop
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Documents
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Downloads
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Music
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Pictures
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Public
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Templates
drwxr-xr-x 1 sysadmin sysadmin 0 Jan 29  2015 Videos
sysadmin@localhost:~$
In the output above, each line describes metadata about a single file. The following describes each of the fields of data that you will see in the output of the ls -l command, using the first line as an example:
Field           Description
d               File type (d for a directory, - for a plain file)
rwxr-xr-x       Permissions
1               Hard link count
sysadmin        User owner
sysadmin        Group owner
0               File size in bytes
Jan 29  2015    Timestamp of the last modification
Desktop         File name
6.3.3.1 Human Readable Sizes
When you display file sizes with the -l option to the ls command, you end up with file sizes in bytes. For text files, a byte is 1 character. For smaller files, byte sizes are fine. However, for larger files it is hard to comprehend how large the file is. For example, consider the output of the following command:
sysadmin@localhost:~$ ls -l /usr/bin/omshell
-rwxr-xr-x 1 root root 1561400 Oct 9 2012 /usr/bin/omshell
sysadmin@localhost:~$
As you can see, the file size is hard to determine in bytes. Is 1561400 a large file or small? It seems fairly large, but it is hard to determine using bytes. Think of it this way: if someone were to give you the distance between Boston and New York using inches, that value would essentially be meaningless because for a distance like that, you think in terms of miles. It would be better if the file size was presented in a more human readable size, like megabytes or gigabytes. To accomplish this, add the -h option to the ls command:
sysadmin@localhost:~$ ls -lh /usr/bin/omshell
-rwxr-xr-x 1 root root 1.5M Oct 9 2012 /usr/bin/omshell
sysadmin@localhost:~$
Important: The -h option has no visible effect on its own; it must be used with an option that displays file sizes, such as the -l option.
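To compare the two size formats on a file of known size, a sketch like the following can help. The scratch directory, the file name sample.bin and the use of dd and awk are assumptions made for illustration; they are not part of this chapter.

```shell
# Sketch only: demo_dir and sample.bin are made up for illustration.
demo_dir=$(mktemp -d)

# Write 1500 blocks of 1024 zero bytes: 1,536,000 bytes in total.
dd if=/dev/zero of="$demo_dir/sample.bin" bs=1024 count=1500 2>/dev/null

# The fifth field of a long listing is the file size.
size_bytes=$(ls -l "$demo_dir/sample.bin" | awk '{print $5}')
size_human=$(ls -lh "$demo_dir/sample.bin" | awk '{print $5}')

echo "bytes: $size_bytes  human readable: $size_human"
rm -r "$demo_dir"
```

The exact human readable string depends on how your version of ls rounds, so treat it as approximate.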
6.3.4 Listing Directories
When the command ls -d is used, it refers to the current directory, and not the contents within it. Without any other options, it is rather meaningless, although it is important to note that the current directory is always referred to with a single period (.):
sysadmin@localhost:~$ ls -d
.
To use the ls -d command in a meaningful way requires the addition of the -l option. In this case, note that the first command lists the details of the contents in the /home/sysadmin directory, while the second command lists the /home/sysadmin directory itself.
sysadmin@localhost:~$ ls -l
total 0
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Desktop
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Documents
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Downloads
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Music
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Pictures
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Public
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Templates
drwxr-xr-x 1 sysadmin sysadmin   0 Apr 15  2015 Videos
drwxr-xr-x 1 sysadmin sysadmin 420 Apr 15  2015 test
sysadmin@localhost:~$ ls -ld
drwxr-xr-x 1 sysadmin sysadmin 224 Nov  7 17:07 .
sysadmin@localhost:~$
Note the single period at the end of the second long listing. This indicates that the current directory is being listed, and not the contents.
6.3.5 Recursive Listing
There will be times when you want to display all of the files in a directory as well as all of the files in all subdirectories under a directory. This is called a recursive listing. To perform a recursive listing, use the -R option to the ls command:
Note: The output shown below will vary from the results you will see if you execute the command within the virtual machine environment of this course.
sysadmin@localhost:~$ ls -R /etc/ppp
/etc/ppp:
chap-secrets  ip-down.ipv6to4  ip-up.ipv6to4  ipv6-up  pap-secrets
ip-down       ip-up            ipv6-down      options  peers

/etc/ppp/peers:
sysadmin@localhost:~$
Note that in the previous example, the files in the /etc/ppp directory were listed first. After that, the files in the /etc/ppp/peers directory were listed (there were no files in this case, but if any file had been in this directory, they would have been displayed). Be careful with this option; for example, running the command ls -R / would list every file on the file system, including all files on any attached USB device and DVD in the system. Limit the use of the -R option to smaller directory structures.
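A recursive listing is easiest to study on a small directory tree that you control. The sketch below builds one in a scratch directory; the names project, docs, README and guide.txt are hypothetical.

```shell
# Sketch only: demo_dir and everything inside it is made up for illustration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/project"
mkdir "$demo_dir/project/docs"
touch "$demo_dir/project/README" "$demo_dir/project/docs/guide.txt"

# Each directory gets its own header line ending in a colon,
# followed by the names of the entries it contains.
recursive_listing=$(cd "$demo_dir" && ls -R project)

echo "$recursive_listing"
rm -r "$demo_dir"
```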
6.3.6 Sort a Listing
By default, the ls command sorts files alphabetically by file name. Sometimes, it may be useful to sort files using different criteria. To sort files by size, we can use the -S option. Note the difference in the output of the following two commands:
sysadmin@localhost:~$ ls /etc/ssh
moduli                ssh_host_ecdsa_key      ssh_import_id
ssh_config            ssh_host_ecdsa_key.pub  sshd_config
ssh_host_dsa_key      ssh_host_rsa_key
ssh_host_dsa_key.pub  ssh_host_rsa_key.pub
sysadmin@localhost:~$ ls -S /etc/ssh
moduli            ssh_host_dsa_key      ssh_host_ecdsa_key
sshd_config       ssh_host_dsa_key.pub  ssh_host_ecdsa_key.pub
ssh_host_rsa_key  ssh_host_rsa_key.pub
ssh_config        ssh_import_id
sysadmin@localhost:~$
The same files and directories are listed, but in a different order. While the -S option works by itself, you can't really tell that the output is sorted by size, so it is most useful when used with the -l option. The following command will list files from largest to smallest and display the actual size of the file.
sysadmin@localhost:~$ ls -lS /etc/ssh
total 160
-rw-r--r-- 1 root root 125749 Apr 29  2014 moduli
-rw-r--r-- 1 root root   2489 Jan 29  2015 sshd_config
-rw------- 1 root root   1675 Jan 29  2015 ssh_host_rsa_key
-rw-r--r-- 1 root root   1669 Apr 29  2014 ssh_config
-rw------- 1 root root    668 Jan 29  2015 ssh_host_dsa_key
-rw-r--r-- 1 root root    607 Jan 29  2015 ssh_host_dsa_key.pub
-rw-r--r-- 1 root root    399 Jan 29  2015 ssh_host_rsa_key.pub
-rw-r--r-- 1 root root    302 Jan 10  2011 ssh_import_id
-rw------- 1 root root    227 Jan 29  2015 ssh_host_ecdsa_key
-rw-r--r-- 1 root root    179 Jan 29  2015 ssh_host_ecdsa_key.pub
sysadmin@localhost:~$
It may also be useful to use the -h option to display human-readable file sizes:
sysadmin@localhost:~$ ls -lSh /etc/ssh
total 160K
-rw-r--r-- 1 root root 123K Apr 29  2014 moduli
-rw-r--r-- 1 root root 2.5K Jan 29  2015 sshd_config
-rw------- 1 root root 1.7K Jan 29  2015 ssh_host_rsa_key
-rw-r--r-- 1 root root 1.7K Apr 29  2014 ssh_config
-rw------- 1 root root  668 Jan 29  2015 ssh_host_dsa_key
-rw-r--r-- 1 root root  607 Jan 29  2015 ssh_host_dsa_key.pub
-rw-r--r-- 1 root root  399 Jan 29  2015 ssh_host_rsa_key.pub
-rw-r--r-- 1 root root  302 Jan 10  2011 ssh_import_id
-rw------- 1 root root  227 Jan 29  2015 ssh_host_ecdsa_key
-rw-r--r-- 1 root root  179 Jan 29  2015 ssh_host_ecdsa_key.pub
sysadmin@localhost:~$
It is also possible to sort files based on the time they were modified. You can do this by using the -t option. The -t option will list the most recently modified files first. This option can be used alone, but again, is usually more helpful when paired with the -l option:
sysadmin@localhost:~$ ls -tl /etc/ssh
total 160
-rw------- 1 root root    668 Jan 29  2015 ssh_host_dsa_key
-rw-r--r-- 1 root root    607 Jan 29  2015 ssh_host_dsa_key.pub
-rw------- 1 root root    227 Jan 29  2015 ssh_host_ecdsa_key
-rw-r--r-- 1 root root    179 Jan 29  2015 ssh_host_ecdsa_key.pub
-rw------- 1 root root   1675 Jan 29  2015 ssh_host_rsa_key
-rw-r--r-- 1 root root    399 Jan 29  2015 ssh_host_rsa_key.pub
-rw-r--r-- 1 root root   2489 Jan 29  2015 sshd_config
-rw-r--r-- 1 root root 125749 Apr 29  2014 moduli
-rw-r--r-- 1 root root   1669 Apr 29  2014 ssh_config
-rw-r--r-- 1 root root    302 Jan 10  2011 ssh_import_id
sysadmin@localhost:~$
It is important to remember that the modified date on directories represents the last time a file was added to or removed from the directory.
If the files in a directory were modified many days or months ago, it may be harder to tell exactly when they were modified, as only the date is provided for older files. For more detailed modification time information you can use the --full-time option to display the complete timestamp (including hours, minutes, seconds...):
sysadmin@localhost:~$ ls -t --full-time /etc/ssh
total 160
-rw------- 1 root root    668 2015-01-29 03:17:33.000000000 +0000 ssh_host_dsa_key
-rw-r--r-- 1 root root    607 2015-01-29 03:17:33.000000000 +0000 ssh_host_dsa_key.pub
-rw------- 1 root root    227 2015-01-29 03:17:33.000000000 +0000 ssh_host_ecdsa_key
-rw-r--r-- 1 root root    179 2015-01-29 03:17:33.000000000 +0000 ssh_host_ecdsa_key.pub
-rw------- 1 root root   1675 2015-01-29 03:17:33.000000000 +0000 ssh_host_rsa_key
-rw-r--r-- 1 root root    399 2015-01-29 03:17:33.000000000 +0000 ssh_host_rsa_key.pub
-rw-r--r-- 1 root root   2489 2015-01-29 03:17:33.000000000 +0000 sshd_config
-rw-r--r-- 1 root root 125749 2014-04-29 23:58:51.000000000 +0000 moduli
-rw-r--r-- 1 root root   1669 2014-04-29 23:58:51.000000000 +0000 ssh_config
-rw-r--r-- 1 root root    302 2011-01-10 18:48:29.000000000 +0000 ssh_import_id
sysadmin@localhost:~$
The --full-time option will assume the -l option automatically. It is possible to perform a reverse sort with either the -S or -t options by using the -r option. The following command will sort files by size, smallest to largest:
sysadmin@localhost:~$ ls -lrS /etc/ssh
total 160
-rw-r--r-- 1 root root    179 Jan 29  2015 ssh_host_ecdsa_key.pub
-rw------- 1 root root    227 Jan 29  2015 ssh_host_ecdsa_key
-rw-r--r-- 1 root root    302 Jan 10  2011 ssh_import_id
-rw-r--r-- 1 root root    399 Jan 29  2015 ssh_host_rsa_key.pub
-rw-r--r-- 1 root root    607 Jan 29  2015 ssh_host_dsa_key.pub
-rw------- 1 root root    668 Jan 29  2015 ssh_host_dsa_key
-rw-r--r-- 1 root root   1669 Apr 29  2014 ssh_config
-rw------- 1 root root   1675 Jan 29  2015 ssh_host_rsa_key
-rw-r--r-- 1 root root   2489 Jan 29  2015 sshd_config
-rw-r--r-- 1 root root 125749 Apr 29  2014 moduli
sysadmin@localhost:~$
The following command will list files by modification date, oldest to newest:
sysadmin@localhost:~$ ls -lrt /etc/ssh
total 160
-rw-r--r-- 1 root root    302 Jan 10  2011 ssh_import_id
-rw-r--r-- 1 root root   1669 Apr 29  2014 ssh_config
-rw-r--r-- 1 root root 125749 Apr 29  2014 moduli
-rw-r--r-- 1 root root   2489 Jan 29  2015 sshd_config
-rw-r--r-- 1 root root    399 Jan 29  2015 ssh_host_rsa_key.pub
-rw------- 1 root root   1675 Jan 29  2015 ssh_host_rsa_key
-rw-r--r-- 1 root root    179 Jan 29  2015 ssh_host_ecdsa_key.pub
-rw------- 1 root root    227 Jan 29  2015 ssh_host_ecdsa_key
-rw-r--r-- 1 root root    607 Jan 29  2015 ssh_host_dsa_key.pub
-rw------- 1 root root    668 Jan 29  2015 ssh_host_dsa_key
sysadmin@localhost:~$
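The sort options can be tried safely on a few scratch files whose sizes and modification times are known in advance. The sketch below is one way to set that up; the file names, the printf padding trick and the touch -t timestamps are assumptions chosen for illustration.

```shell
# Sketch only: demo_dir and the file names are made up for illustration.
demo_dir=$(mktemp -d)

# Create files of 10, 100 and 1000 bytes (printf pads with spaces).
printf '%10s'   '' > "$demo_dir/small.txt"
printf '%100s'  '' > "$demo_dir/medium.txt"
printf '%1000s' '' > "$demo_dir/large.txt"

# Set modification times explicitly (format: YYYYMMDDhhmm).
touch -t 201101101848 "$demo_dir/large.txt"    # oldest
touch -t 201404292358 "$demo_dir/medium.txt"
touch -t 201501290317 "$demo_dir/small.txt"    # newest

largest=$(ls -S "$demo_dir" | head -n 1)     # -S: largest first
smallest=$(ls -rS "$demo_dir" | head -n 1)   # -r reverses the sort
newest=$(ls -t "$demo_dir" | head -n 1)      # -t: most recent first
oldest=$(ls -rt "$demo_dir" | head -n 1)

echo "largest=$largest smallest=$smallest newest=$newest oldest=$oldest"
rm -r "$demo_dir"
```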
6.3.7 Listing With Globs
In a previous chapter, we discussed the use of file globs to match filenames using wildcard characters. For example, we demonstrated that you can list all of the files in the /etc directory that begin with the letter e with the following command:
sysadmin@localhost:~$ echo /etc/e*
/etc/enscript.cfg /etc/environment /etc/ethers /etc/event.d /etc/exports
sysadmin@localhost:~$
Now that you know that the ls command is normally used to list files in a directory, using the echo command may seem to have been a strange choice. However, there is something about the ls command that might have caused confusion while we were discussing globs. This "feature" might also cause problems when you try to list files using glob patterns. Keep in mind that it is the shell, not the echo or ls command, that expands the glob pattern into corresponding file names. In other words, when you typed the echo /etc/e* command, what the shell did before executing the echo command was replace e* with all of the files and directories within the /etc directory that match the pattern. So, if you were to run the ls /etc/e* command, what the shell would really run would be this:
ls /etc/enscript.cfg /etc/environment /etc/ethers /etc/event.d /etc/exports
When the ls command sees multiple arguments, it performs a list operation on each item separately. In other words, the command ls /etc/enscript.cfg /etc/environment is essentially the same as ls /etc/enscript.cfg; ls /etc/environment.
Now consider what happens when you run the ls command on a file, such as enscript.cfg:
sysadmin@localhost:~$ ls /etc/enscript.cfg
/etc/enscript.cfg
sysadmin@localhost:~$
As you can see, running the ls command on a single file results in the name of the file being printed. Typically this is useful if you want to see details about a specific file by using the -l option to the ls command:
sysadmin@localhost:~$ ls -l /etc/enscript.cfg
-r--r--r--. 1 root root 4843 Nov 11 2010 /etc/enscript.cfg
sysadmin@localhost:~$
However, what if the ls command is given a directory name as an argument? In this case, the output of the command is different than if the argument was a file name:
sysadmin@localhost:~$ ls /etc/event.d
ck-log-system-restart  ck-log-system-start  ck-log-system-stop
sysadmin@localhost:~$
If you give a directory name as an argument to the ls command, the command will display the contents of the directory (the names of the files in the directory), not just provide the directory name. The filenames you see in the example above are the names of the files in the /etc/event.d directory. Why is this a problem when using globs? Consider the following output:
sysadmin@localhost:~$ ls /etc/e*
/etc/enscript.cfg /etc/environment /etc/ethers /etc/event.d /etc/exports
/etc/event.d:
ck-log-system-restart  ck-log-system-start  ck-log-system-stop
sysadmin@localhost:~$
As you can see, when the ls command sees a filename as an argument, it just displays the filename. However, for any directory, it will display the contents of the directory, not just the directory name. This becomes even more confusing in a situation like the following:
sysadmin@localhost:~$ ls /etc/ev*
ck-log-system-restart  ck-log-system-start  ck-log-system-stop
sysadmin@localhost:~$
In the previous example, it seems like the ls command is just plain wrong. But what really happened is that the only thing that matches the glob /etc/ev* is the /etc/event.d directory. So, the ls command only displayed the files in that directory! There is a simple solution to this problem: when you use glob arguments with the ls command, always use the -d option. When you use the -d option, then the ls command won't display the contents of a directory, but rather the name of the directory:
sysadmin@localhost:~$ ls -d /etc/e*
/etc/enscript.cfg /etc/environment /etc/ethers /etc/event.d /etc/exports
sysadmin@localhost:~$
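The effect of the -d option on a glob that matches a single directory can be reproduced in a scratch directory. The sketch below uses hypothetical names (event.d, start, stop) that only loosely mirror the /etc example.

```shell
# Sketch only: demo_dir and the names inside it are made up for illustration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/event.d"
touch "$demo_dir/event.d/start" "$demo_dir/event.d/stop"

# The shell expands ev* to event.d before ls runs, so ls receives
# a directory name and lists what is inside it.
without_d=$(cd "$demo_dir" && ls ev*)

# With -d, ls reports the directory itself rather than its contents.
with_d=$(cd "$demo_dir" && ls -d ev*)

echo "without -d: $without_d"
echo "with -d:    $with_d"
rm -r "$demo_dir"
```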
6.4 Copying Files
The cp command is used to copy files. It requires that you specify a source and a destination. The structure of the command is as follows:
cp [source] [destination]
The source is the file you wish to copy. The destination is where you want the copy to be located. When successful, the cp command will not have any output (no news is good news). The following command will copy the /etc/hosts file to your home directory:
sysadmin@localhost:~$ cp /etc/hosts ~
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  hosts
Documents  Music      Public    Videos
sysadmin@localhost:~$
Remember: The ~ character represents your home directory.
6.4.1 Verbose Mode
The -v option will cause the cp command to produce output if successful. The -v option stands for verbose:
sysadmin@localhost:~$ cp -v /etc/hosts ~
`/etc/hosts' -> `/home/sysadmin/hosts'
sysadmin@localhost:~$
When the destination is a directory, the resulting new file will have the same name as the original file. If you want the new file to have a different name, you must provide the new name as part of the destination:
sysadmin@localhost:~$ cp /etc/hosts ~/hosts.copy
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  hosts
Documents  Music      Public    Videos     hosts.copy
sysadmin@localhost:~$
6.4.2 Avoid Overwriting Data
The cp command can be destructive to existing data if the destination file already exists. In the case where the destination file exists, the cp command will overwrite the existing file's contents with the contents of the source file. To illustrate this potential problem, first a new file is created in the sysadmin home directory by copying an existing file:
sysadmin@localhost:~$ cp /etc/skel/.bash_logout ~/example.txt
sysadmin@localhost:~$
View the output of the ls command to see the file and view the contents of the file using the more command:
sysadmin@localhost:~$ ls -l example.txt
-rw-rw-r--. 1 sysadmin sysadmin 18 Sep 21 15:56 example.txt
sysadmin@localhost:~$ more example.txt
# ~/.bash_logout: executed by bash(1) when login shell exits.
sysadmin@localhost:~$
In the next example, you will see that the cp command destroys the original contents of the example.txt file. Notice that after the cp command is complete, the size of the file is different (158 bytes rather than 18) from the original and the contents are different as well:
sysadmin@localhost:~$ cp /etc/hosts ~/example.txt
sysadmin@localhost:~$ ls -l example.txt
-rw-rw-r--. 1 sysadmin sysadmin 158 Sep 21 14:11 example.txt
sysadmin@localhost:~$ cat example.txt
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
sysadmin@localhost:~$
There are two options that can be used to safeguard against accidental overwrites. With the -i (interactive) option, the cp command will prompt before overwriting a file. The following example will demonstrate this option, first restoring the content of the original file:
sysadmin@localhost:~$ cp /etc/skel/.bash_logout ~/example.txt
sysadmin@localhost:~$ ls -l example.txt
-rw-r--r-- 1 sysadmin sysadmin 18 Sep 21 15:56 example.txt
sysadmin@localhost:~$ more example.txt
# ~/.bash_logout: executed by bash(1) when login shell exits.
sysadmin@localhost:~$ cp -i /etc/hosts ~/example.txt
cp: overwrite `/home/sysadmin/example.txt'? n
sysadmin@localhost:~$ ls -l example.txt
-rw-r--r-- 1 sysadmin sysadmin 18 Sep 21 15:56 example.txt
sysadmin@localhost:~$ more example.txt
# ~/.bash_logout: executed by bash(1) when login shell exits.
sysadmin@localhost:~$
Notice that since the value of n (no) was given when prompted to overwrite the file, no changes were made to the file. If a value of y (yes) was given, then the copy process would have taken place. The -i option requires you to answer y or n for every copy that could end up overwriting an existing file's contents. This can be tedious when a bunch of overwrites could occur, such as the example demonstrated below:
sysadmin@localhost:~$ cp -i /etc/skel/.* ~
cp: omitting directory `/etc/skel/.'
cp: omitting directory `/etc/skel/..'
cp: overwrite `/home/sysadmin/.bash_logout'? n
cp: overwrite `/home/sysadmin/.bashrc'? n
cp: overwrite `/home/sysadmin/.profile'? n
cp: overwrite `/home/sysadmin/.selected_editor'? n
sysadmin@localhost:~$
As you can see from the example above, the cp command tried to overwrite four existing files, forcing the user to answer four prompts. If this situation happened for 100 files, it could become very annoying, very quickly. If you want to automatically answer n to each prompt, use the -n option. It essentially stands for "no rewrite" (its long form is --no-clobber).
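The difference between a plain copy and a copy with the -n option can be checked on scratch files. In the sketch below the file names are hypothetical, and the exit status of cp -n is deliberately ignored because it may differ between versions of cp when a file is skipped.

```shell
# Sketch only: demo_dir and the file names are made up for illustration.
demo_dir=$(mktemp -d)
echo "original" > "$demo_dir/example.txt"
echo "replacement" > "$demo_dir/new.txt"

# -n: the destination already exists, so the copy is skipped.
cp -n "$demo_dir/new.txt" "$demo_dir/example.txt" 2>/dev/null || true
after_n=$(cat "$demo_dir/example.txt")

# Without -n or -i, the destination is overwritten without any warning.
cp "$demo_dir/new.txt" "$demo_dir/example.txt"
after_plain=$(cat "$demo_dir/example.txt")

echo "after cp -n: $after_n"
echo "after cp:    $after_plain"
rm -r "$demo_dir"
```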
6.4.3 Copying Directories
In a previous example, error messages were given when the cp command attempted to copy directories:
sysadmin@localhost:~$ cp -i /etc/skel/.* ~
cp: omitting directory `/etc/skel/.'
cp: omitting directory `/etc/skel/..'
cp: overwrite `/home/sysadmin/.bash_logout'? n
cp: overwrite `/home/sysadmin/.bashrc'? n
cp: overwrite `/home/sysadmin/.profile'? n
cp: overwrite `/home/sysadmin/.selected_editor'? n
sysadmin@localhost:~$
Where the output says ...omitting directory..., the cp command is saying that it cannot copy this item because the command does not copy directories by default. However, the -r option to the cp command will have it copy both files and directories. Be careful with this option: the entire directory structure will be copied. This could result in copying a lot of files and directories!
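A short sketch of copying a directory with and without the -r option follows. The directory layout (source, nested, a.txt, b.txt) is hypothetical and exists only in a scratch directory.

```shell
# Sketch only: demo_dir and everything inside it is made up for illustration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/source"
mkdir "$demo_dir/source/nested"
touch "$demo_dir/source/a.txt" "$demo_dir/source/nested/b.txt"

# Without -r, cp refuses to copy a directory; the message is discarded here.
cp "$demo_dir/source" "$demo_dir/copy" 2>/dev/null
plain_status=$?

# With -r, the directory and everything beneath it is copied.
cp -r "$demo_dir/source" "$demo_dir/copy"
nested_copied=no
[ -f "$demo_dir/copy/nested/b.txt" ] && nested_copied=yes

echo "status without -r: $plain_status, nested file copied: $nested_copied"
rm -r "$demo_dir"
```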
6.5 Moving Files
To move a file, use the mv command. The syntax for the mv command is much like the cp command:
mv [source] [destination]
In the following example, the hosts file that was generated earlier is moved from the current directory to the Videos directory:
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  example.txt  hosts.copy
Documents  Music      Public    Videos     hosts
sysadmin@localhost:~$ mv hosts Videos
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  example.txt
Documents  Music      Public    Videos     hosts.copy
sysadmin@localhost:~$ ls Videos
hosts
sysadmin@localhost:~$
When a file is moved, the file is removed from the original location and placed in a new location. This can be somewhat tricky in Linux because users need specific permissions to remove files from a directory. If you don't have the right permissions, you will receive a "Permission denied" error message:
sysadmin@localhost:~$ mv /etc/hosts .
mv: cannot move `/etc/hosts' to `./hosts': Permission denied
sysadmin@localhost:~$
A detailed discussion of permissions is provided in a later chapter.
6.6 Moving Files While Renaming
If the destination for the mv command is a directory, the file will be moved to the directory specified. The file name will change only if a destination file name is also specified. If a destination directory is not specified, the file will be renamed using the destination file name and remain in the source directory.
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  example.txt
Documents  Music      Public    Videos
sysadmin@localhost:~$ mv example.txt Videos/newexample.txt
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates
Documents  Music      Public    Videos
sysadmin@localhost:~$ ls Videos
hosts  newexample.txt
sysadmin@localhost:~$
6.6.1 Renaming Files
The mv command is not just used to move a file, but also to rename a file. For example, the following commands will rename the newexample.txt file to myexample.txt:
sysadmin@localhost:~$ cd Videos
sysadmin@localhost:~/Videos$ ls
hosts  newexample.txt
sysadmin@localhost:~/Videos$ mv newexample.txt myexample.txt
sysadmin@localhost:~/Videos$ ls
hosts  myexample.txt
sysadmin@localhost:~/Videos$
Think of the previous mv example to mean "move the newexample.txt file from the current directory back into the current directory and give the new file the name myexample.txt".
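Moving and renaming can be combined or done separately, as the sketch below shows on scratch files. It reuses the file names from the examples above, but inside a temporary directory that is an assumption of this sketch.

```shell
# Sketch only: demo_dir is a scratch directory made up for illustration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/Videos"
touch "$demo_dir/example.txt"

# Move into a directory and rename in a single step.
mv "$demo_dir/example.txt" "$demo_dir/Videos/newexample.txt"

# Rename in place: source and destination share the same directory.
mv "$demo_dir/Videos/newexample.txt" "$demo_dir/Videos/myexample.txt"

top_level=$(ls "$demo_dir")
videos=$(ls "$demo_dir/Videos")

echo "top level: $top_level"
echo "Videos:    $videos"
rm -r "$demo_dir"
```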
6.6.2 Additional mv Options
Like the cp command, the mv command provides the following options:
Option   Meaning
-i       Interactive move: ask if a file is to be overwritten
-n       Do not overwrite a destination file's contents
-v       Verbose: show the resulting move
Important: There is no -r option as the mv command will by default move directories.
6.7 Creating Files
There are several ways of creating a new file, including using a program designed to edit a file (a text editor). In a later chapter, text editors will be covered.
There is also a way to simply create a file that can be populated with data at a later time. This is a useful feature since for some operating system features, the very existence of a file could alter how a command or service works. It is also useful to create a file as a "placeholder" to remind you to create the file contents at a later time. To create an empty file, use the touch command as demonstrated below:
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ touch sample
sysadmin@localhost:~$ ls -l sample
-rw-rw-r-- 1 sysadmin sysadmin 0 Nov  9 16:48 sample
sysadmin@localhost:~$
Notice the size of the new file is 0 bytes. As previously mentioned, the touch command doesn't place any data within the new file.
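A script can confirm that a file created by touch is empty. The sketch below is one way to do so; the scratch directory, the awk field extraction and the -s file test are assumptions that go slightly beyond what this chapter covers.

```shell
# Sketch only: demo_dir is a scratch directory made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample"

# The fifth field of the long listing is the size in bytes.
size=$(ls -l "$demo_dir/sample" | awk '{print $5}')

# The -s test is true only when a file exists and is larger than 0 bytes.
if [ -s "$demo_dir/sample" ]; then
    state="has data"
else
    state="empty"
fi

echo "size=$size state=$state"
rm -r "$demo_dir"
```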
6.8 Removing Files
To delete a file, use the rm command:
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  sample
Documents  Music      Public    Videos
sysadmin@localhost:~$ rm sample
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$
Note that the file was deleted with no questions asked. This could cause problems when deleting multiple files by using glob characters, for example: rm *.txt. Because these files are deleted without question, a user could end up deleting files that were not intended to be deleted. Additionally, the files are permanently deleted. There is no command to undelete a file and no "trash can" from which to recover deleted files. As a precaution, users should use the -i option when deleting multiple files:
sysadmin@localhost:~$ touch sample.txt example.txt test.txt
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  example.txt  test.txt
Documents  Music      Public    Videos     sample.txt
sysadmin@localhost:~$ rm -i *.txt
rm: remove regular empty file `example.txt'? y
rm: remove regular empty file `sample.txt'? n
rm: remove regular empty file `test.txt'? y
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  sample.txt
Documents  Music      Public    Videos
sysadmin@localhost:~$
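The interactive session above can be reproduced without typing by supplying the answers on standard input. This is only a sketch for experimentation in a scratch directory: feeding answers through a pipe is an assumption of this example, and in normal use you would answer each prompt yourself.

```shell
# Sketch only: demo_dir is a scratch directory made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/example.txt" "$demo_dir/sample.txt" "$demo_dir/test.txt"

# The glob expands in alphabetical order: example.txt, sample.txt, test.txt.
# The answers y, n, y are supplied in that order; prompts are discarded.
(cd "$demo_dir" && printf 'y\nn\ny\n' | rm -i *.txt 2>/dev/null)

remaining=$(ls "$demo_dir")
echo "remaining: $remaining"
rm -r "$demo_dir"
```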
6.9 Removing Directories
You can delete directories using the rm command. However, the default usage (no options) of the rm command will fail to delete a directory:
sysadmin@localhost:~$ rm Videos
rm: cannot remove `Videos': Is a directory
sysadmin@localhost:~$
If you want to delete a directory, use the -r option to the rm command:
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  sample.txt
Documents  Music      Public    Videos
sysadmin@localhost:~$ rm -r Videos
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  sample.txt
sysadmin@localhost:~$
Important: When a user deletes a directory, all of the files and subdirectories are deleted without any interactive question. It is best to use the -i option with the rm command.
You can also delete a directory with the rmdir command, but only if the directory is empty.
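The contrast between rmdir and rm -r can be seen with one empty and one non-empty directory. The directory names in this sketch are hypothetical.

```shell
# Sketch only: demo_dir and the names inside it are made up for illustration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/empty_dir" "$demo_dir/full_dir"
touch "$demo_dir/full_dir/file.txt"

rmdir "$demo_dir/empty_dir"               # succeeds: the directory is empty
rmdir "$demo_dir/full_dir" 2>/dev/null    # fails: the directory is not empty
rmdir_status=$?

rm -r "$demo_dir/full_dir"                # removes the directory and its contents
remaining=$(ls -A "$demo_dir")

echo "rmdir status on non-empty directory: $rmdir_status"
rm -r "$demo_dir"
```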
6.10 Making Directories
To create a directory, use the mkdir command:
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  sample.txt
sysadmin@localhost:~$ mkdir test
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates   test
Documents  Music      Public    sample.txt
sysadmin@localhost:~$
Lab 6
6.1 Introduction
This is Lab 6: Listing Files and Directories. By performing this lab, students will learn how to navigate and manage files and directories. In this lab, you will perform the following tasks:
List files and directories
Copy, move and delete files and directories
6.2 Files and Directories
In this task you will explore the concepts of files and directories. On a Linux OS, data is stored in files and files are stored in directories. You may be used to the term folders to describe directories. Directories are actually files, too; the data that they hold are the names of the files that have been entered into them, along with the inode number (a unique identifier number assigned to each file) for where the data for that file exists on the disk. As a Linux user, you will want to know how to manipulate these files and directories, including how to list files in a directory, copy, delete and move files.
Warning: File and directory names in Linux are case sensitive. This means that a file named ABC is not the same as a file named abc.
6.2.1 Step 1
Type the following command to print the working directory:
pwd
sysadmin@localhost:~$ pwd
/home/sysadmin
sysadmin@localhost:~$
The working directory is the directory that your terminal window is currently "in". This is also called the current directory. This will be important when you are running future commands, as they will behave differently based on the directory you are currently in. The output of the pwd command (/home/sysadmin in the example above) is called the path. The first slash represents the root directory, the top level of the directory structure. In the output above, home is a directory under the root directory and sysadmin is a directory under the home directory. When you first open a terminal window, you will be placed in your home directory. This is a directory where you have full access and other users normally have no access by default. To see the path to your home directory, you can execute the following command to view the value of the HOME variable:
echo $HOME
sysadmin@localhost:~$ echo $HOME
/home/sysadmin
sysadmin@localhost:~$
6.2.2 Step 2 You can use the cd command with a path to a directory to change your current directory. Type the following commands to make the root directory your current working directory and verify with the pwd command:
cd /
pwd
sysadmin@localhost:~$ cd /
sysadmin@localhost:/$ pwd
/
sysadmin@localhost:/$
6.2.3 Step 3 To change back to your home directory, the cd command can be executed without a path. Change back to your home directory and verify by typing the following commands:
cd
pwd
sysadmin@localhost:/$ cd
sysadmin@localhost:~$ pwd
/home/sysadmin
sysadmin@localhost:~$
Notice the change in the prompt. The tilde ~ character represents your home directory. This part of the prompt will tell you what directory you are currently in.
6.2.4 Step 4 The cd command may be entered with a path to a directory specified as an argument. Execute the cd command with the /home directory as an argument by typing the following:
cd /home
pwd
sysadmin@localhost:~$ cd /home
sysadmin@localhost:/home$ pwd
/home
sysadmin@localhost:/home$
When the path that is provided as an argument to the cd command starts with the forward slash /, that path is referred to as an “absolute path”. Absolute paths are always complete paths from the root directory to a sub-directory or file.
6.2.5 Step 5 Change back to your home directory, using the cd command with the tilde ~ as an argument:
cd ~
pwd
sysadmin@localhost:/home$ cd ~
sysadmin@localhost:~$ pwd
/home/sysadmin
sysadmin@localhost:~$
When the path that is provided as an argument to the cd command starts with a tilde ~ character, the shell will expand the character to the home directory of a user with an account on the system. If either no other characters or a forward slash follows the tilde, then the shell will expand it to the home directory of the user currently active in the shell. If a user name immediately follows the tilde character, then the shell will expand the tilde and user name to the home directory of that user. For example, ~bob would be expanded to /home/bob. Paths that start with a tilde are considered absolute paths because after the shell expands the tilde path, an absolute path is formed.
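The expansion rules above can be checked with a short sketch. The variable names are made up for illustration, and the script assumes that the HOME variable is set, as it normally is in a login session.

```shell
#!/bin/sh
# The shell performs tilde expansion before the command or assignment runs.

# A bare tilde expands to the current user's home directory.
home_from_tilde=~

# A tilde followed by a slash expands to the home directory plus the rest of the path.
docs_path=~/Documents

# Quoting the tilde suppresses the expansion, leaving a literal "~".
quoted_tilde="~"

echo "bare tilde:   $home_from_tilde"
echo "tilde + path: $docs_path"
echo "quoted tilde: $quoted_tilde"
```

Because the expansion happens in the shell, the cd command never sees the tilde itself; it receives the already expanded absolute path.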
6.2.6 Step 6 Use the echo command below to display some other examples of using the tilde as part of a path:
echo ~ ~sysadmin ~root ~mail ~nobody
sysadmin@localhost:~$ echo ~ ~sysadmin ~root ~mail ~nobody
/home/sysadmin /home/sysadmin /root /var/mail /nonexistent
sysadmin@localhost:~$
6.2.7 Step 7 Attempt to change to the home directory of the root user by typing the following command:
cd ~root
sysadmin@localhost:~$ cd ~root
-bash: cd: /root: Permission denied
sysadmin@localhost:~$
Notice the error message; it indicates that the shell attempted to execute cd with /root as an argument and it failed due to permission being denied. You will learn more about file and directory permissions in a later lab.
6.2.8 Step 8 Using an absolute path, change to the /usr/bin directory and display the working directory by using the following commands:
cd /usr/bin
pwd
sysadmin@localhost:~$ cd /usr/bin
sysadmin@localhost:/usr/bin$ pwd
/usr/bin
sysadmin@localhost:/usr/bin$
6.2.9 Step 9 Use an absolute path to change to the /usr directory and display the working directory by issuing the following commands:
cd /usr
pwd
sysadmin@localhost:/usr/bin$ cd /usr
sysadmin@localhost:/usr$ pwd
/usr
sysadmin@localhost:/usr$
6.2.10 Step 10 Use an absolute path to change to the /usr/share/doc directory and display the working directory by issuing the following commands:
cd /usr/share/doc
pwd
sysadmin@localhost:/usr$ cd /usr/share/doc
sysadmin@localhost:/usr/share/doc$ pwd
/usr/share/doc
sysadmin@localhost:/usr/share/doc$
Absolute vs. Relative pathnames
Suppose you are in the /usr/share/doc directory and you want to go to the /usr/share/doc/bash directory. Typing the command cd /usr/share/doc/bash results in a fair amount of typing. In cases like this, you want to use relative pathnames. With relative pathnames you provide "directions" of where you want to go from the current directory. The following examples will illustrate using relative pathnames.
6.2.11 Step 11 Using a relative path, change to the /usr/share/doc/bash directory and display the working directory by issuing the following commands:
cd bash
pwd
sysadmin@localhost:/usr/share/doc$ cd bash
sysadmin@localhost:/usr/share/doc/bash$ pwd
/usr/share/doc/bash
sysadmin@localhost:/usr/share/doc/bash$
Note: If there wasn't a bash directory under the current directory, the previous command would fail.
6.2.12 Step 12 Use a relative path to change to the directory above the current directory:
cd ..
pwd
sysadmin@localhost:/usr/share/doc/bash$ cd ..
sysadmin@localhost:/usr/share/doc$ pwd
/usr/share/doc
sysadmin@localhost:/usr/share/doc$
The .. represents one level above your current directory location.
6.2.13 Step 13 Use a relative path to change up one level from the current directory and then down into the dict directory:
cd ../dict
pwd
sysadmin@localhost:/usr/share/doc$ cd ../dict
sysadmin@localhost:/usr/share/dict$ pwd
/usr/share/dict
sysadmin@localhost:/usr/share/dict$
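The relative moves from Steps 11 to 13 can be replayed on a throwaway directory tree, so the result does not depend on what is installed under /usr/share. This is only a sketch: the temporary directory from mktemp and the variable names are assumptions made for the illustration.

```shell
#!/bin/sh
# Build a small tree that mirrors share/doc/bash and share/dict.
start_dir=$PWD
demo=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$demo/share/doc/bash" "$demo/share/dict"

cd "$demo/share/doc"      # absolute path: starts at the root directory

cd bash                   # relative path: down into a subdirectory
in_bash=$(pwd -P)

cd ..                     # relative path: up one level
back_in_doc=$(pwd -P)

cd ../dict                # relative path: up one level, then down into dict
in_dict=$(pwd -P)

echo "$in_bash"
echo "$back_in_doc"
echo "$in_dict"

# Return to where the script started and remove the temporary tree.
cd "$start_dir"
rm -r "$demo"
```

Each relative path is interpreted from the directory the shell is in at that moment, which is why the same cd .. gives different results depending on where it is run.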
6.3 Listing Files and Directories In this task, you will explore how to list files and directories.
6.3.1 Step 1 To list the contents of the current directory, use the ls command:
cd
ls
Your output should be similar to the following:
sysadmin@localhost:/usr/share/dict$ cd
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$
In the output of the previous ls command the file names were placed in a light blue color. This is a feature that many distributions of Linux automatically provide through a feature called an alias (more on this feature in a later lab). The color indicates what type the item is. The following table describes some of the more common colors:
Color            Type of File
Black or White   Regular file
Blue             Directory file
Cyan             Symbolic link file (a file that points to another file)
Green            Executable file (AKA, a program)
6.3.2 Step 2 Not all files are displayed by default. There are files, called hidden files, that are not displayed by default. To display all files, including hidden files, use the -a option to the ls command:
ls -a
sysadmin@localhost:~$ ls -a
.             .bashrc   .selected_editor  Downloads  Public
..            .cache    Desktop           Music      Templates
.bash_logout  .profile  Documents         Pictures   Videos
sysadmin@localhost:~$
Hidden files begin with a period (a dot character). Typically these files and often directories are hidden because they are not files you normally want to see. For example, the .bashrc file shown in the example above contains configuration information for the bash shell. This is a file that you normally don't need to view on a regular basis. Two important "dot files" exist in every directory: . (which represents the current directory) and .. (which represents the directory above the current directory).
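To see the effect of the leading dot without touching your home directory, the following sketch creates two files in a temporary directory and compares ls with ls -a. The file names are invented for the example.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/visible.txt" "$demo/.hidden_config"

# Without options, names that start with a dot are left out.
plain_listing=$(ls "$demo")

# With -a, the hidden file appears, along with the . and .. entries.
full_listing=$(ls -a "$demo")

echo "ls:"
echo "$plain_listing"
echo "ls -a:"
echo "$full_listing"

rm -r "$demo"
```

Nothing else distinguishes a hidden file; renaming .hidden_config to hidden_config would make it show up in a plain ls.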
6.3.3 Step 3 By itself, the ls command just provided the names of the files and directories within the specified (or current) directory. Execute the following command to see how the -l option provides more information about a file:
ls -l /etc/hosts
Your output should be similar to the following:
sysadmin@localhost:~$ ls -l /etc/hosts
-rw-r--r-- 1 root root 150 Jan 22 15:18 /etc/hosts
sysadmin@localhost:~$
So, what does all of this extra output mean? The following table provides a brief breakdown of what each part of the output of ls -l means:
-              The first character, a - in the previous example, indicates what type of "file" this is. A - character is for a plain file while a d character would be for a directory.
rw-r--r--      This represents the permissions of the file. Permissions are discussed in a later lab.
1              This represents something called a hard link count (discussed later).
root           The user owner of the file.
root           The group owner of the file.
150            The size of the file in bytes.
Jan 22 15:18   The date/time when the file was last modified.
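The fields in the table can also be picked apart in a script. The sketch below is an illustration rather than part of the lab: it uses ls -ld (list the entry itself) and ls -ln (numeric owner and group, so the columns never contain spaces), neither of which appears in the steps above.

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'hello' > "$demo/plain.txt"   # a regular file holding exactly 5 bytes
mkdir "$demo/subdir"

# The first character of the long listing is the file type.
file_type=$(ls -ld "$demo/plain.txt" | cut -c1)
dir_type=$(ls -ld "$demo/subdir" | cut -c1)

# Field 2 is the hard link count and field 5 is the size in bytes.
link_count=$(ls -ln "$demo/plain.txt" | awk '{print $2}')
size_bytes=$(ls -ln "$demo/plain.txt" | awk '{print $5}')

echo "type of plain.txt: $file_type"
echo "type of subdir:    $dir_type"
echo "hard links:        $link_count"
echo "size in bytes:     $size_bytes"

rm -r "$demo"
```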
6.3.4 Step 4 Sometimes you want to see not only the contents of a directory, but also the contents of the subdirectories. You can use the -R option to accomplish this:
ls -R /etc/udev
sysadmin@localhost:~$ ls -R /etc/udev
/etc/udev:
rules.d  udev.conf

/etc/udev/rules.d:
70-persistent-cd.rules  README
sysadmin@localhost:~$
The -R option stands for "recursive". All of the files in the /etc/udev directory will be displayed as well as all of the files in each subdirectory, in this case the rules.d subdirectory. Be careful of the -R option. Some directories are very, very large!
6.3.5 Step 5 You can use file globbing (wildcards) to limit which files or directories you see. For example, the * character can match "zero or more of any characters" in a filename. Execute the following command to display only the files that begin with the letter s in the /etc directory:
ls -d /etc/s*
Your output should be similar to the following:
sysadmin@localhost:~$ ls -d /etc/s*
/etc/securetty  /etc/sgml     /etc/shells  /etc/ssl        /etc/sysctl.conf
/etc/security   /etc/shadow   /etc/skel    /etc/sudoers    /etc/sysctl.d
/etc/services   /etc/shadow-  /etc/ssh     /etc/sudoers.d  /etc/systemd
sysadmin@localhost:~$
Note that the -d option prevents files from subdirectories from being displayed. It should always be used with the ls command when you are using file globbing.
6.3.6 Step 6 The ? character can be used to match exactly 1 character in a file name. Execute the following command to display all of the files in the /etc directory whose names are exactly four characters long:
ls -d /etc/????
Your output should be similar to the following:
sysadmin@localhost:~$ ls -d /etc/????
/etc/bind  /etc/init  /etc/motd  /etc/perl  /etc/skel
/etc/dpkg  /etc/ldap  /etc/mtab  /etc/sgml  /etc/udev
sysadmin@localhost:~$
6.3.7 Step 7 By using square brackets [ ] you can specify a single character to match from a set of characters. Execute the following command to display all of the files in the /etc directory that begin with the letters a, b, c or d:
ls -d /etc/[abcd]*
Your output should be similar to the following:
sysadmin@localhost:~$ ls -d /etc/[abcd]*
/etc/adduser.conf            /etc/blkid.conf            /etc/cron.weekly
/etc/adjtime                 /etc/blkid.tab             /etc/crontab
/etc/alternatives            /etc/ca-certificates       /etc/dbus-1
/etc/apparmor.d              /etc/ca-certificates.conf  /etc/debconf.conf
/etc/apt                     /etc/calendar              /etc/debian_version
/etc/bash.bashrc             /etc/cron.d                /etc/default
/etc/bash_completion.d       /etc/cron.daily            /etc/deluser.conf
/etc/bind                    /etc/cron.hourly           /etc/depmod.d
/etc/bindresvport.blacklist  /etc/cron.monthly          /etc/dpkg
sysadmin@localhost:~$
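The three glob patterns from Steps 5 to 7 can be compared side by side on a small, predictable set of names. This is a sketch under assumptions: the file names are invented, and the directory is a temporary one rather than /etc.

```shell
#!/bin/sh
start_dir=$PWD
demo=$(mktemp -d)
cd "$demo"

# Seven files and one directory with known names.
touch apple bind core dpkg sgml ssh zebra
mkdir skel
touch skel/inner

# * matches zero or more characters: every name that starts with s.
star_matches=$(ls -d s*)

# ? matches exactly one character: every name that is four characters long.
four_char_matches=$(ls -d ????)

# [abcd] matches one character from the set: names starting with a, b, c or d.
set_matches=$(ls -d [abcd]*)

echo "s*:";       echo "$star_matches"
echo "????:";     echo "$four_char_matches"
echo "[abcd]*:";  echo "$set_matches"

cd "$start_dir"
rm -r "$demo"
```

Note that skel is reported by name because of -d; without that option ls would list the contents of the skel directory instead.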
6.4 Copying, Moving and Renaming Files and Directories In this task, you will copy, move, and remove files and directories.
6.4.1 Step 1 Make a copy of the /etc/hosts file and place it in the current directory. List the contents of the current directory before and after the copy:
ls
cp /etc/hosts hosts
ls
Your output should be similar to the following:
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ cp /etc/hosts hosts
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  hosts
Documents  Music      Public    Videos
sysadmin@localhost:~$
Notice how the second ls command displays a copy of the hosts file.
6.4.2 Step 2 Next you will remove the file, then copy it again, but have the system tell you what is being done. This can be achieved using the -v or --verbose option. Enter the following commands:
rm hosts
ls
cp -v /etc/hosts hosts
ls
Note that the rm command is used to delete a file. More information on this command will be provided later in this lab. Your output should be similar to the following:
sysadmin@localhost:~$ rm hosts
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ cp -v /etc/hosts hosts
`/etc/hosts' -> `hosts'
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  hosts
Documents  Music      Public    Videos
sysadmin@localhost:~$
Note that the -v switch displays the source and target when the cp command is executed.
6.4.3 Step 3 Enter the following commands to copy the /etc/hosts file, using the period . character to indicate the current directory as the target:
rm hosts
ls
cp -v /etc/hosts .
ls
Your output should be similar to the following:
sysadmin@localhost:~$ rm hosts
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ cp -v /etc/hosts .
`/etc/hosts' -> `hosts'
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  hosts
Documents  Music      Public    Videos
sysadmin@localhost:~$
The period . character is a handy way to say "the current directory". It can be used with any Linux command that accepts a directory path, not just the cp command.
6.4.4 Step 4 Enter the following commands to copy from the source directory and preserve file attributes by using the -p option:
rm hosts
ls
cd /etc
ls -l hosts
cp -p hosts /home/sysadmin
cd
ls -l hosts
Your output should be similar to the following:
sysadmin@localhost:~$ rm hosts
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ cd /etc
sysadmin@localhost:/etc$ ls -l hosts
-rw-r--r-- 1 root root 150 Jan 22 15:18 hosts
sysadmin@localhost:/etc$ cp -p hosts /home/sysadmin
sysadmin@localhost:/etc$ cd
sysadmin@localhost:~$ ls -l hosts
-rw-r--r-- 1 sysadmin sysadmin 150 Jan 22 15:18 hosts
sysadmin@localhost:~$
Notice that the date and permission modes were preserved: the timestamp is the same for both the original and the copy (Jan 22 15:18 in the example above). The user and group owner still changed to sysadmin, because an ordinary user cannot create files owned by root. Your output may vary.
6.4.5 Step 5 Type the following commands to copy using a different target name:
rm hosts
cp -p /etc/hosts ~
cp hosts newname
ls -l hosts newname
rm hosts newname
sysadmin@localhost:~$ rm hosts
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$ cp -p /etc/hosts ~
sysadmin@localhost:~$ cp hosts newname
sysadmin@localhost:~$ ls -l hosts newname
-rw-r--r-- 1 sysadmin sysadmin 150 Jan 22 15:18 hosts
-rw-r--r-- 1 sysadmin sysadmin 150 Jan 22 16:29 newname
sysadmin@localhost:~$ rm hosts newname
sysadmin@localhost:~$
The first copy with the -p option preserved the original timestamp. Recall that the tilde ~ represents your home directory (/home/sysadmin). The second copy specified a different filename (newname) as the target. Because it was issued without the -p option, the system used the current date and time for the target; thus, it did not preserve the original timestamp found in the source file /etc/hosts. Finally, note that you can remove more than one file at a time as shown in the last rm command.
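The difference between the two copies can be reproduced without /etc/hosts. In this sketch the source file is backdated with touch -t so the effect is visible immediately; the file names are made up, and the -nt ("newer than") test is an extension supported by bash and most other shells rather than something used in the lab itself.

```shell
#!/bin/sh
demo=$(mktemp -d)
echo "127.0.0.1 localhost" > "$demo/hosts"

# Backdate the source file (format: CCYYMMDDhhmm).
touch -t 202001221518 "$demo/hosts"

cp -p "$demo/hosts" "$demo/preserved"   # keeps the source timestamp
cp "$demo/hosts" "$demo/newname"        # gets the current date and time

if [ "$demo/newname" -nt "$demo/hosts" ]; then
  plain_copy="newer than source"
else
  plain_copy="not newer than source"
fi

if [ "$demo/preserved" -nt "$demo/hosts" ] || [ "$demo/hosts" -nt "$demo/preserved" ]; then
  preserved_copy="timestamp differs"
else
  preserved_copy="same timestamp as source"
fi

echo "cp:    $plain_copy"
echo "cp -p: $preserved_copy"

rm -r "$demo"
```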
6.4.6 Step 6 To copy a directory together with all of the files in it, use the -R option. For this task, you will copy the /etc/udev directory and display the contents of the copied directory:
mkdir Myetc
cp -R /etc/udev Myetc
ls -l Myetc
ls -lR Myetc
sysadmin@localhost:~$ mkdir Myetc
sysadmin@localhost:~$ cp -R /etc/udev Myetc
sysadmin@localhost:~$ ls -l Myetc
total 0
drwxr-xr-x 1 sysadmin sysadmin 32 Jan 22 16:35 udev
sysadmin@localhost:~$ ls -lR Myetc
Myetc:
total 0
drwxr-xr-x 1 sysadmin sysadmin 32 Jan 22 16:35 udev

Myetc/udev:
total 4
drwxr-xr-x 1 sysadmin sysadmin  56 Jan 22 16:35 rules.d
-rw-r--r-- 1 sysadmin sysadmin 218 Jan 22 16:35 udev.conf

Myetc/udev/rules.d:
total 8
-rw-r--r-- 1 sysadmin sysadmin  306 Jan 22 16:35 70-persistent-cd.rules
-rw-r--r-- 1 sysadmin sysadmin 1157 Jan 22 16:35 README
sysadmin@localhost:~$
6.4.7 Step 7 To remove a directory use the -r option to the rm command:
ls
rm -r Myetc
ls
Your output should be similar to the following:
sysadmin@localhost:~$ ls
Desktop    Downloads  Myetc     Public     Videos
Documents  Music      Pictures  Templates
sysadmin@localhost:~$ rm -r Myetc
sysadmin@localhost:~$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  Videos
sysadmin@localhost:~$
Note that the rmdir command can also be used to delete directories, but only if the directory is empty (if it contains no files). Also note the -r option. This option removes directories and their contents recursively.
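The contrast between rmdir and rm -r can be checked safely on a temporary directory. The sketch below uses invented names that echo the Myetc example; the result variables exist only for the illustration.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/Myetc/udev"
touch "$demo/Myetc/udev/udev.conf"

# rmdir refuses to delete a directory that still has contents.
if rmdir "$demo/Myetc" 2>/dev/null; then
  rmdir_result="removed"
else
  rmdir_result="refused"
fi

# rm -r deletes the directory and everything below it.
rm -r "$demo/Myetc"
if [ -e "$demo/Myetc" ]; then
  rm_result="still there"
else
  rm_result="removed"
fi

echo "rmdir on a non-empty directory: $rmdir_result"
echo "rm -r on the same directory:    $rm_result"

rmdir "$demo"   # the temporary directory is now empty, so rmdir succeeds
```

Because rm -r does not ask for confirmation by default, it is worth running ls on the target first, as Step 7 does.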
6.4.8 Step 8 Moving a file is analogous to a "cut and paste". The file is “cut” (removed) from the original location and “pasted” to the specified destination. Move a file in the local directory by executing the following commands:
touch premove
ls
mv premove postmove
ls
rm postmove
Linux Command          Description
touch premove          Creates an empty file called premove
mv premove postmove    This command “cuts” the premove file and “pastes” it to a file called postmove
rm postmove            Removes postmove file
Your output should be similar to the following:
sysadmin@localhost:~$ touch premove
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  premove
Documents  Music      Public    Videos
sysadmin@localhost:~$ mv premove postmove
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  postmove
Documents  Music      Public    Videos
sysadmin@localhost:~$
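One way to see that mv within a directory only changes the name is to compare inode numbers before and after the move. This is a sketch rather than part of the lab: ls -i (print the inode number) is not used in the steps above, and the variable names are made up.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/premove"

# Record the inode number, the identifier the directory entry points to.
inode_before=$(ls -i "$demo/premove" | awk '{print $1}')

mv "$demo/premove" "$demo/postmove"

inode_after=$(ls -i "$demo/postmove" | awk '{print $1}')

if [ -e "$demo/premove" ]; then
  old_name="still exists"
else
  old_name="gone"
fi

echo "inode before: $inode_before"
echo "inode after:  $inode_after"
echo "old name:     $old_name"

rm -r "$demo"
```

The data on disk is not copied; only the directory entry is rewritten, which is why moving a large file inside one filesystem is nearly instant.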
Chapter 7 7.1 Introduction In this chapter, we discuss how to manage archive files at the command line.
File archiving is used when one or more files need to be transmitted or stored as efficiently as possible. There are two aspects to this:
Archiving – Combining multiple files into one, which eliminates the overhead in individual files and makes it easier to transmit
Compressing – Making the files smaller by removing redundant information
You can archive multiple files into a single archive and then compress it, or you can compress an individual file. The former is still referred to as archiving, while the latter is just called compression. When you take an archive, decompress it and extract one or more files, you are un-archiving it. Even though disk space is relatively cheap, archiving and compression still have value:
If you want to make a large number of files available, such as the source code to an application or a collection of documents, it is easier for people to download a compressed archive than it is to download individual files.
Log files have a habit of filling disks so it is helpful to split them by date and compress older versions.
When you back up directories, it is easier to keep them all in one archive than it is to version each file.
Some streaming devices such as tapes perform better if you’re sending a stream of data rather than individual files.
It can often be faster to compress a file before you send it to a tape drive or over a slower network and decompress it on the other end than it would be to send it uncompressed.
As a Linux administrator, you should become familiar with the tools for archiving and compressing files.
7.2 Compressing files Compressing files makes them smaller by removing duplication from a file and storing it such that the file can be restored. A file with human readable text might have frequently used words replaced by something smaller, or an image with a solid background might represent patches of that color by a code. You generally don’t use the compressed version of the file; instead you decompress it before use. The compression algorithm is a procedure the computer follows to encode the original file and, as a result, make it smaller. Computer scientists research these algorithms and come up with better ones that can work faster or make the output file smaller.
When talking about compression, there are two types:
Lossless: No information is removed from the file. Compressing a file and decompressing it leaves something identical to the original.
Lossy: Information might be removed from the file as it is compressed so that uncompressing a file will result in a file that is slightly different than the original. For instance, an image with two subtly different shades of green might be made smaller by treating those two shades as the same. Often, the eye can’t pick out the difference anyway.
Generally human eyes and ears don’t notice slight imperfections in pictures and audio, especially as they are displayed on a monitor or played over speakers. Lossy compression often benefits media because it results in smaller file sizes and people can’t tell the difference between the original and the version with the changed data. For things that must remain intact, such as documents, logs, and software, you need lossless compression. Most image formats implement some kind of compression: JPEG is lossy, while GIF and PNG compress the stored pixel data losslessly. With a lossy format you can generally decide how much quality you want to preserve. A lower quality results in a smaller file, but after decompression you may notice artifacts such as rough edges or discolorations. High quality will look much like the original image, but the file size will be closer to the original.
Compressing an already compressed file will not make it smaller. This is often forgotten when it comes to images, since they are already stored in a compressed format. With lossless compression, this multiple compression is not a problem, but if you compress and decompress a file several times using a lossy algorithm you will eventually have something that is unrecognizable. Linux provides several tools to compress files; the most common is gzip. Here we show a log file before and after compression.
bob:tmp $ ls -l access_log*
-rw-r--r-- 1 sean sean 372063 Oct 11 21:24 access_log
bob:tmp $ gzip access_log
bob:tmp $ ls -l access_log*
-rw-r--r-- 1 sean sean 26080 Oct 11 21:24 access_log.gz
In the example above, there is a file called access_log that is 372,063 bytes. The file is compressed by invoking the gzip command with the name of the file as the only argument. After that command completes, the original file is gone and a compressed version with a file extension of .gz is left in its place. The file size is now 26,080 bytes, giving a compression ratio of about 14:1, which is common with log files. Gzip will give you this information if you ask, by using the -l parameter, as shown here:
bob:tmp $ gzip -l access_log.gz
compressed  uncompressed  ratio  uncompressed_name
     26080        372063  93.0%  access_log
Here, you can see that the compression ratio is given as 93%, the share of the original size that was removed. This is the 14:1 ratio seen from the other side, i.e. about 13/14. Additionally, when the file is decompressed it will be called access_log.
bob:tmp $ gunzip access_log.gz
bob:tmp $ ls -l access_log*
-rw-r--r-- 1 sean sean 372063 Oct 11 21:24 access_log
The opposite of the gzip command is gunzip. Alternatively, gzip -d does the same thing (gunzip is just a script that calls gzip with the right parameters). After gunzip does its work you can see that the access_log file is back to its original size.
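The round trip and the ratio arithmetic can be sketched as follows. The generated "log" is an invented stand-in for a real access log, so the sizes it produces will not match the figures above; the two awk lines recompute the 14:1 and 93% figures from the byte counts quoted in the text.

```shell
#!/bin/sh
demo=$(mktemp -d)

# Build a repetitive log so there is plenty of redundancy to remove.
i=0
while [ "$i" -lt 2000 ]; do
  echo "127.0.0.1 - - GET /index.html 200"
  i=$((i + 1))
done > "$demo/access_log"
cp "$demo/access_log" "$demo/original_copy"

original_size=$(wc -c < "$demo/access_log" | tr -d ' ')
gzip "$demo/access_log"            # replaces access_log with access_log.gz
compressed_size=$(wc -c < "$demo/access_log.gz" | tr -d ' ')
gunzip "$demo/access_log.gz"       # restores access_log

# Lossless: the restored file is byte-for-byte identical to the original.
if cmp -s "$demo/access_log" "$demo/original_copy"; then
  roundtrip="identical"
else
  roundtrip="different"
fi

# The figures from the text: 372063 bytes before, 26080 bytes after.
ratio=$(awk 'BEGIN { printf "%.1f", 372063 / 26080 }')
saved=$(awk 'BEGIN { printf "%.1f", (1 - 26080 / 372063) * 100 }')

echo "sizes: $original_size -> $compressed_size bytes ($roundtrip after gunzip)"
echo "text example: about $ratio:1, or $saved% removed"

rm -r "$demo"
```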
Gzip can also act as a filter which means it doesn’t read or write anything to disk but instead receives data through an input channel and writes it out to an output channel. You’ll learn more about how this works in the next chapter, so the next example just gives you an idea of what you can do by being able to compress a stream.
bob:tmp $ mysqldump -A | gzip > database_backup.gz
bob:tmp $ gzip -l database_backup.gz
compressed  uncompressed  ratio  uncompressed_name
     76866       1028003  92.5%  database_backup
The mysqldump -A command outputs the contents of the local MySQL databases to the console. The | character (pipe) says “redirect the output of the previous command into the input of the next one”. The program to receive the output is gzip, which recognizes that no filenames were given so it should operate in pipe mode. Finally, the > database_backup.gz means “redirect the output of the previous command into a file called database_backup.gz”. Inspecting this file with gzip -l shows that the compressed version is 7.5% of the size of the original, with the added benefit that the larger file never had to be written to disk.
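The same filter pattern works with any command that writes to standard output. In the sketch below printf stands in for mysqldump, which is an assumption made so the example runs without a database; gzip -dc (decompress to standard output) is used to read the data back.

```shell
#!/bin/sh
demo=$(mktemp -d)

# printf plays the role of a command that writes its data to standard output.
# gzip is given no file name, so it compresses standard input to standard output.
printf 'row %s\n' 1 2 3 | gzip > "$demo/backup.gz"

# gzip -dc decompresses to standard output and leaves backup.gz in place.
restored=$(gzip -dc "$demo/backup.gz")

echo "$restored"

rm -r "$demo"
```

At no point does an uncompressed copy of the data exist on disk; it only passes through the pipe.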
There is another pair of commands that operate virtually identically to gzip and gunzip. These are bzip2 and bunzip2. The bzip2 utilities use a different compression algorithm (called Burrows-Wheeler block sorting, versus Lempel-Ziv coding used by gzip) that can compress files smaller than gzip at the expense of more CPU time. You can recognize these files because they have a .bz or .bz2 extension instead of .gz.
7.3 Archiving Files If you had several files to send to someone, you could compress each one individually. You would have a smaller amount of data in total than if you sent uncompressed files, but you would still have to deal with many files at one time. Archiving is the solution to this problem. The traditional UNIX utility to archive files is called tar, which is a short form of TApe aRchive. Tar was used to stream many files to a tape for backups or file transfer. Tar takes in several files and creates a single output file that can be split up again into the original files on the other end of the transmission.
Tar has 3 modes you will want to be familiar with:
Create: make a new archive out of a series of files
Extract: pull one or more files out of an archive
List: show the contents of the archive without extracting
Remembering the modes is key to figuring out the command line options necessary to do what you want. In addition to the mode, you will also want to make sure you remember where to specify the name of the archive, as you may be entering multiple file names on a command line. Here, we show a tar file, also called a tarball, being created from multiple access logs.
bob:tmp $ tar -cf access_logs.tar access_log*
bob:tmp $ ls -l access_logs.tar
-rw-rw-r-- 1 sean sean 542720 Oct 12 21:42 access_logs.tar
Creating an archive requires two named options. The first, c, specifies the mode. The second, f, tells tar to expect a file name as the next argument. The first argument in the example above names the archive to create, access_logs.tar. The remaining arguments are all taken to be input file names, either as a wildcard, a list of files, or both. In this example, we use a wildcard to include all files that begin with access_log. The example above then does a long directory listing of the created file. The final size is 542,720 bytes which is slightly larger than the input files. Tarballs can be compressed for easier transport, either by gzipping the archive or by having tar do it with the z flag as follows:
bob:tmp $ tar -czf access_logs.tar.gz access_log*
bob:tmp $ ls -l access_logs.tar.gz
-rw-rw-r-- 1 sean sean 46229 Oct 12 21:50 access_logs.tar.gz
bob:tmp $ gzip -l access_logs.tar.gz
compressed  uncompressed  ratio  uncompressed_name
     46229        542720  91.5%  access_logs.tar
The example above shows the same command as the prior example, but with the addition of the z parameter. The output is much smaller than the tarball itself, and the resulting file is compatible with gzip. You can see from the last command that the uncompressed file is the same size as it would be if you tarred it in a separate step. While UNIX doesn’t treat file extensions specially, the convention is to use .tar for tar files, and .tar.gz or .tgz for compressed tar files. You can use bzip2 instead of gzip by substituting the letter j for z and using .tar.bz2, .tbz, or .tbz2 for a file extension (e.g. tar -cjf file.tbz access_log*).
Given a tar file, compressed or not, you can see what’s in it by using the t option:
bob:tmp $ tar -tjf access_logs.tbz
logs/
logs/access_log.3
logs/access_log.1
logs/access_log.4
logs/access_log
logs/access_log.2
This example uses 3 options:
t: list files in the archive
j: decompress with bzip2 before reading
f: operate on the given filename access_logs.tbz
The contents of the compressed archive are then displayed. You can see that a directory was prefixed to the files. Tar will recurse into subdirectories automatically when archiving and will store the path info inside the archive. Just to show that this file is still nothing special, we will list the contents of the file in two steps using a pipeline.
bob:tmp $ bunzip2 -c access_logs.tbz | tar -t
logs/
logs/access_log.3
logs/access_log.1
logs/access_log.4
logs/access_log
logs/access_log.2
The left side of the pipeline is bunzip2 -c access_logs.tbz, which decompresses the file, but the -c option sends the output to the screen. The output is redirected to tar -t. If you don’t specify a file with -f then tar will read from the standard input, which in this case is the uncompressed file.
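The two-step listing can be tried on a small archive built on the spot. This sketch swaps in gzip for bzip2, purely so that it depends on one compression tool only, and passes -f - to tar so that reading from standard input is explicit rather than relying on tar's built-in default. The file names are invented.

```shell
#!/bin/sh
start_dir=$PWD
demo=$(mktemp -d)
cd "$demo"

mkdir logs
echo "first"  > logs/access_log
echo "second" > logs/access_log.1
tar -czf access_logs.tar.gz logs

# Step 1: decompress to standard output. Step 2: tar lists what it reads from
# standard input; "-f -" tells tar that the archive is the input stream.
listing=$(gzip -dc access_logs.tar.gz | tar -tf -)

echo "$listing"

cd "$start_dir"
rm -r "$demo"
```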
Finally you can extract the archive with the x flag:
bob:tmp $ tar -xjf access_logs.tbz
bob:tmp $ ls -l
total 36
-rw-rw-r-- 1 sean sean 30043 Oct 14 13:27 access_logs.tbz
drwxrwxr-x 2 sean sean  4096 Oct 14 13:26 logs
bob:tmp $ ls -l logs
total 536
-rw-r--r-- 1 sean sean 372063 Oct 11 21:24 access_log
-rw-r--r-- 1 sean sean    362 Oct 12 21:41 access_log.1
-rw-r--r-- 1 sean sean 153813 Oct 12 21:41 access_log.2
-rw-r--r-- 1 sean sean   1136 Oct 12 21:41 access_log.3
-rw-r--r-- 1 sean sean    784 Oct 12 21:41 access_log.4
The example above uses a similar pattern as before, specifying the operation (eXtract), the compression (the j flag, meaning bzip2), and a file name (-f access_logs.tbz). The original file is untouched and the new logs directory is created. Inside the directory are the files.
Add the v flag and you will get verbose output of the files processed. This is helpful so you can see what’s happening:
bob:tmp $ tar -xjvf access_logs.tbz
logs/
logs/access_log.3
logs/access_log.1
logs/access_log.4
logs/access_log
logs/access_log.2
It is important to keep the f flag at the end of the group of options, as tar assumes whatever follows it is a filename. In the next example, the f and v flags were transposed, leading to tar interpreting the command as an operation on a file called "v" (the relevant message is the first line of the output).
bob:tmp $ tar -xjfv access_logs.tbz
tar (child): v: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
If you only want some files out of the archive you can add their names to the end of the command, but by default they must match the name in the archive exactly or use a pattern: bob:tmp $ tar -xjvf access_logs.tbz logs/access_log
logs/access_log
The example above shows the same archive as before, but extracting only the logs/access_log file. The output of the command (as verbose mode was requested with the v flag) shows only the one file has been extracted. Tar has many more features, such as the ability to use patterns when extracting files, excluding certain files, or outputting the extracted files to the screen instead of disk. The documentation for tar has in depth information.
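The three modes can be exercised end to end on a scratch directory. The sketch below uses the z flag (gzip) instead of j, an assumption made so that it only needs gzip to be installed, and the file names and contents are invented.

```shell
#!/bin/sh
start_dir=$PWD
demo=$(mktemp -d)
cd "$demo"

mkdir logs
echo "current" > logs/access_log
echo "older"   > logs/access_log.1

# Create: c = create, z = compress with gzip, f = the archive name follows.
tar -czf access_logs.tar.gz logs

# List: t = show the members without extracting them.
member_count=$(tar -tzf access_logs.tar.gz | wc -l | tr -d ' ')

# Extract: x = extract. Only the named member is pulled out, and its name
# must match the path stored in the archive.
mkdir restore
(cd restore && tar -xzf ../access_logs.tar.gz logs/access_log)

extracted=$(cat restore/logs/access_log)
if [ -e restore/logs/access_log.1 ]; then
  other_member="extracted"
else
  other_member="not extracted"
fi

echo "members in archive:  $member_count"
echo "restored content:    $extracted"
echo "logs/access_log.1:   $other_member"

cd "$start_dir"
rm -r "$demo"
```

The member count is three because tar stores the logs directory itself as an entry in addition to the two files.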
7.4 ZIP files
The de facto archiving utility in the Microsoft world is the ZIP file. It is not as prevalent in Linux but is well supported by the zip and unzip commands. With tar and gzip/gunzip the same commands and options can be used to do the creation and extraction, but this is not the case with zip. The same option has different meanings for the two different commands.
The default mode of zip is to add files to an archive and compress it.
bob:tmp $ zip logs.zip logs/*
  adding: logs/access_log (deflated 93%)
  adding: logs/access_log.1 (deflated 62%)
  adding: logs/access_log.2 (deflated 88%)
  adding: logs/access_log.3 (deflated 73%)
  adding: logs/access_log.4 (deflated 72%)
The first argument in the example above is the name of the archive to be operated on; in this case it is logs.zip. After that is a list of files to be added. The output shows the files and the compression ratio. Note that tar requires the f option to indicate that a filename is being passed, while zip and unzip always require a filename, so you do not need to say explicitly that one is being passed.
Zip will not recurse into subdirectories by default, which is different behavior than tar. That is, merely adding logs instead of logs/* will only add the empty directory and not the files under it. If you want tar-like behavior, you must use the -r option to indicate recursion is to be used:
bob:tmp $ zip -r logs.zip logs
  adding: logs/ (stored 0%)
  adding: logs/access_log.3 (deflated 73%)
  adding: logs/access_log.1 (deflated 62%)
  adding: logs/access_log.4 (deflated 72%)
  adding: logs/access_log (deflated 93%)
  adding: logs/access_log.2 (deflated 88%)
In the example above, all files under the logs directory are added because it uses the r option. The first line of output indicates that a directory was added to the archive, but otherwise the output is similar to the previous example.
Listing files in the zip is done by the unzip command and the l option (list):
bob:tmp $ unzip -l logs.zip
Archive:  logs.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  10-14-2013 14:07   logs/
     1136  10-14-2013 14:07   logs/access_log.3
      362  10-14-2013 14:07   logs/access_log.1
      784  10-14-2013 14:07   logs/access_log.4
    90703  10-14-2013 14:07   logs/access_log
   153813  10-14-2013 14:07   logs/access_log.2
---------                     -------
   246798                     6 files
Extracting the files is just like creating the archive, as the default operation is to extract:
bob:tmp $ unzip logs.zip
Archive:  logs.zip
   creating: logs/
  inflating: logs/access_log.3
  inflating: logs/access_log.1
  inflating: logs/access_log.4
  inflating: logs/access_log
  inflating: logs/access_log.2
Here, we extract all the files in the archive to the current directory. Just like tar, you can pass filenames on the command line:
bob:tmp $ unzip logs.zip access_log
Archive:  logs.zip
caution: filename not matched:  access_log
bob:tmp $ unzip logs.zip logs/access_log
Archive:  logs.zip
  inflating: logs/access_log
bob:tmp $ unzip logs.zip logs/access_log.*
Archive:  logs.zip
  inflating: logs/access_log.3
  inflating: logs/access_log.1
  inflating: logs/access_log.4
  inflating: logs/access_log.2
The example above shows three different attempts to extract a file. First, just the name of the file is passed without the directory component. Like tar, the file is not matched.
The second attempt passes the directory component along with the filename, which extracts just that file. The third version uses a wildcard, which extracts the 4 files matching the pattern, just like tar. The zip and unzip man pages describe the other things you can do with these tools, such as replace files within the archive, use different compression levels, and even use encryption.
Lab 7
7.1 Introduction
This is Lab 7: Archiving and Unarchiving Files. By performing this lab, students will learn how to work with archive files. In this lab, you will perform the following tasks:
Create archive files using tar with and without compression
Compress and uncompress files into a gzip archive file
Compress and uncompress files into a bzip2 archive file
Use zip and unzip to compress and uncompress archive files
7.2 Archiving Commands
In this task, we will use gzip, bzip2, tar and zip/unzip to archive and restore files. These commands are designed to either merge multiple files into a single file or compress a large file into a smaller one. In some cases the commands will perform both functions. The task of archiving data is important for several reasons, including but not limited to the following:
a. Large files may be difficult to transfer. Making these files smaller helps make transfer quicker.
b. Transferring multiple files from one system to another can become tedious when there are many files. Merging them into a single file for transport makes this process easier.
c. Files can quickly take up a lot of space, especially on smaller removable media like thumb drives. Archiving reduces this problem.
One potential area of confusion that a beginning Linux user may experience stems from the following question: why are there so many different archiving commands? The answer is that these commands differ in their features (for example, some of them allow you to password protect the archive file) and in the compression techniques they use. The most important thing to know for now is how these different commands function. Over time you will learn to pick the correct archive tool for any given situation.
7.2.1 Step 1
Use the following tar command to create an archive of the /etc/udev directory. Save the backup in the ~/mybackups directory:
cd
mkdir mybackups
tar -cvf mybackups/udev.tar /etc/udev
ls mybackups
Your output should be similar to the following:
sysadmin@localhost:~$ cd
sysadmin@localhost:~$ mkdir mybackups
sysadmin@localhost:~$ tar -cvf mybackups/udev.tar /etc/udev
tar: Removing leading `/' from member names
/etc/udev/
/etc/udev/rules.d/
/etc/udev/rules.d/70-persistent-cd.rules
/etc/udev/rules.d/README
/etc/udev/udev.conf
sysadmin@localhost:~$ ls mybackups/
udev.tar
sysadmin@localhost:~$
The tar command is used to merge multiple files into a single file. By default it does not compress the data. The -c option tells the tar command to create a tar file. The -v option stands for "verbose", which instructs the tar command to display what it is doing. The -f option is used to specify the name of the tar file.
FYI: tar stands for Tape ARchive. This command was originally used to create tape backups, but today it is more commonly used to create archive files.
Important: You are not required to use the .tar extension for the archive file name; however, it is helpful for determining the file type. It is considered "good style" when sending an archive file to another person.
7.2.2 Step 2
Display the contents of a tar file (t = list contents, v = verbose, f = filename):
tar -tvf mybackups/udev.tar
Your output should be similar to the following:
sysadmin@localhost:~$ tar -tvf mybackups/udev.tar
drwxr-xr-x root/root         0 2015-01-28 16:32 etc/udev/
drwxr-xr-x root/root         0 2015-01-28 16:32 etc/udev/rules.d/
-rw-r--r-- root/root       306 2015-01-28 16:32 etc/udev/rules.d/70-persistent-cd.rules
-rw-r--r-- root/root      1157 2012-04-05 19:18 etc/udev/rules.d/README
-rw-r--r-- root/root       218 2012-04-05 19:18 etc/udev/udev.conf
sysadmin@localhost:~$
Notice that files were backed up recursively using relative path names. This is important because when you extract the files, they will be placed under your current directory rather than overwriting the existing files in their original location.
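The relative member names can be checked directly. The sketch below is illustrative only: it archives a hypothetical directory by its absolute path inside a temporary location and then lists the first stored name. The warning that tar prints about the leading slash is discarded here with 2>/dev/null.

```shell
# Hypothetical paths under a temporary directory.
work=$(mktemp -d)
mkdir "$work/src"
echo "data" > "$work/src/file.txt"

# Archive using an absolute path; tar strips the leading "/" from each name.
tar -cf "$work/demo.tar" "$work/src" 2>/dev/null

# List the archive and keep only the first member name.
first_member=$(tar -tf "$work/demo.tar" | head -n 1)
echo "first member: $first_member"

rm -r "$work"
```

The name that is printed does not begin with a slash, which is why extraction lands under the current directory.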
7.2.3 Step 3
To create a tar file that is compressed, use the -z option:
tar -zcvf mybackups/udev.tar.gz /etc/udev
ls -lh mybackups
Your output should be similar to the following:
sysadmin@localhost:~$ tar -zcvf mybackups/udev.tar.gz /etc/udev
tar: Removing leading `/' from member names
/etc/udev/
/etc/udev/rules.d/
/etc/udev/rules.d/70-persistent-cd.rules
/etc/udev/rules.d/README
/etc/udev/udev.conf
sysadmin@localhost:~$ ls -lh mybackups/
total 16K
-rw-rw-r-- 1 sysadmin sysadmin  10K Jan 25 04:00 udev.tar
-rw-rw-r-- 1 sysadmin sysadmin 1.2K Jan 25 04:34 udev.tar.gz
sysadmin@localhost:~$
Notice the difference in size: the first backup (10 Kbytes) is larger than the second backup (1.2 Kbytes). The -z option makes use of the gzip utility to perform compression.
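The same comparison can be reproduced without root-owned files. This is a rough sketch under stated assumptions: the directory name data and the file sample.txt are hypothetical, and the sample text is deliberately repetitive so that it compresses well. Real files will show different ratios.

```shell
start_dir=$PWD
work=$(mktemp -d)
cd "$work"
mkdir data

# Build a repetitive sample file (hypothetical content).
i=0
while [ "$i" -lt 2000 ]; do
    echo "the same line of text repeated many times"
    i=$((i + 1))
done > data/sample.txt

tar -cf  plain.tar     data   # archive only
tar -czf packed.tar.gz data   # archive and compress with gzip

# wc -c reports the size in bytes.
plain_size=$(( $(wc -c < plain.tar) ))
packed_size=$(( $(wc -c < packed.tar.gz) ))
echo "uncompressed archive: $plain_size bytes"
echo "gzip archive:         $packed_size bytes"

cd "$start_dir"
rm -r "$work"
```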
7.2.4 Step 4
Extract the contents of an archive. Data is restored to the current directory by default:
cd mybackups
tar -xvf udev.tar.gz
ls
ls etc
ls etc/udev
ls etc/udev/rules.d
Your output should be similar to the following:
sysadmin@localhost:~$ cd mybackups
sysadmin@localhost:~/mybackups$ ls
udev.tar  udev.tar.gz
sysadmin@localhost:~/mybackups$ tar -xvf udev.tar.gz
etc/udev/
etc/udev/rules.d/
etc/udev/rules.d/70-persistent-cd.rules
etc/udev/rules.d/README
etc/udev/udev.conf
sysadmin@localhost:~/mybackups$ ls
etc  udev.tar  udev.tar.gz
sysadmin@localhost:~/mybackups$ ls etc
udev
sysadmin@localhost:~/mybackups$ ls etc/udev
rules.d  udev.conf
sysadmin@localhost:~/mybackups$ ls etc/udev/rules.d
70-persistent-cd.rules  README
sysadmin@localhost:~/mybackups$
If you wanted the files to "go back" into their original location, you could first cd to the / directory and then run the tar command. However, in this example, this would require you to be logged in as an administrator because creating files in the /etc directory can only be done by the administrator.
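As an alternative to changing directory first, tar accepts a -C option that switches to a given directory before extracting. The sketch below uses hypothetical directories under a temporary location rather than / so that no administrator rights are needed.

```shell
# Hypothetical layout: src/ holds the files to archive, restore/ is the target.
work=$(mktemp -d)
mkdir -p "$work/src/etc" "$work/restore"
echo "127.0.0.1 localhost" > "$work/src/etc/hosts"

# Create the archive with relative member names (etc/hosts).
start_dir=$PWD
cd "$work/src"
tar -cf "$work/demo.tar" etc
cd "$start_dir"

# -C behaves like "cd restore" followed by the extraction.
tar -xf "$work/demo.tar" -C "$work/restore"
restored=$(cat "$work/restore/etc/hosts")
echo "restored: $restored"

rm -r "$work"
```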
7.2.5 Step 5
To add a file to an existing archive, use the -r option to the tar command. Execute the following commands to perform this action and verify the existence of the new file in the tar archive:
tar -rvf udev.tar /etc/hosts
tar -tvf udev.tar
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ tar -rvf udev.tar /etc/hosts
tar: Removing leading `/' from member names
/etc/hosts
sysadmin@localhost:~/mybackups$ tar -tvf udev.tar
drwxr-xr-x root/root         0 2015-01-28 16:32 etc/udev/
drwxr-xr-x root/root         0 2015-01-28 16:32 etc/udev/rules.d/
-rw-r--r-- root/root       306 2015-01-28 16:32 etc/udev/rules.d/70-persistent-cd.rules
-rw-r--r-- root/root      1157 2012-04-05 19:18 etc/udev/rules.d/README
-rw-r--r-- root/root       218 2012-04-05 19:18 etc/udev/udev.conf
sysadmin@localhost:~/mybackups$
7.2.6 Step 6
In the following examples, you will use gzip and gunzip to compress and uncompress a file. Execute the following commands to compress a copy of the words file:
cp /usr/share/dict/words .
ls -l words
gzip words
ls -l words.gz
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ cp /usr/share/dict/words .
sysadmin@localhost:~/mybackups$ ls -l words
-rw-r--r-- 1 sysadmin sysadmin 938848 Jan 25 07:39 words
sysadmin@localhost:~/mybackups$ gzip words
sysadmin@localhost:~/mybackups$ ls -l words.gz
-rw-r--r-- 1 sysadmin sysadmin 255996 Jan 25 07:39 words.gz
sysadmin@localhost:~/mybackups$
Notice the size of the zipped file (255996 bytes in the example above) is much smaller than the original file (938848 bytes in the example above).
Very important: When you use gzip, the original file is replaced by the zipped file. In the example above, the file words was replaced with words.gz. When you unzip the file, the zipped file will be replaced with the original file.
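If you would rather keep the original file, one common approach is to have gzip write to STDOUT with its -c option and send that output to a new file. This is a small sketch using a hypothetical one-line words file in a temporary directory.

```shell
work=$(mktemp -d)
echo "some words" > "$work/words"

# -c writes the compressed data to STDOUT, so the original file is left in place.
gzip -c "$work/words" > "$work/words.gz"

# Check that both the original and the compressed copy exist.
if [ -f "$work/words" ] && [ -f "$work/words.gz" ]; then
    both_present=yes
else
    both_present=no
fi

# gunzip -c decompresses to STDOUT without removing the .gz file.
roundtrip=$(gunzip -c "$work/words.gz")
echo "both files present: $both_present"
echo "round trip:         $roundtrip"

rm -r "$work"
```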
7.2.7 Step 7
Execute the following commands to uncompress the words.gz file:
ls -l words.gz
gunzip words.gz
ls -l words
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ ls -l words.gz
-rw-r--r-- 1 sysadmin sysadmin 255996 Jan 25 07:39 words.gz
sysadmin@localhost:~/mybackups$ gunzip words.gz
sysadmin@localhost:~/mybackups$ ls -l words
-rw-r--r-- 1 sysadmin sysadmin 938848 Jan 25 07:39 words
sysadmin@localhost:~/mybackups$
Linux provides a large number of compression utilities in addition to gzip/gunzip. Each of them has pros and cons (faster compression, better compression rates, more flexibility, more portability, faster decompression, etc.). The gzip/gunzip commands are very popular in Linux, but you should be aware that bzip2/bunzip2 are also popular on some Linux distributions. Fortunately, most of the functionality (the way you run the commands) and options are the same as for gzip/gunzip.
7.2.8 Step 8
Using bzip2 and bunzip2 to compress and uncompress a file is very similar to using gzip and gunzip. The compressed file is created with a .bz2 extension. The extension is removed when uncompressed. Execute the following commands to compress a copy of the words file:
ls -l words
bzip2 words
ls -l words.bz2
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ ls -l words
-rw-r--r-- 1 sysadmin sysadmin 938848 Jan 25 07:39 words
sysadmin@localhost:~/mybackups$ bzip2 words
sysadmin@localhost:~/mybackups$ ls -l words.bz2
-rw-r--r-- 1 sysadmin sysadmin 335405 Jan 25 07:39 words.bz2
sysadmin@localhost:~/mybackups$
If you compare the resulting .bz2 file size (335405) to the .gz file size (255996) from Step #7, you will notice that gzip did a better job compressing this particular file.
7.2.9 Step 9
Execute the following commands to uncompress the words.bz2 file:
ls -l words.bz2
bunzip2 words.bz2
ls -l words
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ ls -l words.bz2
-rw-r--r-- 1 sysadmin sysadmin 335405 Jan 25 07:39 words.bz2
sysadmin@localhost:~/mybackups$ bunzip2 words.bz2
sysadmin@localhost:~/mybackups$ ls -l words
-rw-r--r-- 1 sysadmin sysadmin 938848 Jan 25 07:39 words
While gzip and bzip2 archive files are commonly used in Linux, the zip archive type is more commonly used by other operating systems, such as Windows. In fact, the Windows Explorer application has built-in support to extract zip archive files. Therefore, if you are planning to share an archive with Windows users, it is usually preferred to use the zip archive type. Unlike gzip and bzip2, when a file is compressed with the zip command, a copy of the original file is compressed and the original remains uncompressed.
7.2.10 Step 10
Use the zip command to compress the words file:
zip words.zip words
ls -l words.zip
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ zip words.zip words
  adding: words (deflated 73%)
sysadmin@localhost:~/mybackups$ ls -l words.zip
-rw-rw-r-- 1 sysadmin sysadmin 256132 Jan 25 21:25 words.zip
sysadmin@localhost:~/mybackups$
The first argument (words.zip in the example above) of the zip command is the file name that you wish to create. The remaining arguments (words in the example above) are the files that you want placed in the compressed file.
Important: You are not required to use the .zip extension for the compressed file name; however, it is helpful for determining the file type. It is also considered "good style" when sending an archive file to another person.
7.2.11 Step 11
Compress the /etc/udev directory and its contents with zip compression:
zip -r udev.zip /etc/udev
ls -l udev.zip
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ zip -r udev.zip /etc/udev
  adding: etc/udev/ (stored 0%)
  adding: etc/udev/rules.d/ (stored 0%)
  adding: etc/udev/rules.d/70-persistent-cd.rules (deflated 29%)
  adding: etc/udev/rules.d/README (deflated 50%)
  adding: etc/udev/udev.conf (deflated 24%)
sysadmin@localhost:~/mybackups$ ls -l udev.zip
-rw-rw-r-- 1 sysadmin sysadmin 1840 Jan 25 21:33 udev.zip
sysadmin@localhost:~/mybackups$
The tar command discussed earlier in this lab automatically descends through any subdirectories of a directory specified to be archived. With the gzip and zip commands, the -r option must be specified in order to perform recursion into subdirectories. Note that gzip -r compresses each file individually rather than producing a single archive, and that bzip2 has no recursive option.
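The difference in recursive behavior between tar and gzip can be seen on a small directory tree. This is a sketch with hypothetical file names; zip is left out so that only tar and gzip are required.

```shell
start_dir=$PWD
work=$(mktemp -d)
cd "$work"
mkdir -p tree/sub
echo "one" > tree/a.txt
echo "two" > tree/sub/b.txt

# tar descends into tree/ on its own and produces one archive file.
tar -cf tree.tar tree
tar_members=$(tar -tf tree.tar | grep -c 'txt$')

# gzip -r also descends, but it compresses each file in place
# instead of bundling the files into a single archive.
gzip -r tree
gz_count=$(find tree -type f -name '*.gz' | grep -c .)

echo "text files inside tree.tar:     $tar_members"
echo ".gz files left under tree/:     $gz_count"

cd "$start_dir"
rm -r "$work"
```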
7.2.12 Step 12
To view the contents of a zip archive, use the -l option with the unzip command:
unzip -l udev.zip
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ unzip -l udev.zip
Archive:  udev.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2015-01-28 16:32   etc/udev/
        0  2015-01-28 16:32   etc/udev/rules.d/
      306  2015-01-28 16:32   etc/udev/rules.d/70-persistent-cd.rules
     1157  2012-04-05 19:18   etc/udev/rules.d/README
      218  2012-04-05 19:18   etc/udev/udev.conf
---------                     -------
     1681                     5 files
sysadmin@localhost:~/mybackups$
7.2.13 Step 13
To extract the zip archive, use the unzip command without any options. In this example we first need to delete the files that were created in the earlier tar example:
rm -r etc
unzip udev.zip
Your output should be similar to the following:
sysadmin@localhost:~/mybackups$ rm -r etc
sysadmin@localhost:~/mybackups$ unzip udev.zip
Archive:  udev.zip
   creating: etc/udev/
   creating: etc/udev/rules.d/
  inflating: etc/udev/rules.d/70-persistent-cd.rules
  inflating: etc/udev/rules.d/README
  inflating: etc/udev/udev.conf
sysadmin@localhost:~/mybackups$
Chapter 8
8.1 Introduction
A large number of the files in a typical filesystem are text files. Text files contain only text, with none of the formatting features that you might see in a word processing file. Because there are so many of these files on a typical Linux system, a great number of commands exist to help users manipulate text files. There are commands to both view and modify these files in various ways.
In addition, there are features available for the shell to control the output of commands, so instead of having the output placed in the terminal window, the output can be redirected into another file or another command. These redirection features provide users with a much more flexible and powerful environment to work within.
8.2 Command Line Pipes
Previous chapters discussed how to use individual commands to perform actions on the operating system, including how to create/move/delete files and move around the system. Typically, when a command has output or generates an error, the output is displayed to the screen; however, this does not have to be the case.
The pipe | character can be used to send the output of one command to another. Instead of being printed to the screen, the output of one command becomes input for the next command. This can be a powerful tool, especially when looking for specific data; piping is often used to refine the results of an initial command.
The head and tail commands will be used in many examples below to illustrate the use of pipes. These commands can be used to display only the first few or last few lines of a file (or, when used with a pipe, the output of a previous command). By default the head and tail commands will display ten lines. For example, the following command will display the first ten lines of the /etc/sysctl.conf file:
sysadmin@localhost:~$ head /etc/sysctl.conf
#
# /etc/sysctl.conf - Configuration file for setting system variables
# See /etc/sysctl.d/ for additional system variables
# See sysctl.conf (5) for information.
#

#kernel.domainname = example.com

# Uncomment the following to stop low-level messages on console
#kernel.printk = 3 4 1 3
sysadmin@localhost:~$
In the next example, the last ten lines of the file will be displayed:
sysadmin@localhost:~$ tail /etc/sysctl.conf
# Do not send ICMP redirects (we are not a router)
#net.ipv4.conf.all.send_redirects = 0
#
# Do not accept IP source route packets (we are not a router)
#net.ipv4.conf.all.accept_source_route = 0
#net.ipv6.conf.all.accept_source_route = 0
#
# Log Martian Packets
#net.ipv4.conf.all.log_martians = 1
#
sysadmin@localhost:~$
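The ten-line default of head and tail can be changed with the -n option. The following sketch uses a hypothetical twelve-line sample file so that the effect of the default and of -n is easy to see.

```shell
work=$(mktemp -d)
# A hypothetical file with twelve short lines.
printf '%s\n' one two three four five six seven eight nine ten eleven twelve > "$work/sample.txt"

# With no options, head stops after ten lines.
default_count=$(( $(head "$work/sample.txt" | wc -l) ))

# -n sets how many lines to show.
first_three=$(head -n 3 "$work/sample.txt")
last_two=$(tail -n 2 "$work/sample.txt")

echo "lines shown by default: $default_count"
echo "first three:"
echo "$first_three"
echo "last two:"
echo "$last_two"

rm -r "$work"
```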
The pipe character will allow users to utilize these commands not only on files, but on the output of other commands. This can be useful when listing a large directory, for example the /etc directory:
ca-certificates       insserv              nanorc         services
ca-certificates.conf  insserv.conf         network        sgml
calendar              insserv.conf.d       networks       shadow
cron.d                iproute2             nologin        shadow-
cron.daily            issue                nsswitch.conf  shells
cron.hourly           issue.net            opt            skel
cron.monthly          kernel               os-release     ssh
cron.weekly           ld.so.cache          pam.conf       ssl
crontab               ld.so.conf           pam.d          sudoers
dbus-1                ld.so.conf.d         passwd         sudoers.d
debconf.conf          ldap                 passwd-        sysctl.conf
debian_version        legal                perl           sysctl.d
default               locale.alias         pinforc        systemd
deluser.conf          localtime            ppp            terminfo
depmod.d              logcheck             profile        timezone
dpkg                  login.defs           profile.d      ucf.conf
environment           logrotate.conf       protocols      udev
fstab                 logrotate.d          python2.7      ufw
fstab.d               lsb-base             rc.local       update-motd.d
gai.conf              lsb-base-logging.sh  rc0.d          updatedb.conf
groff                 lsb-release          rc1.d          vim
group                 magic                rc2.d          wgetrc
group-                magic.mime           rc3.d          xml
sysadmin@localhost:~$
If you look at the output of the previous command, you will note that the first filename is ca-certificates. But there are other files listed "above" that can only be viewed if the user uses the scroll bar. What if you just wanted to list the first few files of the /etc directory?
Instead of displaying the full output of the above command, piping it to the head command will display only the first ten lines:
sysadmin@localhost:~$ ls /etc | head
adduser.conf
adjtime
alternatives
apparmor.d
apt
bash.bashrc
bash_completion.d
bind
bindresvport.blacklist
blkid.conf
sysadmin@localhost:~$
The full output of the ls command is passed to the head command by the shell instead of being printed to the screen. The head command takes this output (from ls) as "input data" and the output of head is then printed to the screen.
Multiple pipes can be used consecutively to link multiple commands together. If three commands are piped together, the first command's output is passed to the second command. The output of the second command is then passed to the third command. The output of the third command would then be printed to the screen.
It is important to carefully choose the order in which commands are piped, as the third command will only see input from the output of the second. The examples below illustrate this using the nl command. In the first example, the nl command is used to number the lines of the output of a previous command:
sysadmin@localhost:~$ ls -l /etc/ppp | nl
     1  total 44
     2  -rw------- 1 root root   78 Aug 22  2010 chap-secrets
     3  -rwxr-xr-x 1 root root  386 Apr 27  2012 ip-down
     4  -rwxr-xr-x 1 root root 3262 Apr 27  2012 ip-down.ipv6to4
     5  -rwxr-xr-x 1 root root  430 Apr 27  2012 ip-up
     6  -rwxr-xr-x 1 root root 6517 Apr 27  2012 ip-up.ipv6to4
     7  -rwxr-xr-x 1 root root 1687 Apr 27  2012 ipv6-down
     8  -rwxr-xr-x 1 root root 3196 Apr 27  2012 ipv6-up
     9  -rw-r--r-- 1 root root    5 Aug 22  2010 options
    10  -rw------- 1 root root   77 Aug 22  2010 pap-secrets
    11  drwxr-xr-x 2 root root 4096 Jun 22  2012 peers
sysadmin@localhost:~$
In the next example, note that the ls command is executed first and its output is sent to the nl command, numbering all of the lines from the output of the ls command. Then the tail command is executed, displaying the last five lines from the output of the nl command:
sysadmin@localhost:~$ ls -l /etc/ppp | nl | tail -5
     7  -rwxr-xr-x 1 root root 1687 Apr 27  2012 ipv6-down
     8  -rwxr-xr-x 1 root root 3196 Apr 27  2012 ipv6-up
     9  -rw-r--r-- 1 root root    5 Aug 22  2010 options
    10  -rw------- 1 root root   77 Aug 22  2010 pap-secrets
    11  drwxr-xr-x 2 root root 4096 Jun 22  2012 peers
sysadmin@localhost:~$
Compare the output above with the next example:
sysadmin@localhost:~$ ls -l /etc/ppp | tail -5 | nl
     1  -rwxr-xr-x 1 root root 1687 Apr 27  2012 ipv6-down
     2  -rwxr-xr-x 1 root root 3196 Apr 27  2012 ipv6-up
     3  -rw-r--r-- 1 root root    5 Aug 22  2010 options
     4  -rw------- 1 root root   77 Aug 22  2010 pap-secrets
     5  drwxr-xr-x 2 root root 4096 Jun 22  2012 peers
sysadmin@localhost:~$
Notice how the line numbers are different. Why is this? In the second example, the output of the ls command is first sent to the tail command, which "grabs" only the last five lines of the output. Then the tail command sends those five lines to the nl command, which numbers them 1-5.
Pipes can be powerful, but it is important to consider how commands are piped to ensure that the desired output is displayed.
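The effect of pipe order can be reproduced without the /etc/ppp directory. In this sketch a small shell function named sample (a hypothetical stand-in for the ls -l output) prints six lines, which are then numbered and trimmed in both orders.

```shell
# Six sample lines stand in for the ls -l output (hypothetical data).
sample() { printf '%s\n' alpha bravo charlie delta echo foxtrot; }

# Number first, then keep the last two lines:
# the numbers reflect the original positions (5 and 6).
numbered_then_cut=$(sample | nl | tail -n 2)

# Keep the last two lines first, then number them:
# numbering restarts at 1.
cut_then_numbered=$(sample | tail -n 2 | nl)

echo "nl before tail:"
echo "$numbered_then_cut"
echo "tail before nl:"
echo "$cut_then_numbered"
```

Both pipelines show the same two lines of text; only the numbers differ, because nl numbers whatever input it receives.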
8.3 I/O Redirection
Input/Output (I/O) redirection allows for command line information to be passed to different streams. Before discussing redirection, it is important to understand standard streams.
8.3.1 STDIN
Standard input, or STDIN, is information entered normally by the user via the keyboard. When a command prompts the shell for data, the shell provides the user with the ability to type data that, in turn, is sent to the command as STDIN.
8.3.2 STDOUT
Standard output, or STDOUT, is the normal output of commands. When a command functions correctly (without errors) the output it produces is called STDOUT. By default, STDOUT is displayed in the terminal window (screen) where the command is executing.
8.3.3 STDERR
Standard error, or STDERR, consists of the error messages generated by commands. By default, STDERR is displayed in the terminal window (screen) where the command is executing.
I/O redirection allows the user to redirect STDIN so data comes from a file and STDOUT/STDERR so output goes to a file. Redirection is achieved by using the arrow characters: < and >.
8.3.4 Redirecting STDOUT
STDOUT can be directed to files. To begin, observe the output of the following command which will display to the screen:
sysadmin@localhost:~$ echo "Line 1"
Line 1
sysadmin@localhost:~$
Using the > character the output can be redirected to a file:
sysadmin@localhost:~$ echo "Line 1" > example.txt
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  example.txt  test
Documents  Music      Public    Videos     sample.txt
sysadmin@localhost:~$ cat example.txt
Line 1
sysadmin@localhost:~$
This command displays no output, because STDOUT was sent to the file example.txt instead of the screen. You can see the new file with the output of the ls command. The newly-created file contains the output of the echo command when the file is viewed with the cat command.
It is important to realize that the single arrow will overwrite any contents of an existing file:
sysadmin@localhost:~$ cat example.txt
Line 1
sysadmin@localhost:~$ echo "New line 1" > example.txt
sysadmin@localhost:~$ cat example.txt
New line 1
sysadmin@localhost:~$
The original contents of the file are gone, replaced with the output of the new echo command.
It is also possible to preserve the contents of an existing file by appending to it. Use "double arrow" >> to append to a file instead of overwriting it:
sysadmin@localhost:~$ cat example.txt
New line 1
sysadmin@localhost:~$ echo "Another line" >> example.txt
sysadmin@localhost:~$ cat example.txt
New line 1
Another line
sysadmin@localhost:~$
Instead of overwriting the file, the output of the most recent echo command is added to the bottom of it.
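The overwrite and append behaviors can be compared side by side. This sketch repeats the echo commands from the examples above, but writes to a file in a temporary directory (an assumption made so that nothing in your home directory is changed).

```shell
work=$(mktemp -d)

echo "Line 1"       >  "$work/example.txt"   # creates the file
echo "New line 1"   >  "$work/example.txt"   # overwrites the previous contents
echo "Another line" >> "$work/example.txt"   # appends to the end

contents=$(cat "$work/example.txt")
echo "$contents"

rm -r "$work"
```

The file ends up with two lines: "Line 1" was lost to the second >, while >> kept "New line 1" and added "Another line" after it.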
8.3.5 Redirecting STDERR
STDERR can be redirected in a similar fashion to STDOUT. STDOUT is also known as stream (or channel) #1. STDERR is assigned stream #2. When using arrows to redirect, stream #1 is assumed unless another stream is specified. Thus, stream #2 must be specified when redirecting STDERR.
To demonstrate redirecting STDERR, first observe the following command which will produce an error because the specified directory does not exist:
sysadmin@localhost:~$ ls /fake
ls: cannot access /fake: No such file or directory
sysadmin@localhost:~$
Note that there is nothing in the example above that implies that the output is STDERR. The output is clearly an error message, but how could you tell that it is being sent to STDERR? One easy way to determine this is to redirect STDOUT:
sysadmin@localhost:~$ ls /fake > output.txt
ls: cannot access /fake: No such file or directory
sysadmin@localhost:~$
In the example above, STDOUT was redirected to the output.txt file. So, the output that is displayed can't be STDOUT because it would have been placed in the output.txt file. Because all command output goes either to STDOUT or STDERR, the output displayed above must be STDERR.
The STDERR output of a command can be sent to a file:
sysadmin@localhost:~$ ls /fake 2> error.txt
sysadmin@localhost:~$ more error.txt
ls: cannot access /fake: No such file or directory
sysadmin@localhost:~$
In the command above, the 2> indicates that all error messages should be sent to the file error.txt.
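One way to confirm which stream a message travels on is to redirect both streams to separate files and see which file receives data. In this sketch the path "$work/missing" is a hypothetical directory that is known not to exist; the exit status of ls is captured so that the failure does not stop a script.

```shell
work=$(mktemp -d)

# "$work/missing" does not exist, so ls writes its complaint to STDERR (stream 2)
# and nothing to STDOUT (stream 1).
ls "$work/missing" > "$work/output.txt" 2> "$work/error.txt" || status=$?

# -s is true when a file exists and is not empty.
if [ -s "$work/output.txt" ]; then stdout_state=non-empty; else stdout_state=empty; fi
if [ -s "$work/error.txt" ];  then stderr_state=non-empty; else stderr_state=empty; fi

echo "output.txt is $stdout_state"
echo "error.txt is $stderr_state"

rm -r "$work"
```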
8.3.6 Redirecting Multiple Streams
It is possible to direct both the STDOUT and STDERR of a command at the same time. The following command will produce both STDOUT and STDERR because one of the specified directories exists and the other does not:
sysadmin@localhost:~$ ls /fake /etc/ppp
ls: cannot access /fake: No such file or directory
/etc/ppp:
chap-secrets
ip-down
ip-down.ipv6to4
ip-up
ip-up.ipv6to4
ipv6-down
ipv6-up
options
pap-secrets
peers
If only the STDOUT is sent to a file, STDERR will still be printed to the screen:
sysadmin@localhost:~$ ls /fake /etc/ppp > example.txt
ls: cannot access /fake: No such file or directory
sysadmin@localhost:~$ cat example.txt
/etc/ppp:
chap-secrets
ip-down
ip-down.ipv6to4
ip-up
ip-up.ipv6to4
ipv6-down
ipv6-up
options
pap-secrets
peers
sysadmin@localhost:~$
If only the STDERR is sent to a file, STDOUT will still be printed to the screen:
sysadmin@localhost:~$ ls /fake /etc/ppp 2> error.txt
/etc/ppp:
chap-secrets
ip-down
ip-down.ipv6to4
ip-up
ip-up.ipv6to4
ipv6-down
ipv6-up
options
pap-secrets
peers
sysadmin@localhost:~$ cat error.txt
ls: cannot access /fake: No such file or directory
sysadmin@localhost:~$
Both STDOUT and STDERR can be sent to a file by using &>, a character set that means "both 1> and 2>":
sysadmin@localhost:~$ ls /fake /etc/ppp &> all.txt
sysadmin@localhost:~$ cat all.txt
ls: cannot access /fake: No such file or directory
/etc/ppp:
chap-secrets
ip-down
ip-down.ipv6to4
ip-up
ip-up.ipv6to4
ipv6-down
ipv6-up
options
pap-secrets
peers
sysadmin@localhost:~$
Note that when you use &>, the output appears in the file with all of the STDERR messages at the top and all of the STDOUT messages below all STDERR messages:
sysadmin@localhost:~$ ls /fake /etc/ppp /junk /etc/sound &> all.txt
sysadmin@localhost:~$ cat all.txt
ls: cannot access /fake: No such file or directory
ls: cannot access /junk: No such file or directory
/etc/ppp:
chap-secrets
ip-down
ip-down.ipv6to4
ip-up
ip-up.ipv6to4
ipv6-down
ipv6-up
options
pap-secrets
peers

/etc/sound:
events
sysadmin@localhost:~$
If you don't want STDERR and STDOUT to both go to the same file, they can be redirected to different files by using both > and 2>. For example:
sysadmin@localhost:~$ rm error.txt example.txt
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  all.txt
Documents  Music      Public    Videos
sysadmin@localhost:~$ ls /fake /etc/ppp > example.txt 2> error.txt
sysadmin@localhost:~$ ls
Desktop    Downloads  Pictures  Templates  all.txt    example.txt
Documents  Music      Public    Videos     error.txt
sysadmin@localhost:~$ cat error.txt
ls: cannot access /fake: No such file or directory
sysadmin@localhost:~$ cat example.txt
/etc/ppp:
chap-secrets
ip-down
ip-down.ipv6to4
ip-up
ip-up.ipv6to4
ipv6-down
ipv6-up
options
pap-secrets
peers
sysadmin@localhost:~$
The order the streams are specified in does not matter.
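The &> shorthand is a bash feature. In shells that do not support it, the same result is usually written as > file 2>&1, which sends STDOUT to the file and then points STDERR at the same place. The sketch below uses hypothetical directories under a temporary location. Note that, unlike redirecting to two separate files, the position of 2>&1 does matter: it must come after the > redirection.

```shell
work=$(mktemp -d)
mkdir "$work/real"
touch "$work/real/file.txt"

# One directory exists and one does not, so ls produces both STDOUT and STDERR.
# "> all.txt 2>&1" collects both streams in the same file.
ls "$work/missing" "$work/real" > "$work/all.txt" 2>&1 || status=$?

# Count the lines that came from each stream.
error_lines=$(grep -c 'missing' "$work/all.txt")
listing_lines=$(grep -c 'file.txt' "$work/all.txt")

echo "error lines in all.txt:   $error_lines"
echo "listing lines in all.txt: $listing_lines"

rm -r "$work"
```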
8.3.7 Redirecting STDIN
The concept of redirecting STDIN can be difficult at first because it is less obvious why you would want to do it. With STDOUT and STDERR, the answer to why is fairly easy: because sometimes you want to store the output into a file for future use. Most Linux users end up redirecting STDOUT routinely, STDERR on occasion and STDIN...well, very rarely.
There are very few commands that require you to redirect STDIN because with most commands, if you want to read data from a file into a command, you can just specify the filename as an argument to the command. The command will then look into the file.
For some commands, if you don't specify a filename as an argument, they will revert to using STDIN to get data. For example, consider the following cat command:
sysadmin@localhost:~$ cat
hello
hello
how are you?
how are you?
goodbye
goodbye
sysadmin@localhost:~$
In the example above, the cat command wasn't provided a filename as an argument. So, it asked for the data to display on the screen from STDIN. The user typed hello and then the cat command displayed hello on the screen. Perhaps this is useful for lonely people, but not really a good use of the cat command.
However, perhaps if the output of the cat command were redirected to a file, then this method could be used either to add to an existing file or to place text into a new file:
sysadmin@localhost:~$ cat > new.txt
Hello
How are you?
Goodbye
sysadmin@localhost:~$ cat new.txt
Hello
How are you?
Goodbye
sysadmin@localhost:~$
While the previous example demonstrates another advantage of redirecting STDOUT, it doesn't address why or how STDIN can be redirected. To understand this, first consider a new command called tr. This command takes a set of characters and translates them into another set of characters. For example, suppose you wanted to capitalize a line of text. You could use the tr command as follows:
sysadmin@localhost:~$ tr 'a-z' 'A-Z'
watch how this works
WATCH HOW THIS WORKS
sysadmin@localhost:~$
The tr command took the STDIN from the keyboard (watch how this works) and converted all lowercase letters to uppercase before sending STDOUT to the screen (WATCH HOW THIS WORKS). It would seem that a better use of the tr command would be to perform translation on a file, not keyboard input. However, the tr command does not support filename arguments:
sysadmin@localhost:~$ more example.txt
/etc/ppp: chap-secrets ip-down ip-down.ipv6to4 ip-up ip-up.ipv6to4 ipv6-down ipv6-up options pap-secrets peers
sysadmin@localhost:~$ tr 'a-z' 'A-Z' example.txt
tr: extra operand `example.txt'
Try `tr --help' for more information
sysadmin@localhost:~$
You can, however, tell the shell to get STDIN from a file instead of from the keyboard by using the < character:
sysadmin@localhost:~$ tr 'a-z' 'A-Z' < example.txt
/ETC/PPP: CHAP-SECRETS IP-DOWN IP-DOWN.IPV6TO4 IP-UP IP-UP.IPV6TO4 IPV6-DOWN IPV6-UP OPTIONS PAP-SECRETS PEERS
sysadmin@localhost:~$
This is fairly rare because most commands do accept filenames as arguments. For those that do not, this method lets the shell read from the file instead of relying on the command to have this ability. One last note: in most cases you probably want to take the resulting output and place it into another file:
sysadmin@localhost:~$ tr 'a-z' 'A-Z' < example.txt > newexample.txt
sysadmin@localhost:~$ more newexample.txt
/ETC/PPP: CHAP-SECRETS IP-DOWN IP-DOWN.IPV6TO4 IP-UP IP-UP.IPV6TO4 IPV6-DOWN IPV6-UP OPTIONS PAP-SECRETS PEERS
sysadmin@localhost:~$
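As a small self-contained sketch of the same idea, the following creates its own input file in a scratch directory (the directory and the two sample lines are assumptions for illustration) and then feeds it to tr through STDIN.

```shell
# Scratch directory with a small sample file (contents are illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'chap-secrets\npeers\n' > example.txt

# The shell opens example.txt and connects it to tr's STDIN;
# tr itself never receives a filename argument.
tr 'a-z' 'A-Z' < example.txt > newexample.txt

cat newexample.txt
```

The output file should have a different name from the input file. With a command such as tr 'a-z' 'A-Z' < example.txt > example.txt, the shell truncates example.txt before tr gets a chance to read it.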
8.4 Searching for Files Using the Find Command
One of the challenges that users face when working with the filesystem is trying to recall where files are stored. There are thousands of files and hundreds of directories on a typical Linux filesystem, so recalling where these files are located can pose challenges. Keep in mind that most of the files that you will work with are ones that you create. As a result, you will often be looking in your own home directory to find files. However, sometimes you may need to search in other places on the filesystem to find files created by other users. The find command is a very powerful tool that you can use to search for files on the filesystem. This command can search for files by name, including using wildcard characters for when you are not certain of the exact filename. Additionally, you can search for files based on file metadata, such as file type, file size and file ownership. The syntax of the find command is:
find [starting directory] [search option] [search criteria] [result option]
A description of all of these components:
Component
Description
[starting directory]
This is where the user specifies where to start searching. The find command will search this directory and all of its subdirectories. If no starting directory is provided, then the current directory is used as the starting point.
[search option]
This is where the user specifies an option to determine what sort of metadata to search for; there are options for file name, file size and many other file attributes.
[search criteria]
This is an argument that complements the search option. For example, if the user uses the option to search for a file name, the search criteria would be the filename.
[result option]
This option is used to specify what action should be taken once the file is found. If no option is provided, the file name will be printed to STDOUT.
8.4.1 Search by File Name
To search for a file by name, use the -name option to the find command:
sysadmin@localhost:~$ find /etc -name hosts
find: `/etc/dhcp': Permission denied
find: `/etc/cups/ssl': Permission denied
find: `/etc/pki/CA/private': Permission denied
find: `/etc/pki/rsyslog': Permission denied
find: `/etc/audisp': Permission denied
find: `/etc/named': Permission denied
find: `/etc/lvm/cache': Permission denied
find: `/etc/lvm/backup': Permission denied
find: `/etc/lvm/archive': Permission denied
/etc/hosts
find: `/etc/ntp/crypto': Permission denied
find: `/etc/polkit-1/localauthority': Permission denied
find: `/etc/sudoers.d': Permission denied
find: `/etc/sssd': Permission denied
/etc/avahi/hosts
find: `/etc/selinux/targeted/modules/active': Permission denied
find: `/etc/audit': Permission denied
sysadmin@localhost:~$
Note that two files were found: /etc/hosts and /etc/avahi/hosts. The rest of the output was STDERR messages because the user who ran the command didn't have permission to access certain subdirectories. Recall that you can redirect STDERR to a file so you don't need to see these error messages on the screen:
sysadmin@localhost:~$ find /etc -name hosts 2> errors.txt
/etc/hosts
/etc/avahi/hosts
sysadmin@localhost:~$
While the output is easier to read, there really is no purpose to storing the error messages in the errors.txt file. The developers of Linux realized that it would be good to have a "junk file" to send unnecessary data; any output that you send to the /dev/null file is discarded:
sysadmin@localhost:~$ find /etc -name hosts 2> /dev/null
/etc/hosts
/etc/avahi/hosts
sysadmin@localhost:~$
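The name search can be tried safely in a scratch directory. The sketch below assumes a find command that supports -name (both GNU and BSD versions do); the directory layout and file names are made up for illustration.

```shell
# Build a small directory tree to search (names are illustrative).
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
touch "$workdir/hosts" "$workdir/a/hosts" "$workdir/a/b/hosts.allow"

# Exact name match; any error messages are discarded via /dev/null.
exact=$(find "$workdir" -name hosts 2> /dev/null | sort)

# Quote wildcard patterns so the shell passes them to find unexpanded.
wild=$(find "$workdir" -name 'hosts*' 2> /dev/null | sort)

echo "$exact"
echo "$wild"
```

The first search returns the two files named exactly hosts, while the quoted wildcard pattern also picks up hosts.allow.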
8.4.2 Displaying File Detail
It can be useful to obtain file details when using the find command because the file name alone might not be enough information for you to find the correct file. For example, there might be seven files named hosts; if you knew that the hosts file that you needed had been modified recently, then the modification timestamp of the file would be useful to see. To see these file details, use the -ls option to the find command:
sysadmin@localhost:~$ find /etc -name hosts -ls 2> /dev/null
    41    4 -rw-r--r--   1 root     root          158 Jan 12  2010 /etc/hosts
  6549    4 -rw-r--r--   1 root     root         1130 Jul 19  2011 /etc/avahi/hosts
sysadmin@localhost:~$
Note: The first two columns of the output above are the inode number of the file and the number of blocks that the file is using for storage. Both of these are beyond the scope of the topic at hand. The rest of the columns are typical output of the ls -l command: file type, permissions, hard link count, user owner, group owner, file size, modification timestamp and file name.
8.4.3 Searching for Files by Size
One of the many useful searching options is the option that allows you to search for files by size. The -size option allows you to search for files that are either larger than or smaller than a specified size, as well as search for an exact file size. When you specify a file size, you can give the size in bytes (c), kilobytes (k), megabytes (M) or gigabytes (G). For example, the following will search for files in the /etc directory structure that are exactly 10 bytes in size:
sysadmin@localhost:~$ find /etc -size 10c -ls 2>/dev/null
   432    4 -rw-r--r--   1 root     root           10 Jan 28  2015 /etc/adjtime
  8814    0 drwxr-xr-x   1 root     root           10 Jan 29  2015 /etc/ppp/ip-down.d
  8816    0 drwxr-xr-x   1 root     root           10 Jan 29  2015 /etc/ppp/ip-up.d
  8921    0 lrwxrwxrwx   1 root     root           10 Jan 29  2015 /etc/ssl/certs/349f2832.0 -> EC-ACC.pem
  9234    0 lrwxrwxrwx   1 root     root           10 Jan 29  2015 /etc/ssl/certs/aeb67534.0 -> EC-ACC.pem
 73468    4 -rw-r--r--   1 root     root           10 Nov 16 20:42 /etc/hostname
sysadmin@localhost:~$
If you want to search for files that are larger than a specified size, you place a + character before the size. For example, the following will look for all files in the /usr directory structure that are over 100 megabytes in size:
sysadmin@localhost:~$ find /usr -size +100M -ls 2> /dev/null
574683 104652 -rw-r--r--   1 root     root     107158256 Aug  7 11:06 /usr/share/icons/oxygen/icon-theme.cache
sysadmin@localhost:~$
To search for files that are smaller than a specified size, place a - character before the file size.
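The three forms of the size test (exact, + and -) can be compared side by side. This is a sketch under a few assumptions: the scratch directory and file names are invented, head -c is used to create files of a known size, and -type f (described in the options table in the next section) keeps the directory itself out of the results.

```shell
# Create files of 5, 10 and 2048 bytes in a scratch directory.
workdir=$(mktemp -d)
head -c 5    /dev/zero > "$workdir/small"
head -c 10   /dev/zero > "$workdir/ten"
head -c 2048 /dev/zero > "$workdir/big"

exact=$(find "$workdir" -type f -size 10c)      # exactly 10 bytes
larger=$(find "$workdir" -type f -size +10c)    # more than 10 bytes
smaller=$(find "$workdir" -type f -size -10c)   # fewer than 10 bytes

echo "exact:   $exact"
echo "larger:  $larger"
echo "smaller: $smaller"
```

Using the c suffix keeps the comparison in bytes. With larger units such as k or M, GNU find rounds each file size up to a whole number of units before comparing, so results near a boundary can be surprising.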
8.4.4 Additional Useful Search Options
There are many search options. The following table illustrates a few of these options:
Option
Meaning
-maxdepth
Allows the user to specify how deep in the directory structure to search. For example, -maxdepth 1 would mean only search the specified directory and the entries directly inside it, without descending into its subdirectories.
-group
Returns files owned by a specified group. For example, -group payroll would return files owned by the payroll group.
-iname
Returns files that match the specified filename, but unlike -name, -iname is case insensitive. For example, -iname hosts would match files named hosts, Hosts, HOSTS, etc.
-mmin
Returns files based on modification time in minutes. For example, -mmin 10 would match files that were modified 10 minutes ago, while -mmin -10 would match files modified less than 10 minutes ago.
-type
Returns files that match the file type. For example, -type f would return files that are regular files.
-user
Returns files owned by a specified user. For example, -user bob would return files owned by the bob user.
8.4.5 Using Multiple Options
If you use multiple options, they act as an "and", meaning for a match to occur, all of the criteria must match, not just one. For example, the following command will display all files in the /etc directory structure that are 10 bytes in size and are plain files:
sysadmin@localhost:~$ find /etc -size 10c -type f -ls 2>/dev/null
   432    4 -rw-r--r--   1 root     root           10 Jan 28  2015 /etc/adjtime
 73468    4 -rw-r--r--   1 root     root           10 Nov 16 20:42 /etc/hostname
sysadmin@localhost:~$
8.5 Viewing Files Using the less Command While viewing small files with the cat command poses no problems, it is not an ideal choice for large files. The cat command doesn't provide any way to easily pause and restart the display, so the entire file contents are dumped to the screen. For larger files, you will want to use a pager command to view the contents. Pager commands will display one page of data at a time, allowing you to move forward and backwards in the file by using movement keys. There are two commonly used pager commands:
The less command: This command provides a very advanced paging capability. It is normally the default pager used by commands like the man command.
The more command: This command has been around since the early days of UNIX. While it has fewer features than the less command, it does have one important advantage: The less command isn't always included with all Linux distributions (and on some distributions, it isn't installed by default). The more command is always available.
When you use the more or less commands, they allow you to "move around" a document by using keystroke commands. Because the developers of the less command based it on the functionality of the more command, all of the keystroke commands available in the more command also work in the less command. For the purpose of this manual, the focus will be on the more advanced command (less). The more command is still useful to remember for times when the less command isn't available. Remember that most of the keystroke commands provided work for both commands.
8.5.1 Help Screen in less
When you view a file with the less command, you can use the h key to display a help screen. The help screen allows you to see which other commands are available. In the following example, the less /usr/share/dict/words command is executed. Once the document is displayed, the h key is pressed, displaying the help screen:
                   SUMMARY OF LESS COMMANDS

      Commands marked with * may be preceded by a number, N.
      Notes in parentheses indicate the behavior if N is given.

  h  H                 Display this help.
  q  :q  Q  :Q  ZZ     Exit.
 ------------------------------------------------------------------------
                           MOVING

  e  ^E  j  ^N  CR  *  Forward  one line   (or N lines).
  y  ^Y  k  ^K  ^P  *  Backward one line   (or N lines).
  f  ^F  ^V  SPACE  *  Forward  one window (or N lines).
  b  ^B  ESC-v      *  Backward one window (or N lines).
  z                 *  Forward  one window (and set window to N).
  w                 *  Backward one window (and set window to N).
  ESC-SPACE         *  Forward  one window, but don't stop at end-of-file.
  d  ^D             *  Forward  one half-window (and set half-window to N).
  u  ^U             *  Backward one half-window (and set half-window to N).
  ESC-)  RightArrow *  Left  one half screen width (or N positions).
  ESC-(  LeftArrow  *  Right one half screen width (or N positions).
HELP -- Press RETURN for more, or q when done
8.5.2 less Movement Commands
There are many movement commands for the less command, each with multiple possible keys or key combinations. While this may seem intimidating, remember you don't need to memorize all of these movement commands; you can always use the h key whenever you need to get help. The first group of movement commands that you may want to focus upon are the ones that are most commonly used. To make this even easier to learn, the keys that are identical in more and less will be summarized. In this way, you will be learning how to move in more and less at the same time:
Movement           Key
Window forward     Spacebar
Window backward    b
Line forward       Enter
Exit               q
Help               h
When simply using less as a pager, the easiest way to advance forward a page is to press the spacebar.
8.5.3 less Searching Commands There are two ways to search in the less command: you can either search forward or backwards from your current position using patterns called regular expressions. More details regarding regular expressions are provided later in this chapter. To start a search to look forward from your current position, use the / key. Then, type the text or pattern to match and press the Enter key. If a match can be found, then your cursor will move in the document to the match. For example, in the following graphic the expression "frog" was searched for in the /usr/share/dict/words file: bullfrog bullfrog's bullfrogs bullheaded bullhorn bullhorn's bullhorns bullied bullies bulling bullion bullion's bullish bullock bullock's bullocks bullpen bullpen's bullpens
bullring bullring's bullrings bulls :
Notice that "frog" didn't have to be a word by itself. Also notice that while the less command took you to the first match from the current position, all matches were highlighted. If no matches forward from your current position can be found, then the last line of the screen will report "Pattern not found":
Pattern not found  (press RETURN)
To start a search to look backwards from your current position, press the ? key, then type the text or pattern to match and press the Enter key. Your cursor will move backward to the first match it can find or report that the pattern cannot be found. If more than one match can be found by a search, then using the n key will allow you to move to the next match and using the N key will allow you to go to a previous match.
8.6 Revisiting the head and tail Commands
Recall that the head and tail commands are used to filter files to show a limited number of lines. If you want to view a select number of lines from the top of the file, you use the head command and if you want to view a select number of lines at the bottom of a file, then you use the tail command. By default, both commands display ten lines from the file. The following table provides some examples:
Command Example          Explanation of Displayed Text
head /etc/passwd         First ten lines of /etc/passwd
head -3 /etc/group       First three lines of /etc/group
head -n 3 /etc/group     First three lines of /etc/group
help | head              First ten lines of output piped from the help command
tail /etc/group          Last ten lines of /etc/group
tail -5 /etc/passwd      Last five lines of /etc/passwd
tail -n 5 /etc/passwd    Last five lines of /etc/passwd
help | tail              Last ten lines of output piped from the help command
As seen from the above examples, both commands will output text from either a regular file or from the output of any command sent through a pipe. They both use the -n option to indicate how many lines to output.
8.6.1 Negative Value with the -n Option Traditionally in UNIX, the number of lines to output would be specified as an option with either command, so -3 meant show three lines. For the tail command, either -3 or -n -3 still means show three lines. However, the GNU version of the head command recognizes -n -3 as show all but the last three lines , and yet the head command still recognizes the option -3 as show the first three lines.
8.6.2 Positive Value With the tail Command The GNU version of the tail command allows for a variation of how to specify the number of lines to be printed. If you use the -n option with a number prefixed by the plus sign, then the tail command recognizes this to mean to display the contents starting at the specified line and continuing all the way to the end. For example, the following will display line #22 to the end of the output of the nl command: sysadmin@localhost:~$ nl /etc/passwd | tail -n +22
    22  sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin
    23  operator:x:1000:37::/root:/bin/sh
    24  sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash
sysadmin@localhost:~$
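The two GNU variations (a negative count for head and a plus-prefixed count for tail) are easy to compare on predictable input. The sketch below assumes GNU coreutils and uses seq, which prints a sequence of numbers one per line, as a stand-in for a real file.

```shell
# seq 10 prints the numbers 1 to 10, one per line.
from_eighth=$(seq 10 | tail -n +8)      # start at line 8 and continue to the end
all_but_last3=$(seq 10 | head -n -3)    # GNU head: everything except the last 3 lines
last_three=$(seq 10 | tail -n 3)        # the traditional meaning: the last 3 lines

# Unquoted expansion joins the lines with spaces for compact display.
echo $from_eighth
echo $all_but_last3
echo $last_three
```

On systems whose head is not the GNU version, the head -n -3 form may be rejected, so scripts meant to be portable should not rely on it.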
8.6.3 Following Changes to a File You can view live file changes by using the -f option to the tail command. This is useful when you want to see changes to a file as they are happening. A good example of this would be when viewing log files as a system administrator. Log files can be used to troubleshoot problems and administrators will often view them "interactively" with the tail command as they are performing the commands they are trying to troubleshoot in a separate window.
For example, if you were to log in as the root user, you could troubleshoot issues with the email server by viewing live changes to its log file with the following command: tail -f /var/log/mail.log
8.7 Sorting Files or Input
The sort command can be used to rearrange the lines of files or input in either dictionary or numeric order based upon the contents of one or more fields. Fields are determined by a field separator contained on each line, which defaults to whitespace (spaces and tabs). The following example creates a small file, using the head command to grab the first 5 lines of the /etc/passwd file and send the output to a file called mypasswd.
sysadmin@localhost:~$ head -5 /etc/passwd > mypasswd
sysadmin@localhost:~$ cat mypasswd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
sysadmin@localhost:~$
Now we will sort the mypasswd file:
sysadmin@localhost:~$ sort mypasswd
bin:x:2:2:bin:/bin:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
root:x:0:0:root:/root:/bin/bash
sync:x:4:65534:sync:/bin:/bin/sync
sys:x:3:3:sys:/dev:/bin/sh
sysadmin@localhost:~$
8.7.1 Fields and Sort Options In the event that the file or input might be separated by another delimiter like a comma or colon, the -t option will allow for another field separator to be specified. To specify fields to sort by, use the -k option with an argument to indicate the field number (starting with 1 for the first field).
The other commonly used options for the sort command are -n to perform a numeric sort and -r to perform a reverse sort. In the next example, the -t option sets the field separator to a colon character, and the -n and -k3 options perform a numeric sort using the third field of each line:
sysadmin@localhost:~$ sort -t: -n -k3 mypasswd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
sysadmin@localhost:~$
Note that the -r option could have been used to reverse the sort, making the higher numbers in the third field appear at the top of the output:
sysadmin@localhost:~$ sort -t: -n -r -k3 mypasswd
sync:x:4:65534:sync:/bin:/bin/sync
sys:x:3:3:sys:/dev:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
root:x:0:0:root:/root:/bin/bash
sysadmin@localhost:~$
Lastly, you may want to perform more complex sorts, such as sorting by a primary field and then by a secondary field. For example, consider the following data:
bob:smith:23
nick:jones:56
sue:smith:67
You might want to sort first by the last name (field #2), then by first name (field #1) and then by age (field #3). Each key should be limited to a single field by giving both a start and an end field (for example, -k2,2), because a key written as -k2 runs from field 2 to the end of the line. This can be done with the following command:
sysadmin@localhost:~$ sort -t: -k2,2 -k1,1 -k3,3n filename
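The sketch below works through a multi-key sort on a small sample. It uses explicit key ranges (-k2,2 rather than -k2), since a key given as -k2 extends from field 2 to the end of the line and can prevent the later keys from ever being consulted. The file name people.txt and the extra bob:smith:9 row are assumptions added to make the difference visible; LC_ALL=C is set so the ordering does not depend on the system locale.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'sue:smith:67' 'nick:jones:56' 'bob:smith:23' 'bob:smith:9' \
    > "$workdir/people.txt"

# Last name only, then first name only, then age compared numerically.
sorted=$(LC_ALL=C sort -t: -k2,2 -k1,1 -k3,3n "$workdir/people.txt")

# For comparison: open-ended keys compare "smith:23" with "smith:9" as text.
naive=$(LC_ALL=C sort -t: -k2 -k1 -k3n "$workdir/people.txt")

echo "$sorted"
echo "---"
echo "$naive"
```

With bounded keys, the two bob:smith rows are ordered by age (9 before 23). With open-ended keys, the first key already includes the age as text, so 23 sorts before 9 and sue:smith:67 ends up between the two bob rows.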
8.8 Viewing File Statistics With the wc Command
The wc command allows for up to three statistics to be printed for each file provided, as well as the total of these statistics if more than one filename is provided. By default, the wc command provides the number of lines, words and bytes (1 byte = 1 character in a text file): sysadmin@localhost:~$ wc /etc/passwd /etc/passwd-
  35   56 1710 /etc/passwd
  34   55 1665 /etc/passwd-
  69  111 3375 total
sysadmin@localhost:~$
The above example shows the output from executing: wc /etc/passwd /etc/passwd-. The output has four columns: number of lines in the file, number of words in the file, number of bytes in the file and the file name or total. If you are interested in viewing just specific statistics, then you can use -l to show just the number of lines, -w to show just the number of words and -c to show just the number of bytes. The wc command can be useful for counting the number of lines output by some other command through a pipe. For example, if you wanted to know the total number of files in the /etc directory, you could execute ls /etc | wc -l : sysadmin@localhost:~$ ls /etc/ | wc -l
136
sysadmin@localhost:~$
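A short sketch of the individual wc options follows, using a scratch directory and a two-line sample file that are assumptions for illustration. Reading the file through STDIN keeps the file name out of the output, which makes the counts easy to capture in variables.

```shell
workdir=$(mktemp -d)
printf 'one two\nthree\n' > "$workdir/sample.txt"
touch "$workdir/a" "$workdir/b"

lines=$(wc -l < "$workdir/sample.txt")   # number of lines
words=$(wc -w < "$workdir/sample.txt")   # number of words
bytes=$(wc -c < "$workdir/sample.txt")   # number of bytes

# Counting lines of another command's output: one line per directory entry.
entries=$(ls "$workdir" | wc -l)

echo "lines=$lines words=$words bytes=$bytes entries=$entries"
```

The sample file has 2 lines, 3 words and 14 bytes, and the directory holds 3 entries (a, b and sample.txt).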
8.9 Using the cut Command to Filter File Contents
The cut command can extract columns of text from a file or standard input. A primary use of the cut command is for working with delimited database files. These files are very common on Linux systems. By default, it considers its input to be separated by the Tab character, but the -d option can specify alternative delimiters such as the colon or comma. Using the -f option, you can specify which fields to display, either as a hyphenated range or a comma separated list. In the following example, the first, fifth, sixth and seventh fields from the mypasswd database file are displayed:
sysadmin@localhost:~$ cut -d: -f1,5-7 mypasswd
root:root:/root:/bin/bash
daemon:daemon:/usr/sbin:/bin/sh
bin:bin:/bin:/bin/sh
sys:sys:/dev:/bin/sh
sync:sync:/bin:/bin/sync
sysadmin@localhost:~$
Using the cut command, you can also extract columns of text based upon character position with the -c option. This can be useful for extracting fields from fixed-width database files. For example, the following will display just the file type (character #1), permissions (characters #2-10) and filename (characters #50+) of the output of the ls -l command: sysadmin@localhost:~$ ls -l | cut -c1-11,50-
total 12
drwxr-xr-x Desktop
drwxr-xr-x Documents
drwxr-xr-x Downloads
drwxr-xr-x Music
drwxr-xr-x Pictures
drwxr-xr-x Public
drwxr-xr-x Templates
drwxr-xr-x Videos
-rw-rw-r-- errors.txt
-rw-rw-r-- mypasswd
-rw-rw-r-- new.txt
sysadmin@localhost:~$
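Both modes of cut, by field and by character position, can be tried on a single line of text. In this sketch the sample line is copied from the sysadmin entry shown earlier in this chapter and is held in a shell variable instead of a file.

```shell
line='sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash'

# Field mode: colon-delimited fields 1, 5, 6 and 7.
fields=$(echo "$line" | cut -d: -f1,5-7)

# Character mode: the first eight characters of the line.
chars=$(echo "$line" | cut -c1-8)

echo "$fields"
echo "$chars"
```

Selected fields are printed joined by the same delimiter that was used to split them, which is why the colons remain in the first result.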
8.10 Using the grep Command to Filter File Contents The grep command can be used to filter lines in a file or the output of another command based on matching a pattern. That pattern can be as simple as the exact text that you want to match or it can be much more advanced through the use of regular expressions (discussed later in this chapter). For example, you may want to find all the users who can login to the system with the BASH shell, so you could use the grep command to filter the lines from the /etc/passwd file for the lines containing the characters bash: sysadmin@localhost:~$ grep bash /etc/passwd
root:x:0:0:root:/root:/bin/bash
sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash sysadmin@localhost:~$
To make it easier to see what exactly is matched, use the --color option. This option will highlight the matched items in red: sysadmin@localhost:~$ grep --color bash /etc/passwd
root:x:0:0:root:/root:/bin/bash
sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash
sysadmin@localhost:~$
In some cases you don't care about the specific lines that match the pattern, but rather how many lines match the pattern. With the -c option, you can get a count of how many lines match:
sysadmin@localhost:~$ grep -c bash /etc/passwd
2
sysadmin@localhost:~$
When you are viewing the output from the grep command, it can be hard to determine the original line numbers. This information can be useful when you go back into the file (perhaps to edit the file) as you can use this information to quickly find one of the matched lines. The -n option to the grep command will display original line numbers: sysadmin@localhost:~$ grep -n bash /etc/passwd
1:root:x:0:0:root:/root:/bin/bash 24:sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bas sysadmin@localhost:~$
Some additional useful grep options:
Examples
Output
grep -v nologin /etc/passwd
All lines not containing nologin in the /etc/passwd file
grep -l linux /etc/*
List of files in the /etc directory containing linux
grep -i linux /etc/*
Listing of lines from files in the /etc directory containing any case (capital or lower) of the character pattern linux
grep -w linux /etc/*
Listing of lines from files in the /etc directory containing the word pattern linux
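The effect of these options is easiest to see on a tiny file. The sketch below assumes a scratch directory and a three-line sample file named notes.txt (both invented for illustration), and combines each option with -c so that every result is a simple count.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Linux rocks' 'linuxes' 'plain text' > "$workdir/notes.txt"

# -c is combined with each option only to turn the result into a line count.
plain=$(grep -c linux "$workdir/notes.txt")       # case-sensitive substring match
nocase=$(grep -ci linux "$workdir/notes.txt")     # -i: ignore case
word=$(grep -ciw linux "$workdir/notes.txt")      # -w: match whole words only
inverted=$(grep -civ linux "$workdir/notes.txt")  # -v: lines that do NOT match
files=$(grep -li linux "$workdir"/*)              # -l: names of files with a match

echo "$plain $nocase $word $inverted"
echo "$files"
```

Only "linuxes" contains the lowercase pattern, both of the first two lines match when case is ignored, and only "Linux rocks" contains the pattern as a whole word.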
8.11 Basic Regular Expressions
A Regular Expression is a collection of "normal" and "special" characters that are used to match simple or complex patterns. Normal characters are alphanumeric characters which match themselves. For example, an a would match an a. Some characters have special meanings when used within patterns by commands like the grep command. There are both Basic Regular Expressions (available to a wide variety of Linux commands) and Extended Regular Expressions (available to more advanced Linux commands). Basic Regular Expressions include the following:
Regular Expression
Matches
.
Any single character
[ ]
A list or range of characters to match one character, unless the first character is the caret ^, and then it means any character not in the list
*
Previous character repeated zero or more times
^
Following text must appear at beginning of line
$
Preceding text must appear at the end of the line
The grep command is just one of many commands that support regular expressions. Some other commands include the more and less commands. While some of the regular expressions in the following examples do not strictly need to be quoted, it is good practice to use single quotes around your regular expressions to prevent the shell from trying to interpret special meaning from them.
8.11.1 Basic Regular Expressions - the . Character
In the example below, a simple file is first created using redirection. Then the grep command is used to demonstrate a simple pattern match:
sysadmin@localhost:~$ echo 'abcddd' > example.txt
sysadmin@localhost:~$ cat example.txt
abcddd
sysadmin@localhost:~$ grep --color 'a..' example.txt
abcddd
sysadmin@localhost:~$
In the previous example, you can see that the pattern a.. matched abc. The first . character matched the b and the second matched the c. In the next example, the pattern a..c won't match anything, so the grep command will not produce any output. For the match to be successful, there would need to be two characters between the a and the c in example.txt, but abcddd has only one (the b):
sysadmin@localhost:~$ grep --color 'a..c' example.txt
sysadmin@localhost:~$
8.11.2 Basic Regular Expressions - the [ ] Characters
If you use the . character, then any possible character could match. In some cases you want to specify exactly which characters you want to match. For example, maybe you just want to match a lower-case alpha character or a number character. For this, you can use the [ ] Regular Expression characters and specify the valid characters inside the [ ] characters. For example, the following command matches two characters, the first is either an a or a b while the second is either an a, b, c or d:
sysadmin@localhost:~$ grep --color '[ab][a-d]' example.txt
abcddd
sysadmin@localhost:~$
Note that you can either list out each possible character [abcd] or provide a range [a-d] as long as the range is in the correct order. For example, [d-a] wouldn't work because it isn't a valid range:
sysadmin@localhost:~$ grep --color '[d-a]' example.txt
grep: Invalid range end
sysadmin@localhost:~$
The range is specified by a standard called the ASCII table. This table is a collection of characters in a specific numeric order. You can see the ASCII table with the man ascii command. A small example (the columns are the octal, decimal and hexadecimal values, followed by the character):
Oct   Dec   Hex   Char      Oct   Dec   Hex   Char
041   33    21    !         141   97    61    a
042   34    22    "         142   98    62    b
043   35    23    #         143   99    63    c
044   36    24    $         144   100   64    d
045   37    25    %         145   101   65    e
046   38    26    &         146   102   66    f
Since a has a smaller numeric value (octal 141) than d (octal 144), the range a-d includes all characters from a to d. What if you want to match a character that can be anything but an x, y or z? You wouldn't want to have to provide a [ ] set with all of the characters except x, y or z. To indicate that you want to match a character that is not one of the listed characters, start your [ ] set with a ^ symbol. For example, the following will demonstrate matching a pattern that includes a character that isn't an a, b or c followed by a d:
sysadmin@localhost:~$ grep --color '[^abc]d' example.txt
abcddd
sysadmin@localhost:~$
8.11.3 Basic Regular Expressions - the * Character
The * character can be used to match "zero or more of the previous character". For example, the following will match zero or more d characters. Because zero occurrences also count as a match, this pattern matches every line, including lines that contain no d at all:
sysadmin@localhost:~$ grep --color 'd*' example.txt
abcddd
sysadmin@localhost:~$
8.11.4 Basic Regular Expressions - the ^ and $ Characters
When you perform a pattern match, the match could occur anywhere on the line. You may want to specify that the match occurs at the beginning of the line or the end of the line. To match at the beginning of the line, begin the pattern with a ^ symbol. In the following example, another line is added to the example.txt file to demonstrate the use of the ^ symbol:
sysadmin@localhost:~$ echo "xyzabc" >> example.txt
sysadmin@localhost:~$ cat example.txt
abcddd
xyzabc
sysadmin@localhost:~$ grep --color "a" example.txt
abcddd
xyzabc
sysadmin@localhost:~$ grep --color "^a" example.txt
abcddd
sysadmin@localhost:~$
Note that in the first grep output, both lines match because they both contain the letter a. In the second grep output, only the line that began with the letter a matched. In order to specify the match occurs at the end of line, end the pattern with the $ character. For example, in order to only find lines which end with the letter c:
sysadmin@localhost:~$ grep "c$" example.txt
xyzabc
sysadmin@localhost:~$
8.11.5 Basic Regular Expressions - the \ Character
In some cases you may want to match a character that happens to be a special Regular Expression character. For example, consider the following:
sysadmin@localhost:~$ echo "abcd*" >> example.txt
sysadmin@localhost:~$ cat example.txt
abcddd
xyzabc
abcd*
sysadmin@localhost:~$ grep --color "cd*" example.txt
abcddd
xyzabc
abcd*
sysadmin@localhost:~$
In the output of the grep command above, you will see that every line matches because you are looking for a c character followed by zero or more d characters. If you want to look for an actual * character, place a \ character before the * character: sysadmin@localhost:~$ grep --color "cd\*" example.txt
abcd*
sysadmin@localhost:~$
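The basic regular expression characters from this section can be checked together against the same three-line example.txt. The sketch below recreates that file in a scratch directory (the directory is an assumption) and uses grep -c to report how many lines each pattern matches.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'abcddd' 'xyzabc' 'abcd*' > example.txt

dot=$(grep -c 'a..d' example.txt)       # a, any two characters, then d
negset=$(grep -c '[^abc]d' example.txt) # a d preceded by something other than a, b or c
start=$(grep -c '^a' example.txt)       # lines beginning with a
end=$(grep -c 'c$' example.txt)         # lines ending with c
star=$(grep -c 'cd*' example.txt)       # c followed by zero or more d characters
literal=$(grep -c 'cd\*' example.txt)   # c, d, then a literal * character

echo "$dot $negset $start $end $star $literal"
```

The unescaped pattern cd* matches all three lines, since every line contains a c, while the escaped pattern cd\* matches only the line that really contains cd followed by an asterisk.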
8.12 Extended Regular Expressions
The use of Extended Regular Expressions often requires a special option be provided to the command to recognize them. Historically, there is a command called egrep, which is similar to grep, but is able to understand their usage. Now, the egrep command is deprecated in favor of using grep with the -E option. The following regular expressions are considered "extended":
RE
Meaning
?
Matches previous character zero or one time, so it is an optional character
+
Matches previous character repeated one or more times
|
Alternation or like a logical or operator
Some extended regular expressions examples: Command
Meaning
Matches
grep -E 'colou?r' 2.txt
Match colo following by zero or one u character
color colour
grep -E 'd+' 2.txt
Match one or more d characters
d dd ddd dddd
grep -E 'gray|grey' 2.txt
Match either gray or grey
gray grey
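The three extended operators can be exercised against a small word list. The sketch below assumes a POSIX shell and a grep that supports -E; the file words.txt and its contents are made up for this illustration and are not the 2.txt file used in the examples.

```shell
workdir=$(mktemp -d)
printf 'color\ncolour\ncolouur\ngray\ngrey\ngreen\n' > "$workdir/words.txt"

# ? makes the preceding u optional: color and colour match, colouur does not.
optional_u=$(grep -E 'colou?r' "$workdir/words.txt")

# + requires at least one u: colour and colouur match, color does not.
one_or_more_u=$(grep -E 'colou+r' "$workdir/words.txt")

# | matches either alternative: gray and grey match, green does not.
either_spelling=$(grep -E 'gray|grey' "$workdir/words.txt")

printf 'colou?r:\n%s\n\n' "$optional_u"
printf 'colou+r:\n%s\n\n' "$one_or_more_u"
printf 'gray|grey:\n%s\n' "$either_spelling"

rm -rf "$workdir"
```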
8.13 xargs Command
The xargs command is used to build and execute command lines from standard input. This command is very helpful when you need to execute a command with a very long list of arguments, which in some cases can result in an error if the list of arguments is too long. The xargs command has an option -0 which makes it read input items separated by a null character instead of whitespace and disables the end-of-file string, so quotes and backslashes are taken literally. This allows the use of arguments containing spaces, quotes, or backslashes. The xargs command is useful for allowing commands to be executed more efficiently. Its goal is to build the command line for a command to execute as few times as possible with as many arguments as possible, rather than to execute the command many times with one argument each time.
The xargs command functions by breaking up the list of arguments into sublists and executing the command with each sublist. Each sublist is kept small enough to stay within the maximum command line size that the system allows, and therefore avoids an "Argument list too long" error. The following example shows a scenario where the xargs command allowed for many files to be removed, where using a normal wildcard (glob) character failed:
sysadmin@localhost:~/many$ rm *
bash: /bin/rm: Argument list too long
sysadmin@localhost:~/many$ ls | xargs rm
sysadmin@localhost:~/many$
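The sublist behavior and the -0 option can both be seen on a small scale. This is a hedged sketch, assuming a shell with GNU or BSD versions of find and xargs (the -print0 and -0 options are not part of POSIX); the file names and variables are hypothetical. The -n 2 option is used only to force small sublists so that the batching is visible.

```shell
workdir=$(mktemp -d)

# -n 2 caps each sublist at two arguments, so echo runs three times.
batches=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)
printf 'batches:\n%s\n' "$batches"

# Create a few files, including one with a space in its name.
touch "$workdir/a.tmp" "$workdir/b.tmp" "$workdir/my file.tmp"

# find -print0 ends each name with a null byte and xargs -0 splits on
# null bytes, so "my file.tmp" reaches rm as a single argument.
find "$workdir" -name '*.tmp' -print0 | xargs -0 rm

remaining=$(find "$workdir" -type f | wc -l | tr -d ' ')
echo "files left: $remaining"

rmdir "$workdir"
```

Piping plain ls output into xargs, as in the transcript above, splits on whitespace, so it is only safe when no file name contains spaces or quotes.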
Lab 8
8.1 Introduction
This is Lab 8: Pipes, Redirection and REGEX. By performing this lab, students will learn how to redirect text streams, use regular expressions, and use commands for filtering text files. In this lab, you will perform the following tasks:
1. Learn how to redirect and pipe standard input, output and error channels.
2. Use regular expressions to filter output of commands or file content.
3. View large files or command output with programs for paging, and viewing selected portions.
8.2 Command Line Pipes and Redirection
Normally, when you execute a command, the output is displayed in the terminal window. This output channel is called standard output, symbolized by the term stdout. The file descriptor number for this channel is 1. Standard error (stderr) is the channel that receives error messages produced during the execution of a command; it has a file descriptor of 2. Error messages are also sent to the terminal window by default. In this lab, you will use characters that redirect the output from standard output (stdout) and standard error (stderr) to a file or to another command instead of the terminal screen. Standard input, stdin, usually is provided by you to a command by typing on the keyboard; it has a file descriptor of 0. However, by redirecting standard input, files can also be used as stdin.
8.2.1 Step 1
Use the redirection symbol > to redirect the output from the normal output of stdout (terminal) to a file. Type the following:
echo "Hello World"
echo "Hello World" > mymessage
cat mymessage
Your output should be similar to the following: sysadmin@localhost:~$ echo "Hello World"
Hello World sysadmin@localhost:~$ echo "Hello World" > mymessage sysadmin@localhost:~$ cat mymessage
Hello World sysadmin@localhost:~$
The first command echoes the message (stdout) to the terminal. The second command redirects the output; instead of sending it to the terminal, the output is sent to a file called mymessage. The last command displays the contents of the mymessage file.
8.2.2 Step 2
When you use the > symbol to redirect stdout, the contents of the file are first destroyed. Type the following commands to see a demonstration: cat mymessage echo Greetings > mymessage cat mymessage
Your output should be similar to the following: sysadmin@localhost:~$ cat mymessage
Hello World sysadmin@localhost:~$ echo Greetings > mymessage sysadmin@localhost:~$ cat mymessage
Greetings sysadmin@localhost:~$
Notice that using one redirection symbol overwrites an existing file. This is called "clobbering" a file.
8.2.3 Step 3 You can avoid clobbering a file by using >> instead of >. By using >> you append to a file. Execute the following commands to see a demonstration of this: cat mymessage echo "How are you?" >> mymessage cat mymessage
Your output should be similar to the following: sysadmin@localhost:~$ cat mymessage
Greetings sysadmin@localhost:~$ echo "How are you?" >> mymessage sysadmin@localhost:~$ cat mymessage
Greetings How are you? sysadmin@localhost:~$
Notice that by using >> all existing data is preserved and the new data is appended at the end of the file.
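The difference between > and >> can be confirmed in a few lines. The following is a minimal sketch, assuming a POSIX shell; it writes to a scratch copy of mymessage in a temporary directory, and the variable names are hypothetical.

```shell
workdir=$(mktemp -d)
msg="$workdir/mymessage"

echo "Hello World" > "$msg"      # creates the file
echo "Greetings" > "$msg"        # > empties the file first, so Hello World is lost
echo "How are you?" >> "$msg"    # >> appends, so Greetings is kept

contents=$(cat "$msg")
line_count=$(wc -l < "$msg" | tr -d ' ')

printf '%s\n' "$contents"
echo "lines in file: $line_count"

rm -rf "$workdir"
```

In bash, running set -o noclobber beforehand makes > refuse to overwrite an existing file, which can help guard against accidental clobbering.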
8.2.4 Step 4 The find command is a good command to demonstrate how stderr works. This command searches the filesystem for files based on criteria such as filename. Run the following command and observe the output: find /etc -name hosts
Your output should be similar to the following: sysadmin@localhost:~$ find /etc -name hosts
find: `/etc/ssl/private': Permission denied /etc/hosts sysadmin@localhost:~$
Notice the error message indicating you do not have permission to access certain files/directories. This is because as a regular user, you don't have the right to "look inside" some directories. These types of error messages are sent to stderr, not stdout.
The find command will be covered in greater detail later. The command is just being used now to demonstrate the difference between stdout and stderr.
8.2.5 Step 5 To redirect stderr (error messages) to a file issue the following command: find /etc -name hosts 2> err.txt cat err.txt
Your output should be similar to the following: sysadmin@localhost:~$ find /etc -name hosts 2> err.txt
/etc/hosts sysadmin@localhost:~$ cat err.txt
find: `/etc/ssl/private': Permission denied sysadmin@localhost:~$
Recall that the file descriptor for stderr is the number 2, so it is used along with the > symbol to redirect the stderr output to a file called err.txt. Note that 1> is the same as >.
Note: The previous example demonstrates why knowing redirection is important. If you want to "ignore" the errors that the find command displays, you can redirect those messages into a file and look at them later, making it easier to focus on the rest of the output of the command.
8.2.6 Step 6 You can also redirect stdout and stderr into two separate files. find /etc -name hosts > std.out 2> std.err cat std.err cat std.out
Your output should be similar to the following: sysadmin@localhost:~$ find /etc -name hosts > std.out 2> std.err sysadmin@localhost:~$ cat std.err
find: `/etc/ssl/private': Permission denied sysadmin@localhost:~$ cat std.out
/etc/hosts sysadmin@localhost:~$
Notice that a space is permitted but not required after the > redirection symbol.
8.2.7 Step 7 To redirect both standard output (stdout) and standard error (stderr) to one file, first redirect stdout to a file and then redirect stderr to that same file by using the notation 2>&1. find /etc -name hosts >find.out 2>&1 cat find.out
Your output should be similar to the following: sysadmin@localhost:~$ find /etc -name hosts > find.out 2>&1 sysadmin@localhost:~$ cat find.out
find: `/etc/ssl/private': Permission denied /etc/hosts sysadmin@localhost:~$
The 2>&1 part of the command means send the stderr (channel 2) to the same place where stdout (channel 1) is going.
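The redirections from the last few steps can be reproduced without depending on the permissions of /etc. The sketch below assumes a POSIX shell; both_streams is a hypothetical helper function that writes one line to each channel, and the file names are made up for this illustration. The last command shows that the order of the redirections matters.

```shell
workdir=$(mktemp -d)

# Hypothetical helper: one line to stdout, one line to stderr.
both_streams() {
    echo "normal output"
    echo "error output" >&2
}

# Separate files: stdout to std.out, stderr to std.err.
both_streams > "$workdir/std.out" 2> "$workdir/std.err"
out_only=$(cat "$workdir/std.out")
err_only=$(cat "$workdir/std.err")

# One file: redirect stdout first, then send stderr to the same place.
both_streams > "$workdir/all.out" 2>&1
combined=$(cat "$workdir/all.out")

# Reversed order: 2>&1 is applied while stdout still points at the
# command substitution, so stderr is captured and only stdout is saved.
captured=$(both_streams 2>&1 > "$workdir/swapped.out")
saved=$(cat "$workdir/swapped.out")

echo "std.out: $out_only"
echo "std.err: $err_only"
printf 'all.out:\n%s\n' "$combined"
echo "captured with reversed order: $captured"

rm -rf "$workdir"
```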
8.2.8 Step 8 Standard input (stdin) can also be redirected. Normally stdin comes from the keyboard, but sometimes you want it to come from a file instead. For example, the tr command translates characters, but it only accepts data from stdin, never from a filename given as an argument. This is great when you want to do something like capitalize data that is inputted from the keyboard (Note: Press Control+d, to signal the tr command to stop processing standard input): tr a-z A-Z this is interesting how do I stop this? ^D
Your output should be similar to the following: sysadmin@localhost:~$ tr a-z A-Z
this is interesting THIS IS INTERESTING how do I stop this? HOW DO I STOP THIS? sysadmin@localhost:~$
Note: ^D symbolizes Control+d
8.2.9 Step 9 The tr command accepts keyboard input (stdin), translates the characters and then sends the output to stdout. To create a file of all lower case characters, execute the following: tr A-Z a-z > myfile Wow, I SEE NOW This WORKS!
Your output should be similar to the following: sysadmin@localhost:~$ tr A-Z a-z > myfile
Wow, I SEE NOW
This WORKS! sysadmin@localhost:~$
Press the Enter key to make sure your cursor is on the line below "This WORKS!", then use Control+d to stop input. To verify you created the file, execute the following command: cat myfile
Your output should be similar to the following: sysadmin@localhost:~$ cat myfile
wow, i see now this works! sysadmin@localhost:~$
8.2.10 Step 10 Execute the following commands to use the tr command by redirecting stdin from a file: cat myfile tr a-z A-Z < myfile
Your output should be similar to the following: sysadmin@localhost:~$ cat myfile
wow, i see now this works! sysadmin@localhost:~$ tr a-z A-Z < myfile
WOW, I SEE NOW THIS WORKS! sysadmin@localhost:~$
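Input and output redirection can be combined on one command line. This is a brief sketch, assuming a POSIX shell; it uses a scratch copy of myfile, and the output file name myfile.upper is hypothetical.

```shell
workdir=$(mktemp -d)
printf 'wow, i see now\nthis works!\n' > "$workdir/myfile"

# < feeds the file to tr on stdin; the result goes to the terminal (stdout).
upper=$(tr 'a-z' 'A-Z' < "$workdir/myfile")

# < and > together: read from one file and write the result to another.
tr 'a-z' 'A-Z' < "$workdir/myfile" > "$workdir/myfile.upper"
saved=$(cat "$workdir/myfile.upper")

printf '%s\n' "$upper"

rm -rf "$workdir"
```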
8.2.11 Step 11 Another popular form of redirection is to take the output of one command and send it into another command as input. For example, the output of some commands can be massive, resulting in the output scrolling off the screen too quickly to read. Execute the following command to take the output of the ls command and send it into the more command which displays one page of data at a time:
ls -l /etc | more
Your output should be similar to the following: sysadmin@localhost:~$ ls -l /etc | more
total 372
-rw-r--r-- 1 root root 2981 Jan 28  2015 adduser.conf
-rw-r--r-- 1 root root   10 Jan 28  2015 adjtime
drwxr-xr-x 1 root root  900 Jan 29  2015 alternatives
drwxr-xr-x 1 root root  114 Jan 29  2015 apparmor.d
drwxr-xr-x 1 root root  168 Oct  1  2014 apt
-rw-r--r-- 1 root root 2076 Apr  3  2012 bash.bashrc
drwxr-xr-x 1 root root   72 Jan 28  2015 bash_completion.d
drwxr-sr-x 1 root bind  342 Jan 29  2015 bind
-rw-r--r-- 1 root root  356 Apr 19  2012 bindresvport.blacklist
-rw-r--r-- 1 root root  321 Mar 30  2012 blkid.conf
lrwxrwxrwx 1 root root   15 Jun 18  2014 blkid.tab -> /dev/.blkid.tab
drwxr-xr-x 1 root root   16 Jan 29  2015 ca-certificates
-rw-r--r-- 1 root root 7464 Jan 29  2015 ca-certificates.conf
drwxr-xr-x 1 root root   14 Jan 29  2015 calendar
drwxr-xr-x 1 root root   24 Jan 29  2015 cron.d
drwxr-xr-x 1 root root  134 Jan 29  2015 cron.daily
drwxr-xr-x 1 root root   24 Jan 29  2015 cron.hourly
drwxr-xr-x 1 root root   24 Jan 29  2015 cron.monthly
-rw-r--r-- 1 root root 2969 Mar 15  2012 debconf.conf
--More--
You will need to press the spacebar to continue or you can also press CTRL+c to escape this listing. The cut command is useful for extracting fields from files that are either delimited by a character, like the colon (:) in /etc/passwd, or that have a fixed width. It will be used in the next few examples as it typically provides a great deal of output that we can use to demonstrate using the | character.
8.2.12 Step 12 In the following example, you will use a command called cut to extract all of the usernames from a database called /etc/passwd (a file that contains user account information). First, try running the command cut by itself:
cut -d: -f1 /etc/passwd
A portion of the command output is shown in the graphic below.
sysadmin@localhost:~$ cut -d: -f1 /etc/passwd
root daemon bin sys sync games man lp mail news uucp proxy www-data backup list irc gnats nobody libuuid syslog bind sshd operator
8.2.13 Step 13 The output in the previous example was unordered and scrolled off the screen. In the next step you are going to take the output of the cut command and send it into the sort command to provide some order to the output: cut -d: -f1 /etc/passwd | sort
A portion of the command output is shown in the graphic below.
sysadmin@localhost:~$ cut -d: -f1 /etc/passwd | sort
backup bin bind daemon games gnats irc libuuid list lp mail man news nobody operator proxy root sshd sync sys
8.2.14 Step 14
Now the output is sorted, but it still scrolls off the screen. Send the output of the sort command to the more command to solve this problem:
cut -d: -f1 /etc/passwd | sort | more
sysadmin@localhost:~$ cut -d: -f1 /etc/passwd | sort | more
backup bin bind daemon games gnats irc libuuid
list lp mail man news nobody operator proxy root sshd sync sys sysadmin syslog uucp
--More--
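The same kind of pipeline can be tested non-interactively by ending it with head instead of a pager. The following is a minimal sketch, assuming a POSIX shell; passwd.sample is a hypothetical four-line stand-in for /etc/passwd so that the result does not depend on the accounts present on a particular system.

```shell
workdir=$(mktemp -d)

# A small stand-in for /etc/passwd with the same colon-delimited layout.
cat > "$workdir/passwd.sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
EOF

# Stage 1 extracts field 1 (the username); stage 2 sorts the names.
sorted_names=$(cut -d: -f1 "$workdir/passwd.sample" | sort)

# Stage 3 keeps only the first two lines of the sorted list.
first_two=$(cut -d: -f1 "$workdir/passwd.sample" | sort | head -n 2)

printf 'sorted:\n%s\n\n' "$sorted_names"
printf 'first two:\n%s\n' "$first_two"

rm -rf "$workdir"
```

Each | connects the stdout of the command on its left to the stdin of the command on its right, so no intermediate files are needed.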
8.3 Using find to Find Files
In this task, you will use the find command to find files. The find command is a very flexible command with a host of options allowing users to locate files based on a vast array of criteria such as file name, size, date, type and permission. The basic command construct is: find -criteria
The command will begin the search in the directory specified and recursively search all of the subdirectories.
8.3.1 Step 1
Search for files beginning in your home directory containing the name bash.
find ~ -name "*bash*"
Your output should be similar to the following:
sysadmin@localhost:~$ find ~ -name "*bash*"
/home/sysadmin/.bash_logout
/home/sysadmin/.bashrc sysadmin@localhost:~$
Remember that ~ is used to represent your home directory.
8.3.2 Step 2
Find files that were modified (or created) less than 5 minutes ago in the specified directory by using the following commands:
find ~/Music -mmin -5
touch ~/Music/mysong
find ~/Music -mmin -5
Your output should be similar to the following:
sysadmin@localhost:~$ find ~/Music -mmin -5
sysadmin@localhost:~$ touch ~/Music/mysong
sysadmin@localhost:~$ find ~/Music -mmin -5
/home/sysadmin/Music /home/sysadmin/Music/mysong sysadmin@localhost:~$
The first find command does not find files that were modified within the time specified. You then created a file by using the touch command and ran the find command again, resulting in the find command discovering the new file. The Music directory was displayed with the second find command because the directory was modified, the result of a file being added to the directory.
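The time test can be checked against files whose modification times are known. This is a sketch under the assumption of a shell with GNU or BSD find (the -mmin test is not part of POSIX); the Music directory here is created inside a scratch directory, and the file oldsong is hypothetical, given an old timestamp with touch -t.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/Music"

# A file modified right now, and one whose timestamp is set to 1 Jan 2000.
touch "$workdir/Music/mysong"
touch -t 200001010000 "$workdir/Music/oldsong"

# -mmin -5 selects entries modified less than 5 minutes ago;
# -type f limits the result to regular files, leaving out the directory.
recent_files=$(find "$workdir/Music" -type f -mmin -5)

echo "modified in the last 5 minutes: $recent_files"

rm -rf "$workdir"
```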
8.3.3 Step 3
Execute the following command to find files in the /usr directory that are larger than 2MB in size:
find /usr -size +2M
Your output should be similar to the following:
sysadmin@localhost:~$ find /usr -size +2M
/usr/bin/python2.7
/usr/lib/perl/5.14.2/auto/Encode/JP/JP.so /usr/lib/perl/5.14.2/auto/Encode/KR/KR.so /usr/share/GeoIP/GeoIPv6.dat /usr/share/file/magic.mgc sysadmin@localhost:~$
8.3.4 Step 4
Find files of type “directory” in the specified location.
find /usr/share/bug -type d
Your output should be similar to the following:
sysadmin@localhost:~$ find /usr/share/bug -type d
/usr/share/bug /usr/share/bug/apt /usr/share/bug/gnupg /usr/share/bug/initramfs-tools /usr/share/bug/procps /usr/share/bug/cron /usr/share/bug/file /usr/share/bug/libmagic1 /usr/share/bug/logrotate /usr/share/bug/man-db /usr/share/bug/vim-tiny sysadmin@localhost:~$
8.3.5 Step 5
To verify that the output displays directories, use the -ls option. The find command uses the -print option by default, which displays just file names. The -ls option provides file details:
find /usr/share/bug -type d -ls
Your output should be similar to the following:
sysadmin@localhost:~$ find /usr/share/bug -type d -ls
 2280 0 drwxr-xr-x 1 root root 138 Jan 29 2015 /usr/share/bug
 2281 0 drwxr-xr-x 1 root root  12 Jan 28 2015 /usr/share/bug/apt
 2283 0 drwxr-xr-x 1 root root  14 Jan 28 2015 /usr/share/bug/gnupg
 2285 0 drwxr-xr-x 1 root root  12 Jan 28 2015 /usr/share/bug/initramfs-tools
 2287 0 drwxr-xr-x 1 root root  14 Jan 28 2015 /usr/share/bug/procps
10784 0 drwxr-xr-x 1 root root  26 Jan 29 2015 /usr/share/bug/cron
10787 0 drwxr-xr-x 1 root root  28 Jan 29 2015 /usr/share/bug/file
10790 0 drwxr-xr-x 1 root root  28 Jan 29 2015 /usr/share/bug/libmagic1
10793 0 drwxr-xr-x 1 root root  12 Jan 29 2015 /usr/share/bug/logrotate
10795 0 drwxr-xr-x 1 root root  14 Jan 29 2015 /usr/share/bug/man-db
10797 0 drwxr-xr-x 1 root root  26 Jan 29 2015 /usr/share/bug/vim-tiny
sysadmin@localhost:~$
Recall that the d character before the permissions rwxr-xr-x indicates that the file is actually a directory.
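The type and size tests can be combined with -ls on a small directory tree of known contents. The sketch below assumes a shell with GNU or BSD find (the M size suffix and the -ls action are not part of POSIX) and dd; the directory layout and the 3 MB file big.bin are hypothetical and are created only for this illustration.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/bug/apt" "$workdir/bug/cron"
touch "$workdir/bug/apt/script"

# Create a 3 MB file of zero bytes (3072 blocks of 1024 bytes).
dd if=/dev/zero of="$workdir/bug/big.bin" bs=1024 count=3072 2>/dev/null

# -type d selects only directories: bug, bug/apt and bug/cron.
dir_count=$(find "$workdir/bug" -type d | wc -l | tr -d ' ')

# -type f with -size +2M selects regular files larger than 2 MB.
big_file=$(find "$workdir/bug" -type f -size +2M)

# -ls prints details for each match; every line shows a leading d.
find "$workdir/bug" -type d -ls

echo "directories found: $dir_count"
echo "large file: $big_file"

rm -rf "$workdir"
```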
8.4 Viewing Large Text Files
Although large text files can be viewed with the cat command, it is inconvenient to scroll backwards towards the top of the file. Additionally, really large files can't be displayed in this manner because the terminal window only stores a specific number of lines of output in memory. The use of the more or less commands allows the user to view the data a "page" or a line at a time. These "pager" commands also permit other forms of navigation and searching that will be demonstrated in this section. Note: examples are given using both the more and less commands. Most of the time, the commands work the same, however the less command is more advanced and has more features. The more command is still important to know because some Linux distributions don't have the less command, but all Linux distributions have the more command.
If you are not interested in viewing the entire file or output of a command, then there are numerous commands that are able to filter the contents of the file or output. In this section, you will learn the use of the head and tail commands to be able to extract information from the top or bottom of the output of a command or file contents.
8.4.1 Step 1 The /etc/passwd is likely too large to be displayed on the screen without scrolling the screen. To see a demonstration of this, use the cat command to display the entire contents of the /etc/passwd file: cat /etc/passwd
Your output should be similar to the following:
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13:proxy:/bin:/bin/sh
www-data:x:33:33:www-data:/var/www:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh
list:x:38:38:Mailing List Manager:/var/list:/bin/sh
irc:x:39:39:ircd:/var/run/ircd:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:103::/home/syslog:/bin/false
bind:x:102:105::/var/cache/bind:/bin/false
sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin
operator:x:1000:37::/root:/bin/sh
sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash
sysadmin@localhost:~$
8.4.2 Step 2
Use the more command to display the entire contents of the /etc/passwd file:
more /etc/passwd
Your output should be similar to the following: root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/bin/sh bin:x:2:2:bin:/bin:/bin/sh sys:x:3:3:sys:/dev:/bin/sh sync:x:4:65534:sync:/bin:/bin/sync games:x:5:60:games:/usr/games:/bin/sh man:x:6:12:man:/var/cache/man:/bin/sh lp:x:7:7:lp:/var/spool/lpd:/bin/sh mail:x:8:8:mail:/var/mail:/bin/sh news:x:9:9:news:/var/spool/news:/bin/sh uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh proxy:x:13:13:proxy:/bin:/bin/sh www-data:x:33:33:www-data:/var/www:/bin/sh backup:x:34:34:backup:/var/backups:/bin/sh list:x:38:38:Mailing List Manager:/var/list:/bin/sh irc:x:39:39:ircd:/var/run/ircd:/bin/sh gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh nobody:x:65534:65534:nobody:/nonexistent:/bin/sh libuuid:x:100:101::/var/lib/libuuid:/bin/sh syslog:x:101:103::/home/syslog:/bin/false bind:x:102:105::/var/cache/bind:/bin/false sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin operator:x:1000:37::/root:/bin/sh
--More--(92%)
Note: The --More--(92%) indicates you are "in" the more command and 92% through the current data.
8.4.3 Step 3
While you are in the more command, you can view the help screen by pressing the h key: h
Your output should be similar to the following:
Most commands optionally preceded by integer argument k.  Defaults in brackets.
Star (*) indicates argument becomes new default.
-------------------------------------------------------------------------------
<space>                 Display next k lines of text [current screen size]
z                       Display next k lines of text [current screen size]*
<return>                Display next k lines of text [1]*
d or ctrl-D             Scroll k lines [current scroll size, initially 11]*
q or Q or <interrupt>   Exit from more
s                       Skip forward k lines of text [1]
f                       Skip forward k screenfuls of text [1]
b or ctrl-B             Skip backwards k screenfuls of text [1]
'                       Go to place where previous search started
=                       Display current line number
/<regular expression>   Search for kth occurrence of regular expression [1]
n                       Search for kth occurrence of last r.e [1]
!<cmd> or :!<cmd>       Execute <cmd> in a subshell
v                       Start up /usr/bin/vi at current line
ctrl-L                  Redraw screen
:n                      Go to kth next file [1]
:p                      Go to kth previous file [1]
:f                      Display current file name and line number
.                       Repeat previous command
-------------------------------------------------------------------------------
--More--(92%)
8.4.4 Step 4 Press the Spacebar to view the rest of the document:
In the next example, you will learn how to search a document using either the more or less commands. Searching for a pattern within both the more and less commands is done by typing the slash, / , followed by the pattern to find. If a match is found, the screen should scroll to the first match. To move forward to the next match, press the n key. With the less command you can also move backwards to previous matches by pressing the N (capital n) key.
8.4.5 Step 5 Use the less command to display the entire contents of the /etc/passwd file. Then search for the word bin, use n to move forward, and N to move backwards. Finally, quit the less pager by typing the letter q: less /etc/passwd /bin nnnNNNq
Important: Unlike the more command which automatically exits when you reach the end of a file, you must press a quit key such as q to quit the less program.
8.4.6 Step 6 You can use the head command to display the top part of a file. By default, the head command will display the first ten lines of the file: head /etc/passwd
Your output should be similar to the following: sysadmin@localhost:~$ head /etc/passwd
root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/bin/sh bin:x:2:2:bin:/bin:/bin/sh sys:x:3:3:sys:/dev:/bin/sh sync:x:4:65534:sync:/bin:/bin/sync games:x:5:60:games:/usr/games:/bin/sh man:x:6:12:man:/var/cache/man:/bin/sh lp:x:7:7:lp:/var/spool/lpd:/bin/sh mail:x:8:8:mail:/var/mail:/bin/sh news:x:9:9:news:/var/spool/news:/bin/sh
sysadmin@localhost:~$
8.4.7 Step 7 Use the tail command to display the last ten lines of the /etc/passwd file: tail /etc/passwd
Your output should be similar to the following: sysadmin@localhost:~$ tail /etc/passwd
list:x:38:38:Mailing List Manager:/var/list:/bin/sh
irc:x:39:39:ircd:/var/run/ircd:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:103::/home/syslog:/bin/false
bind:x:102:105::/var/cache/bind:/bin/false
sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin
operator:x:1000:37::/root:/bin/sh
sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash
sysadmin@localhost:~$
8.4.8 Step 8 Use the head command to display the first two lines of the /etc/passwd file: head -2 /etc/passwd
Your output should be similar to the following: sysadmin@localhost:~$ head -2 /etc/passwd
root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/bin/sh sysadmin@localhost:~$
8.4.9 Step 9
Execute the following command line to pipe the output of the ls command to the tail command, displaying the last five file names in the /etc directory: ls /etc | tail -5
Your output should be similar to the following:
sysadmin@localhost:~$ ls /etc | tail -5
update-motd.d updatedb.conf vim wgetrc xml sysadmin@localhost:~$
As you've seen, both head and tail commands output ten lines by default. You could also use an option -# (or you can use the option -n #, where # is a number of lines to output). Both commands can be used to read standard input from a pipe that receives output from a command. You have also seen where head and tail commands are different: the head command starts counting its lines to output from the top of the data, whereas the tail command counts the number of lines to output from the bottom of the data. There are some additional differences between these two commands as demonstrated in the next few tasks.
8.4.10 Step 10
Another way to specify how many lines to output with the head command is to use the option -n -#, where # is the number of lines counted from the bottom of the output to exclude. Notice the minus symbol - in front of the #. For example, if the /etc/passwd file contains 24 lines, the following command will display lines 1-4, excluding the last twenty lines:
head -n -20 /etc/passwd
sysadmin@localhost:~$ head -n -20 /etc/passwd
root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/bin/sh bin:x:2:2:bin:/bin:/bin/sh sys:x:3:3:sys:/dev:/bin/sh sysadmin@localhost:~$
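The line-counting options can be verified on a file whose contents make the line numbers obvious. This is a small sketch, assuming a shell with the GNU versions of head and tail plus the seq command; the negative count for head -n is a GNU extension and may not work elsewhere. The file numbers.txt is hypothetical and holds the numbers 1 to 24, one per line.

```shell
workdir=$(mktemp -d)
seq 1 24 > "$workdir/numbers.txt"    # 24 lines: 1, 2, ... 24

# Count from the top: the first two lines.
first_two=$(head -n 2 "$workdir/numbers.txt")

# Count from the bottom: the last two lines.
last_two=$(tail -n 2 "$workdir/numbers.txt")

# Negative count: everything except the last 20 lines, which leaves lines 1-4.
all_but_last_20=$(head -n -20 "$workdir/numbers.txt")

echo "first two:        $(echo $first_two)"
echo "last two:         $(echo $last_two)"
echo "all but last 20:  $(echo $all_but_last_20)"

rm -rf "$workdir"
```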
8.5 Searching Text Using Regular Expressions In this task, you will use the grep family of commands with regular expressions in order to search for a specific string of characters in a stream of data (for example, a text file). The grep command uses basic regular expressions, special characters like wildcards that match patterns in data. The grep command returns the entire line containing the pattern that matches. The -E option to the grep command can be used to perform searches with extended regular expressions, essentially more powerful regular expressions. Another way to use extended regular expressions is to use the egrep command. The fgrep command is used to match literal characters, ignoring the special meaning of regular expression characters.
8.5.1 Step 1 The use of grep in its simplest form is to search for a given string of characters, such as sshd in the /etc/passwd file. The grep command will print the entire line containing the match: cd /etc grep sshd passwd
Your output should be similar to the following:
sysadmin@localhost:~$ cd /etc
sysadmin@localhost:/etc$ grep sshd passwd
sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin
sysadmin@localhost:/etc$
8.5.2 Step 2
Regular expression matching with grep is "greedy" in the sense that every single instance of the specified pattern is matched, including multiple instances on the same line:
grep root passwd
Your output should be similar to the following:
sysadmin@localhost:/etc$ grep root passwd
root:x:0:0:root:/root:/bin/bash
operator:x:1000:37::/root:/bin/sh
sysadmin@localhost:/etc$
Note the red highlights indicate what exactly was matched. You can also see that all occurrences of root were matched on each line.
8.5.3 Step 3 To limit the output, you can use regular expressions to specify a more precise pattern. For example, the caret ( ^ ) character can be used to match a pattern at the beginning of a line; so when you execute the following command line, only lines that begin with root should be matched and displayed : grep '^root' passwd
Your output should be similar to the following:
sysadmin@localhost:/etc$ grep '^root' passwd
root:x:0:0:root:/root:/bin/bash
sysadmin@localhost:/etc$
Note that there are two additional instances of the word root but only the one appearing at the beginning of the line is matched (displayed in red). Best Practice: Use single quotes (not double quotes) around regular expressions to prevent the shell program from trying to interpret them.
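What the shell does to a pattern before grep ever sees it can be observed by substituting echo for the real command. The following is a minimal sketch, assuming a POSIX shell; the file cdrom.txt and the variable name are hypothetical and chosen only to make the shell's interpretation visible.

```shell
workdir=$(mktemp -d)
touch "$workdir/cdrom.txt" "$workdir/example.txt"

# Unquoted: the shell treats cd* as a file glob and replaces it with cdrom.txt.
unquoted=$(cd "$workdir" && echo grep cd* example.txt)

# Single quotes: the pattern is passed through exactly as typed.
single_quoted=$(cd "$workdir" && echo grep 'cd*' example.txt)

# Double quotes stop globbing but still expand variables such as $name.
name=root
double_quoted=$(echo "^$name")
literal=$(echo '^$name')

echo "unquoted:      $unquoted"
echo "single quoted: $single_quoted"
echo "double quoted: $double_quoted"
echo "literal:       $literal"

rm -rf "$workdir"
```

Double quotes are sometimes useful precisely because they expand variables, for example when the text to search for is stored in a shell variable.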
8.5.4 Step 4 Match the pattern sync anywhere on a line: grep 'sync' passwd
Your output should be similar to the following:
sysadmin@localhost:/etc$ grep 'sync' passwd
sync:x:4:65534:sync:/bin:/bin/sync
sysadmin@localhost:/etc$
8.5.5 Step 5 Use the $ symbol to match the pattern sync at the end of a line:
grep 'sync$' passwd
Your output should be similar to the following: sysadmin@localhost:/etc$ grep 'sync$' passwd
sync:x:4:65534:sync:/bin:/bin/sync
sysadmin@localhost:/etc$
The first command matches every instance; the second only matches the instance at the end of the line.
8.5.6 Step 6 Use the period character . to match any single character. For example, execute the following command to match any character followed by a 'y': grep '.y' passwd
Your output should be similar to the following:
sysadmin@localhost:/etc$ grep '.y' passwd
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
proxy:x:13:13:proxy:/bin:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
syslog:x:101:103::/home/syslog:/bin/false
sysadmin:x:1001:1001:System Administrator,,,,:/home/sysadmin:/bin/bash
sysadmin@localhost:/etc$
8.5.7 Step 7
The pipe character, |, or "alternation operator", acts as an "or" operator. For example, execute the following to attempt to match either sshd, root or operator:
grep 'sshd|root|operator' passwd
Your output should be similar to the following: sysadmin@localhost:/etc$ grep 'sshd|root|operator' passwd
sysadmin@localhost:/etc$
Observe that the grep command does not recognize the pipe as the alternation operator by default. The grep command is actually including the pipe as a plain character in the pattern to be matched. The use of either grep -E or egrep will allow the use of the extended regular expressions including alternation.
8.5.8 Step 8 Use the -E switch to allow grep to operate in extended mode in order to recognize the alternation operator: grep -E 'sshd|root|operator' passwd
Your output should be similar to the following:
sysadmin@localhost:/etc$ grep -E 'sshd|root|operator' passwd
root:x:0:0:root:/root:/bin/bash
sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin
operator:x:1000:37::/root:/bin/sh
sysadmin@localhost:/etc$
8.5.9 Step 9 Use another extended regular expression, this time with egrep with alternation in a group to match a pattern. The strings nob and non will match: egrep 'no(b|n)' passwd
Your output should be similar to the following:
sysadmin@localhost:/etc$ egrep 'no(b|n)' passwd
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
sysadmin@localhost:/etc$
Note: The parentheses, ( ), were used to limit the "scope" of the | character. Without them, such as nob|n, the pattern would have meant "match either nob or n".
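The scope of the | character, and the difference between basic and extended mode, can be compared on a few words. This is a hedged sketch, assuming a POSIX shell and a grep that supports -E; the word list is hypothetical and deliberately includes one line that contains a literal | character.

```shell
workdir=$(mktemp -d)
printf 'nobody\nnonexistent\nnorth\nsun\nnob|n\n' > "$workdir/words.txt"

# Basic mode: | is an ordinary character, so only the literal text nob|n matches.
basic=$(grep 'nob|n' "$workdir/words.txt")

# Extended mode, no parentheses: "nob" or "n", so every line containing an n matches.
ungrouped_count=$(grep -E -c 'nob|n' "$workdir/words.txt")

# Extended mode with parentheses: "no" followed by b or n.
grouped=$(grep -E 'no(b|n)' "$workdir/words.txt")

echo "basic mode matched: $basic"
echo "ungrouped pattern matched $ungrouped_count lines"
printf 'grouped pattern matched:\n%s\n' "$grouped"

rm -rf "$workdir"
```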
8.5.10 Step 10
The [ ] characters can also be used to match a single character, however unlike the period character ., the [ ] characters are used to specify exactly what character you want to match. For example, if you want to match a numeric character, you can specify [0-9]. Execute the following command for a demonstration: head passwd | grep '[0-9]'
Your output should be similar to the following: sysadmin@localhost:/etc$ head passwd | grep '[0-9]'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
sysadmin@localhost:/etc$
Note: The head command was used to limit the input, and therefore the output, of the grep command.
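Bracket expressions can list individual characters, give a range, or be negated with a leading ^ inside the brackets. The sketch below assumes a POSIX shell and grep; the four-line sample file is hypothetical.

```shell
workdir=$(mktemp -d)
printf 'root:x:0\nabc\nlp:x:7\nXYZ\n' > "$workdir/sample.txt"

# [0-9] matches any single digit, so only lines containing a digit are printed.
has_digit=$(grep '[0-9]' "$workdir/sample.txt")

# [^0-9] matches any single character that is not a digit; anchoring it with
# ^ and $ and repeating it with * selects lines made up entirely of non-digits.
no_digit=$(grep '^[^0-9]*$' "$workdir/sample.txt")

# [rl] matches exactly one character that is either r or l.
starts_r_or_l=$(grep '^[rl]' "$workdir/sample.txt")

printf 'has a digit:\n%s\n\n' "$has_digit"
printf 'no digits:\n%s\n\n' "$no_digit"
printf 'starts with r or l:\n%s\n' "$starts_r_or_l"

rm -rf "$workdir"
```

Note that ^ means "beginning of line" outside the brackets but "not" when it is the first character inside them.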
8.5.11 Step 11
Suppose you want to search for a pattern containing a sequence of three digits. You can use { } characters with a number to express that you want to repeat a pattern a specific number of times; for example, {3}. Written without backslashes, this numeric quantifier requires the extended mode of grep:
grep -E '[0-9]{3}' passwd
Your output should be similar to the following: sysadmin@localhost:/etc$ grep -E '[0-9]{3}' passwd
sync:x:4:65534:sync:/bin:/bin/sync
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:103::/home/syslog:/bin/false