CSCI-1200 Data Structures — Spring 2017
Lecture 1 — Introduction to C++, STL, & Strings

Co-Instructors
email: ds [email protected]
Professor Herbert Holzbauer, Materials Research Center (MRC) 304, x8114, [email protected]
Professor William Thompson, Amos Eaton (AE) 205, x6861, [email protected]

Today
• Discussion of Website & Syllabus: http://www.cs.rpi.edu/academics/courses/spring17/ds/
• Getting Started in C++ & STL, C++ Syntax, STL Strings

1.1 Transitioning from Python to C++ (from CSCI-1100 Computer Science 1)
• Python is a great language to learn the power and flexibility of programming and computational problem solving. This semester we will work in C++ and study lower level programming concepts, focusing on details including efficiency and memory usage.
• Outside of this class, when working on large programming projects, you will find it is not uncommon to use a mix of programming languages and libraries. The individual advantages of Python and C++ (and Java, and Perl, and C, and UNIX bash scripts, and ... ) can be combined into an elegant (or terrifyingly complex) masterpiece.
• Here are a few excellent references recommended to help you transition from Python to C++:
    http://cs.slu.edu/~goldwasser/publications/python2cpp.pdf
    http://www4.wittenberg.edu/academics/mathcomp/shelburne/comp255/notes/Python2Cpp.pdf

1.2 Compiled Languages vs. Interpreted Languages
• C/C++ is a compiled language, which means your code is processed (compiled & linked) to produce a low-level machine language executable that can be run on your specific hardware. You must re-compile & re-link after you edit any of the files, although a smart development environment or Makefile will figure out what portions need to be recompiled and save some time (especially on large programming projects with many lines of code and many files). Also, if you move your code to a different computer you will usually need to recompile. Generally the extra work of compilation produces an efficient and optimized executable that will run fast.
• In contrast, many newer languages including Python, Java, & Perl are interpreted languages that favor incremental development, where you can make changes to your code and immediately run all or some of your code without waiting for compilation. However, an interpreted program will often run slower than a compiled program.
• These days, the process of compilation is almost instantaneous for simple programs, and in this course we encourage you to follow the same incremental editing & frequent testing development strategy that is employed with interpreted languages.
• Finally, many interpreted languages have a Just-In-Time-Compiler (JIT) that can run an interpreted programming language and perform optimization on-the-fly, resulting in program performance that rivals optimized compiled code. Thus, the differences between compiled and interpreted languages are somewhat blurry.
• You will practice the cycle of coding & compilation & testing during Lab 1. You are encouraged to try out different development environments (code editor & compiler) and quickly settle on one that allows you to be most productive. Ask your lab TAs & mentors about their favorite programming environments! The course website includes many helpful links as well.
• As you see in today’s handout, C++ has more required punctuation than Python, and the syntax is more restrictive. The compiler will proofread your code in detail and complain about any mistakes you make. Even long-time C++ programmers make mistakes in syntax, and with practice you will become familiar with the compiler’s error messages and how to correct your code.

1.3 A Sample C++ Program: Find the Roots of a Quadratic Polynomial

    #include <iostream>   // library for reading & writing from the console/keyboard
    #include <cmath>      // library with the square root function & absolute value
    #include <cstdlib>    // library with the exit function

    // Returns true if the candidate root is indeed a root of the polynomial a*x*x + b*x + c = 0
    bool check_root(int a, int b, int c, float root) {
      // plug the value into the formula
      float check = a * root * root + b * root + c;
      // see if the absolute value is zero (within a small tolerance)
      if (fabs(check) > 0.0001) {
        std::cerr << "ERROR: " << root << " is not a root of this formula." << std::endl;
        return false;
      } else {
        return true;
      }
    }

    /* Use the quadratic formula to find the two real roots of polynomial. Returns
       true if the roots are real, returns false if the roots are imaginary. If the roots
       are real, they are returned through the reference parameters root_pos and root_neg. */
    bool find_roots(int a, int b, int c, float &root_pos, float &root_neg) {
      // compute the quantity under the radical of the quadratic formula
      int radical = b*b - 4*a*c;
      // if the radical is negative, the roots are imaginary
      if (radical < 0) {
        std::cerr << "ERROR: Imaginary roots" << std::endl;
        return false;
      }
      float sqrt_radical = sqrt(radical);
      // compute the two roots
      root_pos = (-b + sqrt_radical) / float(2*a);
      root_neg = (-b - sqrt_radical) / float(2*a);
      return true;
    }

    int main() {
      // We will loop until we are given a polynomial with real roots
      while (true) {
        std::cout << "Enter 3 integer coefficients to a quadratic function: a*x*x + b*x + c = 0" << std::endl;
        int my_a, my_b, my_c;
        std::cin >> my_a >> my_b >> my_c;
        // create a place to store the roots
        float root_1, root_2;
        bool success = find_roots(my_a,my_b,my_c, root_1,root_2);
        // If the polynomial has imaginary roots, skip the rest of this loop and start over
        if (!success) continue;
        std::cout << "The roots are: " << root_1 << " and " << root_2 << std::endl;
        // Check our work...
        if (check_root(my_a,my_b,my_c, root_1) && check_root(my_a,my_b,my_c, root_2)) {
          // Verified roots, break out of the while loop
          break;
        } else {
          std::cerr << "ERROR: Unable to verify one or both roots." << std::endl;
          // if the program has an error, we choose to exit with a
          // non-zero error code
          exit(1);
        }
      }
      // by convention, main should return zero when the program finishes normally
      return 0;
    }

1.4 Some Basic C++ Syntax
• Comments are indicated using // for single line comments and /* and */ for multi-line comments.
• #include asks the compiler for parts of the standard library and other code that we wish to use (e.g. the input/output stream object std::cout).
• int main() is a necessary component of all C++ programs; it returns a value (integer in this case) and it may have parameters.
• { }: the curly braces indicate to C++ to treat everything between them as a unit.

1.5 The C++ Standard Library, a.k.a. “STL”
• The standard library contains types and functions that are important extensions to the core C++ language. We will use the standard library to such a great extent that it will feel like part of the C++ core language. std is a namespace that contains the standard library.
• I/O streams are the first component of the standard library that we see. std::cout (“console output”) and std::endl (“end line”) are defined in the standard library header file, iostream.

1.6 Variables and Types
• A variable is an object with a name. A name is a C++ identifier such as “a”, “root_1”, or “success”.
• An object is computer memory that has a type. A type (e.g., int, float, and bool) is a memory structure and a set of operations.
• For example, float is a type and each float variable is typically assigned 4 bytes of memory, and this memory is formatted according to IEEE floating point standards for what represents the exponent and mantissa. There are many operations defined on floats, including addition, subtraction, printing to the screen, etc.
• In C++ and Java the programmer must specify the data type when a new variable is declared. The C++ compiler enforces type checking (a.k.a. static typing). In contrast, the programmer does not specify the type of variables in Python and Perl. These languages are dynamically-typed: the interpreter will deduce the data type at runtime.
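
To make the idea of a declared type concrete, here is a small sketch. The function names float_size_in_bytes and truncated_to_int are hypothetical helpers invented for this illustration; they are not part of the lecture's sample program.

```cpp
#include <cstddef>  // std::size_t

// Hypothetical helper: how many bytes one float object occupies on this
// platform (commonly 4, but the exact number is platform dependent).
std::size_t float_size_in_bytes() {
  float root_1 = 2.5f;    // the type is stated when the variable is declared
  return sizeof(root_1);  // sizeof reports the size of the object in bytes
}

// Hypothetical helper: the declared type fixes how a value is stored.
// Storing a float in an int variable discards the fractional part.
int truncated_to_int(float value) {
  int whole = static_cast<int>(value);
  return whole;
}
```

Calling truncated_to_int(2.75f) from main returns 2, because an int object has no place to keep the fraction.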

1.7 Expressions, Assignments and Statements
Consider the statement:
    root_pos = (-b + sqrt_radical) / float(2*a);
• The calculation on the right hand side of the = is an expression. You should review the definition of C++ arithmetic expressions and operator precedence from any reference textbook. The rules are pretty much the same in C++ and Java and Python.
• The value of this expression is assigned to the memory location of the float variable root_pos. Note also that if all expression values are type int we need a cast from int to float to prevent the truncation of integer division.
• The float(2*a) expression casts the integer value 2*a to the proper float representation.
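
The effect of the cast can be seen in isolation with a minimal sketch. The two function names below are hypothetical and exist only for this comparison.

```cpp
// Hypothetical helper: both operands are int, so C++ performs integer
// division and the fractional part of the quotient is discarded.
int divide_as_ints(int numerator, int denominator) {
  return numerator / denominator;
}

// Hypothetical helper: casting one operand to float first makes the whole
// division a floating point division, so the fraction is kept.
float divide_with_cast(int numerator, int denominator) {
  return numerator / float(denominator);
}
```

Called from main, divide_as_ints(7, 2) returns 3, while divide_with_cast(7, 2) returns 3.5.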

1.8 Conditionals and IF statements
• The general form of an if-else statement is
    if (conditional-expression)
      statement;
    else
      statement;
• Each statement may be a single statement, such as the cout statement above, a structured statement, or a compound statement delimited by { ... }.
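
As a sketch of both statement forms, the hypothetical function describe_roots below (not part of the lecture's program) uses a single statement in the if branch and a compound statement in the else branch.

```cpp
#include <string>

// Hypothetical helper: describe the roots of a*x*x + b*x + c = 0 based on
// the sign of the quantity under the radical.
std::string describe_roots(int a, int b, int c) {
  int radical = b * b - 4 * a * c;
  std::string description;
  if (radical < 0)
    description = "imaginary";  // a single statement needs no braces
  else {
    // a compound statement: several statements grouped by { }
    description = "real";
    if (radical == 0) description += " (repeated)";
  }
  return description;
}
```

Many programmers choose to always write the braces, even around a single statement, so that adding a second statement later does not silently change the logic.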

1.9 Functions and Arguments
• Functions are used to:
  – Break code up into modules for ease of programming and testing, and for ease of reading by other people (never, ever, under-estimate the importance of this!).
  – Create code that is reusable at several places in one program and by several programs.
• Each function has a sequence of parameters and a return type. The function prototype below has a return type of bool and five parameters.
    bool find_roots(int a, int b, int c, float &root_pos, float &root_neg);
• The order and types of the arguments in the calling function (the main function in this example) must match the order and types of the parameters in the function prototype.

1.10 Value Parameters and Reference Parameters
• What’s with the & symbol on the 4th and 5th parameters in the find_roots function prototype?
• Note that when we call this function, we haven’t yet stored anything in those two root variables.
    float root_1, root_2;
    bool success = find_roots(my_a,my_b,my_c, root_1,root_2);
• The first three parameters to this function are value parameters.
  – These are essentially local variables (in the function) whose initial values are copies of the values of the corresponding argument in the function call.
  – Thus, the value of my_a from the main function is used to initialize a in function find_roots.
  – Changes to value parameters within the called function do NOT change the corresponding argument in the calling function.
• The final two parameters are reference parameters, as indicated by the &.
  – Reference parameters are just aliases for their corresponding arguments. No new objects are created.
  – As a result, changes to reference parameters are changes to the corresponding variables (arguments) in the calling function.
• In general, the “Rules of Thumb” for using value and reference parameters:
  – When a function (e.g., check_root) needs to provide just one simple result, make that result the return value of the function and pass other parameters by value.
  – When a function needs to provide more than one result (e.g., find_roots), these results should be returned using multiple reference parameters.
• We’ll see more examples of the importance of value vs. reference parameters as the semester continues.
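
The difference between the two kinds of parameters can be sketched with three small functions. The names increment_copy, increment_in_place, and min_and_max are hypothetical and were chosen only for this illustration.

```cpp
// Value parameter: n is a local copy, so the caller's variable is unchanged.
void increment_copy(int n) {
  n = n + 1;
}

// Reference parameter: n is an alias for the caller's variable, so the
// change is visible in the calling function.
void increment_in_place(int &n) {
  n = n + 1;
}

// Following the rule of thumb above: two results are needed, so they are
// returned through two reference parameters.
void min_and_max(int a, int b, int &smaller, int &larger) {
  if (a < b) {
    smaller = a;
    larger = b;
  } else {
    smaller = b;
    larger = a;
  }
}
```

If main declares int x = 5; then increment_copy(x) leaves x at 5, while increment_in_place(x) changes x to 6.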

1.11 for & while Loops
• Here is the basic form of a for loop:
    for (expr1; expr2; expr3)
      statement;
  – expr1 is the initial expression executed at the start before the loop iterations begin;
  – expr2 is the test applied before the beginning of each loop iteration, the loop ends when this expression evaluates to false or 0;
  – expr3 is evaluated at the very end of each iteration;
  – statement is the “loop body”.
• Here is the basic form of a while loop:
    while (expr)
      statement;
  expr is checked before entering the loop and after each iteration. If expr ever evaluates to false the loop is finished.
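
The two loop forms can express the same computation. As a minimal sketch, the hypothetical functions sum_with_for and sum_with_while below both add up the integers 1 through n.

```cpp
// Sum 1 + 2 + ... + n with a for loop.
int sum_with_for(int n) {
  int total = 0;
  // expr1: int i = 1   expr2: i <= n   expr3: ++i
  for (int i = 1; i <= n; ++i) {
    total += i;  // the loop body
  }
  return total;
}

// The same sum with a while loop: the initialization moves before the loop
// and the increment moves to the end of the loop body.
int sum_with_while(int n) {
  int total = 0;
  int i = 1;
  while (i <= n) {
    total += i;
    ++i;
  }
  return total;
}
```

Both functions return 10 for n = 4, and both return 0 for n = 0 because the test fails before the first iteration.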

1.12 C-style Arrays
• An array is a fixed-length, consecutive sequence of objects all of the same type. The following declares an array with space for 15 double values. Note the spots in the array are currently uninitialized.
    double a[15];
• The values are accessed through subscripting operations. The following code assigns the value 3.14159 to location i=5 of the array. Here i is the subscript or index.
    int i = 5;
    a[i] = 3.14159;
• In C/C++, array indexing starts at 0.
• Arrays are fixed size, and each array knows NOTHING about its own size. The programmer must keep track of the size of each array. (Note: C++ STL has a generalization of C-style arrays, called vectors, which do not have these restrictions. More on this in Lecture 2!)
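
Because an array does not know its own size, a function that works on an array is usually given the size as a separate argument. The sketch below assumes that convention; fill_with_squares and sum_array are hypothetical names, and passing an array to a function is a detail this lecture has not covered yet.

```cpp
// Hypothetical helper: store i*i in slot i for every index 0 .. size-1.
// The caller must pass the correct size; the array cannot report it.
void fill_with_squares(double a[], int size) {
  for (int i = 0; i < size; ++i) {
    a[i] = i * i;
  }
}

// Hypothetical helper: add up the first `size` values of the array.
double sum_array(const double a[], int size) {
  double total = 0.0;
  for (int i = 0; i < size; ++i) {
    total += a[i];
  }
  return total;
}
```

With double a[15]; declared in main, the call fill_with_squares(a, 15) initializes every slot, after which a[5] holds 25. Passing a size larger than 15 would write past the end of the array, and the compiler will not stop you.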

1.13 Python Strings vs. C chars vs. C-style Strings vs. C++ STL Strings
• Strings in Python are immutable, and there is no difference between a string and a char in Python. Thus, 'a' and "a" are both strings in Python, not individual characters. In C++ & Java, single quotes create a character type (exactly one character) and double quotes create a string of 0, 1, 2, or more characters.
• A “C-style” string is an array of chars that ends with the special char '\0'. C-style strings (char* or char[]) can be edited, and there are a number of helper functions to help with common operations. However...
• The “C++-style” STL string type has a wider array of operations and functions, which are more convenient and more powerful.
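
A short sketch of the three C++ kinds of text follows. The helper names c_string_length and append_char are hypothetical; std::strlen is one of the standard helper functions for C-style strings.

```cpp
#include <cstring>  // std::strlen, a helper function for C-style strings
#include <string>   // std::string, the STL string type

// Hypothetical helper: length of a C-style string, found by counting chars
// up to (but not including) the terminating '\0'.
std::size_t c_string_length(const char *text) {
  return std::strlen(text);
}

// Hypothetical helper: build an STL string from a C-style string, then
// append one char (single quotes) to it.
std::string append_char(const char *text, char extra) {
  std::string result(text);  // copies the chars up to the '\0'
  result += extra;           // the STL string grows as needed
  return result;
}
```

For example, c_string_length("abc") returns 3 even though the array holding "abc" has 4 chars, and append_char("ab", 'c') returns the STL string "abc".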

1.14 About STL String Objects
• A string is an object type defined in the standard library to contain a sequence of characters.
• The string type, like all types (including int, double, char, float), defines an interface, which includes construction (initialization), operations, functions (methods), and even other types(!).
• When an object is created, a special function is run called a “constructor”, whose job it is to initialize the object. There are several ways of constructing string objects:
  – By default to create an empty string:
      std::string my_string_var;
  – With a specified number of instances of a single char:
      std::string my_string_var2(10, ' ');
  – From another string:
      std::string my_string_var3(my_string_var2);
• The notation my_string_var.size() is a call to a function size that is defined as a member function of the string class. There is an equivalent member function called length.
• Input to string objects through streams (e.g. reading from the keyboard or a file) includes the following steps:
  1. The computer inputs and discards white-space characters, one at a time, until a non-white-space character is found.
  2. A sequence of non-white-space characters is input and stored in the string. This overwrites anything that was already in the string.
  3. Reading stops either at the end of the input or upon reaching the next white-space character (without reading it in).
• The (overloaded) operator '+' is defined on strings. It concatenates two strings to create a third string, without changing either of the original two strings.
• The assignment operation '=' on strings overwrites the current contents of the string.
• The individual characters of a string can be accessed using the subscript operator [] (similar to arrays).
  – Subscript 0 corresponds to the first character.
  – For example, given std::string a = "Susan"; then a[0] == 'S' and a[1] == 'u' and a[4] == 'n'.
• Strings define a special type string::size_type, which is the type returned by the string function size() (and length()).
  – The :: notation means that size_type is defined within the scope of the string type.
  – string::size_type is an unsigned integer type (generally equivalent to unsigned int or unsigned long, depending on the platform).
  – You may see compiler warnings and potential compatibility problems if you compare an int variable to a.size().
This seems like a lot to remember. Do I need to memorize this? Where can I find all the details on string objects?

1.15 Problem: Writing a Name Along a Diagonal
• Let’s study a simple program to read in a name using std::cin and then output a fancier version to std::cout, written along a diagonal inside a box of asterisks. Here’s how the program should behave:
    What is your first name? Bob

    *******
    *     *
    * B   *
    *  o  *
    *   b *
    *     *
    *******
• There are two main difficulties:
  – Making sure that we can put the characters in the right places on the right lines.
  – Getting the asterisks in the right positions and getting the right number of blanks on each line.

    #include <iostream>
    #include <string>

    int main() {
      std::cout << "What is your first name? ";
      std::string first;
      std::cin >> first;

      const std::string star_line(first.size()+4, '*');
      std::string middle_line = "*" + std::string(first.size()+2,' ') + "*";

      std::cout << '\n' << star_line << '\n' << middle_line << std::endl;

      // Output the interior of the greeting, one line at a time.
      for (unsigned int i = 0; i < first.size(); ++i ) {
        // Create the output line by overwriting a single character from the
        // first name in location i+2. After printing it restore the blank.
        middle_line[ i+2 ] = first[i];
        std::cout << middle_line << '\n';
        middle_line[ i+2 ] = ' ';
      }

      std::cout << middle_line << '\n' << star_line << std::endl;
      return 0;
    }

CSCI-1200 Data Structures — Spring 2017
Collaboration Policy & Academic Integrity

iClicker Lecture exercises
Responses to iClicker lecture exercises will be used to earn incentives for the Data Structures course. Discussion of collaborative iClicker lecture exercises with those seated around you is encouraged. However, if we find anyone using an iClicker that is registered to another individual or using more than one iClicker, we will confiscate all iClickers involved and report the incident to the Dean of Students.

Academic Integrity for Exams
All exams for this course will be completed individually. Copying, communicating, or using disallowed materials during an exam is cheating, of course. Students caught cheating on an exam will receive an F in the course and will be reported to the Dean of Students for further disciplinary action.

Collaboration Policy for Programming Labs
Collaboration is encouraged during the weekly programming labs. Students are allowed to talk through and assist each other with these programming exercises. Students may ask for help from each other, the graduate lab TA, and undergraduate programming mentors. But each student must write up and debug their own lab solutions on their own laptop and be prepared to present and discuss this work with the TA to receive credit for each checkpoint.
As a general guideline, students may look over each other’s shoulders at their labmate’s laptop screen during lab; this is the best way to learn about IDEs, code development strategies, testing, and debugging. However, looking should not lead to line-by-line copying. Furthermore, each student should retain control of their own keyboard. While being assisted by a classmate or a TA, the student should remain fully engaged on problem solving and ask plenty of questions. Finally, other than the specific files provided by the instructor, electronic files or file excerpts should not be shared or copied (by email, text, Dropbox, or any other means).

Homework Collaboration Policy
Academic integrity is a complicated issue for individual programming assignments, but one we take very seriously. Students naturally want to work together, and it is clear they learn a great deal by doing so. Getting help is often the best way to interpret error messages and find bugs, even for experienced programmers. Furthermore, in-depth discussions about problem solving, algorithms, and code efficiency are invaluable and make us all better software engineers. In response to this, the following rules will be enforced for programming assignments:
• Students may read through the homework assignment together and discuss what is asked by the assignment, examples of program input & expected output, the overall approach to tackling the assignment, possible high level algorithms to solve the problem, and recent concepts from lecture that might be helpful in the implementation.
• Students are not allowed to work together in writing code or pseudocode. Detailed algorithms and implementation must be done individually. Students may not discuss homework code in detail (line-by-line or loop-by-loop) while it is being written or afterwards. In general, students should not look at each other’s computer screen (or hand-written or printed assignment design notes) while working on homework. As a guideline, if an algorithm is too complex to describe orally (without dictating line-by-line), then sharing that algorithm is disallowed by the homework collaboration policy.
• Students are allowed to ask each other for help in interpreting error messages and in discussing strategies for testing and finding bugs. First, ask for help orally, by describing the symptoms of the problem. For each homework, many students will run into similar problems and after hearing a general description of a problem, another student might have suggestions for what to try to further diagnose or fix the issue. If that doesn’t work, and if the compiler error message or flawed output is particularly lengthy, it is okay to ask another student to briefly look at the computer screen to see the details of the error message and the corresponding line of code. Please see a TA during office hours if a more in-depth examination of the code is necessary.
• Students may not share or copy code or pseudocode. Homework files or file excerpts should never be shared electronically (by email, text, LMS, Dropbox, etc.). Homework solution files from previous years (either instructor or student solutions) should not be used in any way. Students must not leave their code (either electronic or printed) in publicly-accessible areas. Students may not share computers in any way when there is an assignment pending. Each student is responsible for securing their homework materials using all reasonable precautions. These precautions include: Students should password lock the screen when they step away from their computer. Homework files should only be stored on private accounts/computers with strong passwords. Homework notes and printouts should be stored in a locked drawer/room.
• Students may not show their code or pseudocode to other students as a means of helping them. Well-meaning homework help or tutoring can turn into a violation of the homework collaboration policy when stressed with time constraints from other courses and responsibilities. Sometimes good students who feel sorry for struggling students are tempted to provide them with “just a peek” at their code. Such “peeks” often turn into extensive copying, despite prior claims of good intentions.
• Students may not receive detailed help on their assignment code or pseudocode from individuals outside the course. This restriction includes tutors, students from prior terms, friends and family members, internet resources, etc.
• All collaborators (classmates, TAs, ALAC tutors, upperclassmen, students/instructor via LMS, etc.), and all of the resources (books, online reference material, etc.) consulted in completing this assignment must be listed in the README.txt file submitted with the assignment.
These rules are in place for each homework assignment and extend two days after the submission deadline.

Homework Plagiarism Detection and Academic Dishonesty Penalty
We use an automatic code comparison tool to help spot homework assignments that have been submitted in violation of these rules. The tool takes all assignments from all sections and all prior terms and compares them, highlighting regions of the code that are similar. The plagiarism tool looks at core code structure and is not fooled by variable and function name changes or addition of comments and whitespace.
The instructor checks flagged pairs of assignments very carefully, to determine which students may have violated the rules of collaboration and academic integrity on programming assignments. When it is believed that an incident of academic dishonesty has occurred, the involved students are contacted and a meeting is scheduled. All students caught cheating on a programming assignment (both the copier and the provider) will be punished. For undergraduate students, the standard punishment for the first offense is a 0 on the assignment and a full letter grade reduction on the final semester grade. Students whose violations are more flagrant will receive a higher penalty. Undergraduate students caught a second time will receive an immediate F in the course, regardless of circumstances. Each incident will be reported to the Dean of Students.
Graduate students found to be in violation of the academic integrity policy for homework assignments on the first offense will receive an F in the course and will be reported both to the Dean of Students and to the chair of their home department with the strong advisement that they be ineligible to serve as a teaching assistant for any course at RPI.

Academic Dishonesty in the Student Handbook
Refer to The Rensselaer Handbook of Student Rights and Responsibilities for further discussion of academic dishonesty. Note that: “Students found in violation of the academic dishonesty policy are prohibited from dropping the course in order to avoid the academic penalty.”

Number of Students Found in Violation of the Policy
Historically, 5-10% of students are found to be in violation of the academic dishonesty policy each semester. Many of these students immediately admit to falling behind with the coursework and violating one or more of the rules above and if it is a minor first-time offense may receive a reduced penalty.

Read this document in its entirety. If you have any questions, contact the instructor or the TAs immediately. Sign this form and give it to your TA during your first lab section.

Name:
Section #:
Signature:
Date:
CSCI-1200 Data Structures — Spring 2017
Lecture 2 — STL Strings & Vectors

Announcements
• HW 1 will be available on-line this afternoon through the website (on the “Calendar”).
• Be sure to read through this information as you start implementation of HW1: “Misc Programming Information” (a link at the bottom of the left bar of the website).
• TA & instructor office hours are posted on the website (“Weekly Schedule”).
• If you have not resolved issues with the C++ environment on your laptop, please do so immediately.
• If you cannot access Piazza or the homework submission server, please email the instructor ASAP with your RCS ID and section number.
• Because many students were dealing with lengthy compiler/editor installation, registration confusion, etc., we will allow (for the first lab only!) students to get checked off for any remaining Lab 1 checkpoints at the beginning of next week’s Lab 2 or in your grad TA’s normal office hours.
Today
• STL Strings, char arrays (C-style Strings), & converting between these two types
• L-values vs. R-values
• STL Vectors as “smart arrays”
2.1 String Concatenation and Creation of Temporary String Object
• The following statement creates a new string by “adding” (concatenating) other strings together:
    std::string my_line = "*" + std::string(first.size()+2,' ') + "*";
• The expression std::string(first.size()+2, ' ') within this statement creates a temporary STL string but does not associate it with a variable.

2.2 Character Arrays and String Literals
• In the line below "Hello!" is a string literal and it is also an array of characters (with no associated variable name).
    cout << "Hello!" << endl;
• A char array can be initialized as:
    char h[] = {'H', 'e', 'l', 'l', 'o', '!', '\0'};
  or as:
    char h[] = "Hello!";
  In either case, array h has 7 characters, the last one being the null character.
• The C language provides many functions for manipulating these “C-style strings”. We don’t study them much anymore because the “C++ style” STL string library is much more logical and easier to use. If you want to find out more about functions for C-style strings look at the cstdlib library http://www.cplusplus.com/reference/cstdlib/ (functions such as strlen and strcpy are declared in the related cstring header).
• One place we do use them is in file names and command-line arguments, which you will use in Homework 1.

2.3 Conversion Between Standard Strings and C-Style String Literals
• We regularly convert/cast between C-style & C++-style (STL) strings. For example:
    std::string s1( "Hello!" );
    std::string s2( h );
  where h is as defined above.
• You can obtain the C-style string from a standard string using the member function c_str, as in s1.c_str().
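The two directions of conversion can be sketched as a pair of small helper functions. The names from_c_string and c_length are made up for this illustration and are not part of the STL; only std::string, c_str(), and std::strlen are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>   // std::strlen, NULL
#include <string>

// C-style -> STL: build a std::string from a null-terminated char array.
// (from_c_string is a hypothetical helper name.)
std::string from_c_string(const char* c_str) {
  assert(c_str != NULL);   // a null pointer is not a valid C-style string
  return std::string(c_str);
}

// STL -> C-style: c_str() returns a pointer to a null-terminated array
// owned by s, which C functions such as strlen can read.
// (c_length is a hypothetical helper name.)
std::size_t c_length(const std::string& s) {
  return std::strlen(s.c_str());
}
```

Note that the pointer returned by c_str() is only safe to use while the string it came from still exists and has not been modified.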
2.4 L-Values and R-Values
• Consider the simple code below. String a becomes "Tim". No big deal, right? Wrong!
    std::string a = "Kim";
    std::string b = "Tom";
    a[0] = b[0];
• Let’s look closely at the line:
    a[0] = b[0];
  and think about what happens. In particular, what is the difference between the use of a[0] on the left hand side of the assignment statement and b[0] on the right hand side?
• Syntactically, they look the same. But,
  – The expression b[0] gets the char value, 'T', from string location 0 in b. This is an r-value.
  – The expression a[0] gets a reference to the memory location associated with string location 0 in a. This is an l-value.
  – The assignment operator stores the value in the referenced memory location.
  The difference between an r-value and an l-value will be especially significant when we get to writing our own operators later in the semester.
• What’s wrong with this code?
    std::string foo = "hello";
    foo[2] = 'X';
    cout << foo;
    'X' = foo[3];
    cout << foo;
  Your C++ compiler will complain with something like: “non-lvalue in assignment”
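To make the two roles visible, the assignment a[0] = b[0] can be split into separate steps. This is only a sketch; copy_first_char is a hypothetical function name chosen for the illustration.

```cpp
#include <cassert>
#include <string>

// Performs a[0] = b[0] one step at a time.
// (copy_first_char is a hypothetical helper name.)
void copy_first_char(std::string& a, const std::string& b) {
  assert(!a.empty() && !b.empty());  // both strings need a location 0
  char value = b[0];   // r-value use: read the char stored in b
  char& slot = a[0];   // l-value use: a reference to storage inside a
  slot = value;        // assignment stores the value at that location
}
```

Starting from a = "Kim" and b = "Tom", calling copy_first_char(a, b) leaves b unchanged and turns a into "Tim", exactly as the one-line version does.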
2.5 Standard Template Library (STL) Vectors: Motivation
• Example Problem: Read an unknown number of grades and compute some basic statistics such as the mean (average), standard deviation, median (middle value), and mode (most frequently occurring value).
• Our solution to this problem will be much more elegant, robust, & less error-prone if we use the STL vector class. Why would it be more difficult/wasteful/buggy to try to write this using C-style (dumb) arrays?

2.6 STL Vectors: a.k.a. “C++-Style”, “Smart” Arrays
• Standard library “container class” to hold sequences.
• A vector acts like a dynamically-sized, one-dimensional array.
• Capabilities:
  – Holds objects of any type
  – Starts empty unless otherwise specified
  – Any number of objects may be added to the end; there is no limit on size.
  – It can be treated like an ordinary array using the subscripting operator.
  – A vector knows how many elements it stores! (unlike C arrays)
  – There is NO automatic checking of subscript bounds.
• Here’s how we create an empty vector of integers:
    std::vector<int> scores;
• Vectors are an example of a templated container class. The angle brackets < > are used to specify the type of object (the “template type”) that will be stored in the vector.
• push_back is a vector function to append a value to the end of the vector, increasing its size by one. This is an O(1) operation (on average).
  – There is NO corresponding push_front operation for vectors.
• size is a function defined by the vector type (the vector class) that returns the number of items stored in the vector.
• After vectors are initialized and filled in, they may be treated just like arrays.
  – In the line
      sum += scores[i];
    scores[i] is an “r-value”, accessing the value stored at location i of the vector.
  – We could also write statements like
      scores[4] = 100;
    to change a score. Here scores[4] is an “l-value”, providing the means of storing 100 at location 4 of the vector.
  – It is the job of the programmer to ensure that any subscript value i that is used is legal: at least 0 and strictly less than scores.size().
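The operations listed above fit together as in the following sketch. The function names make_scores and checked_score are hypothetical, chosen only for this illustration; the vector operations themselves (push_back, size, operator[]) are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a small vector with push_back, then overwrite one entry.
// (make_scores is a hypothetical helper name.)
std::vector<int> make_scores() {
  std::vector<int> scores;   // starts empty
  scores.push_back(90);
  scores.push_back(75);
  scores.push_back(60);      // size() is now 3
  scores[1] = 100;           // l-value use: store into location 1
  return scores;
}

// Read one score, checking the subscript ourselves because
// operator[] does no bounds checking.
// (checked_score is a hypothetical helper name.)
int checked_score(const std::vector<int>& scores, std::size_t i) {
  assert(i < scores.size());  // legal subscripts are 0 .. size()-1
  return scores[i];           // r-value use: read the stored value
}
```

The standard vector class also offers the member function at(), which performs this check itself and throws std::out_of_range on an illegal subscript.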
2.7 Initializing a Vector: The Use of Constructors
Here are several different ways to initialize a vector:
• This “constructs” an empty vector of integers. Values must be placed in the vector using push_back.
    std::vector<int> a;
• This constructs a vector of 100 doubles, each entry storing the value 3.14. New entries can be created using push_back, but these will create entries 100, 101, 102, etc.
    int n = 100;
    std::vector<double> b( 100, 3.14 );
• This constructs a vector of 10,000 ints, but provides no explicit initial values for these integers (the vector initializes each entry to 0). Again, new entries can be created for the vector using push_back. These will create entries 10000, 10001, etc.
    std::vector<int> c( n*n );
• This constructs a vector that is an exact copy of vector b.
    std::vector<double> d( b );
• This is a compiler error because no constructor exists to create an int vector from a double vector. These are different types.
    std::vector<int> e( b );
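As a rough illustration of the copy constructor, and of one way around the int-from-double error, consider the sketch below. The helper names copy_and_grow and to_ints are hypothetical. The second function uses the iterator-range constructor, which is a different constructor from the ones listed above; it converts each element individually, truncating each double to an int.

```cpp
#include <cassert>
#include <vector>

// Copy-construct a vector, then append to the copy only.
// (copy_and_grow is a hypothetical helper name.)
std::vector<double> copy_and_grow(const std::vector<double>& original, double extra) {
  std::vector<double> d(original);   // exact copy of original
  d.push_back(extra);                // new entry lands at index original.size()
  assert(d.size() == original.size() + 1);
  return d;
}

// std::vector<int> e(values) would not compile, but the iterator-range
// constructor converts element by element (fractional parts are dropped).
// (to_ints is a hypothetical helper name.)
std::vector<int> to_ints(const std::vector<double>& values) {
  std::vector<int> result(values.begin(), values.end());
  assert(result.size() == values.size());
  return result;
}
```

Because copy_and_grow works on its own copy, the vector passed in keeps its original size.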
2.8 Exercises
1. After the above code constructing the three vectors, what will be output by the following statement?
    cout << a.size() << endl << b.size() << endl << c.size() << endl;
2. Write code to construct a vector containing 100 doubles, each having the value 55.5.
3. Write code to construct a vector containing 1000 doubles, containing the values 0, 1, √2, √3, √4, √5, etc. Write it two ways, one that uses push_back and one that does not use push_back.

2.9 Example: Using Vectors to Compute Standard Deviation
Definition: If $a_0, a_1, a_2, \ldots, a_{n-1}$ is a sequence of $n$ values, and $\mu$ is the average of these values, then the standard deviation is

\[ \left[ \frac{\sum_{i=0}^{n-1} (a_i - \mu)^2}{n-1} \right]^{1/2} \]
    // Compute the average and standard deviation of an input set of grades.
    #include <fstream>
    #include <iomanip>
    #include <iostream>
    #include <vector>   // to access the STL vector class
    #include <cmath>    // to use standard math library and sqrt

    int main(int argc, char* argv[]) {
      if (argc != 2) {
        std::cerr << "Usage: " << argv[0] << " grades-file\n";
        return 1;
      }
      std::ifstream grades_str(argv[1]);
      if (!grades_str.good()) {
        std::cerr << "Can not open the grades file " << argv[1] << "\n";
        return 1;
      }
      std::vector<int> scores;  // Vector to hold the input scores; initially empty.
      int x;                    // Input variable

      // Read the scores, appending each to the end of the vector
      while (grades_str >> x) {
        scores.push_back(x);
      }

      // Quit with an error message if too few scores.
      if (scores.size() == 0) {
        std::cout << "No scores entered. Please try again!" << std::endl;
        return 1;  // program exits with error code = 1
      }

      // Compute and output the average value.
      int sum = 0;
      for (unsigned int i = 0; i < scores.size(); ++i) {
        sum += scores[i];
      }
      double average = double(sum) / scores.size();
      std::cout << "The average of " << scores.size() << " grades is "
                << std::setprecision(3) << average << std::endl;

      // Exercise: compute and output the standard deviation.
      // (Same computation as compute_avg_and_std_dev in Section 2.11;
      // the wording of the output message is an assumption.)
      double sum_sq_diff = 0.0;
      for (unsigned int i=0; i<scores.size(); ++i) {
        sum_sq_diff += (scores[i]-average) * (scores[i]-average);
      }
      double std_dev = sqrt(sum_sq_diff / (scores.size()-1));
      std::cout << "The standard deviation of " << scores.size() << " grades is "
                << std::setprecision(3) << std_dev << std::endl;
      return 0;  // everything ok
    }
2.10 Standard Library Sort Function
• The standard library has a series of algorithms built to apply to container classes.
• The prototypes for these algorithms (actually the functions implementing these algorithms) are in header file algorithm.
• One of the most important of the algorithms is sort.
• It is accessed by providing the beginning and end of the container’s interval to sort.
• As an example, the following code reads, sorts and outputs a vector of doubles:
    double x;
    std::vector<double> a;
    while (std::cin >> x)
      a.push_back(x);
    std::sort(a.begin(), a.end());
    for (unsigned int i=0; i < a.size(); ++i)
      std::cout << a[i] << '\n';
• a.begin() is an iterator referencing the first location in the vector, while a.end() is an iterator referencing one past the last location in the vector.
  – We will learn much more about iterators in the next few weeks.
  – Every container has iterators: strings have begin() and end() iterators defined on them.
• The ordering of values by std::sort is least to greatest (technically, non-decreasing). We will see ways to change this.
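Since strings also provide begin() and end(), the same std::sort call works on both containers. A minimal sketch follows; sorted_copy and sorted_letters are hypothetical helper names chosen for this illustration.

```cpp
#include <algorithm>  // std::sort
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns a sorted copy; the caller's vector is left untouched.
// (sorted_copy is a hypothetical helper name.)
std::vector<double> sorted_copy(const std::vector<double>& values) {
  std::vector<double> result(values);
  std::sort(result.begin(), result.end());   // non-decreasing order
  for (std::size_t i = 1; i < result.size(); ++i)
    assert(result[i-1] <= result[i]);        // check the ordering claim
  return result;
}

// The parameter is passed by value on purpose, so we sort a private copy
// of the characters. (sorted_letters is a hypothetical helper name.)
std::string sorted_letters(std::string word) {
  std::sort(word.begin(), word.end());
  return word;
}
```

For example, sorted_letters("hello") gives "ehllo", with the two equal characters next to each other as the non-decreasing order requires.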
2.11 Example: Computing the Median
The median value of a sequence is less than half of the values in the sequence, and greater than half of the values in the sequence. If $a_0, a_1, a_2, \ldots, a_{n-1}$ is a sequence of $n$ values AND if the sequence is sorted such that $a_0 \le a_1 \le a_2 \le \cdots \le a_{n-1}$ then the median is

\[
\begin{cases}
a_{(n-1)/2} & \text{if } n \text{ is odd} \\[4pt]
\dfrac{a_{n/2-1} + a_{n/2}}{2} & \text{if } n \text{ is even}
\end{cases}
\]
    // Compute the median value of an input set of grades.
    #include <algorithm>
    #include <cmath>
    #include <fstream>
    #include <iomanip>
    #include <iostream>
    #include <vector>

    void read_scores(std::vector<int> & scores, std::ifstream & grade_str) {
      // scores can be changed in this function
      int x;  // input variable
      while (grade_str >> x) {
        scores.push_back(x);
      }
    }

    void compute_avg_and_std_dev(const std::vector<int>& s, double & avg, double & std_dev) {
      // s cannot be changed in this function
      // Compute the average value.
      int sum=0;
      for (unsigned int i = 0; i < s.size(); ++i) {
        sum += s[i];
      }
      avg = double(sum) / s.size();
      // Compute the standard deviation
      double sum_sq = 0.0;
      for (unsigned int i=0; i < s.size(); ++i) {
        sum_sq += (s[i]-avg) * (s[i]-avg);
      }
      std_dev = sqrt(sum_sq / (s.size()-1));
    }

    double compute_median(const std::vector<int> & scores) {
      // Create a copy of the vector
      std::vector<int> scores_to_sort(scores);
      // Sort the values in the vector. By default this is increasing order.
      std::sort(scores_to_sort.begin(), scores_to_sort.end());
      // Now, compute the median.
      unsigned int n = scores_to_sort.size();
      if (n%2 == 0)  // even number of scores
        return double(scores_to_sort[n/2] + scores_to_sort[n/2-1]) / 2.0;
      else
        return double(scores_to_sort[ n/2 ]);  // same as (n-1)/2 because n is odd
    }

    int main(int argc, char* argv[]) {
      if (argc != 2) {
        std::cerr << "Usage: " << argv[0] << " grades-file\n";
        return 1;
      }
      std::ifstream grades_str(argv[1]);
      if (!grades_str) {
        std::cerr << "Can not open the grades file " << argv[1] << "\n";
        return 1;
      }
      std::vector<int> scores;  // Vector to hold the input scores; initially empty.
      read_scores(scores, grades_str);  // Read the scores, as before

      // Quit with an error message if too few scores.
      if (scores.size() == 0) {
        std::cout << "No scores entered. Please try again!" << std::endl;
        return 1;
      }

      // Compute the average, standard deviation and median
      double average, std_dev;
      compute_avg_and_std_dev(scores, average, std_dev);
      double median = compute_median(scores);

      // Output
      std::cout << "Among " << scores.size() << " grades: \n"
                << "  average = " << std::setprecision(3) << average << '\n'
                << "  std_dev = " << std_dev << '\n'
                << "   median = " << median << std::endl;
      return 0;
    }
2.12 Passing Vectors (and Strings) As Parameters
The following outlines rules for passing vectors as parameters. The same rules apply to passing strings.
• If you are passing a vector as a parameter to a function and you want to make a (permanent) change to the vector, then you should pass it by reference.
  – This is illustrated by the function read_scores in the program median_grade.
  – This is very different from the behavior of arrays as parameters.
• What if you don’t want to make changes to the vector or don’t want these changes to be permanent?
  – The answer we’ve learned so far is to pass by value.
  – The problem is that the entire vector is copied when this happens! Depending on the size of the vector, this can be a considerable waste of memory.
• The solution is to pass by constant reference: pass it by reference, but make it a constant so that it can not be changed.
  – This is illustrated by the functions compute_avg_and_std_dev and compute_median in the program median_grade.
• As a general rule, you should not pass a container object, such as a vector or a string, by value because of the cost of copying.
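The three passing styles can be compared side by side. The function names below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <vector>

// By value: v is a copy, so the caller's vector is unchanged
// (and the whole vector was copied just to make the call).
void append_by_value(std::vector<int> v, int x) {
  v.push_back(x);
}

// By reference: the change is permanent for the caller.
void append_by_reference(std::vector<int>& v, int x) {
  v.push_back(x);
}

// By constant reference: no copy is made, and the compiler rejects
// any attempt to modify v inside the function.
int last_element(const std::vector<int>& v) {
  assert(!v.empty());
  return v[v.size() - 1];
}
```

Calling append_by_value on a vector of size 1 leaves its size at 1, while append_by_reference grows it to 2; last_element can then read the new value without copying anything.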
CSCI-1200 Data Structures — Spring 2017
Lecture 3 — Classes I

Announcements
• The Submitty team is working on an iClicker solution (we will put an announcement out on Piazza when it’s ready). This will let you register through Submitty instead of the iClicker site.
• Questions about Homework 1?
Today’s Lecture
• Classes in C++
  – Types and defining new types
• A Date class.
• Class declaration: member variables and member functions
• Using the class member functions
• Class scope
• Member function implementation
• Classes vs. structs
• Designing classes
Homework 1 Hints
• This section isn’t in the printed lecture notes, but it is online.
• There are three major tasks in this assignment
  – Reading in the layout and commands
  – Managing the seats in a data structure
  – Managing the upgrade list (not to be confused with an STL list which we haven’t yet covered)
• One of the problems is that many people naturally want to use erase(), but we haven’t covered it
• More importantly, we haven’t really discussed iterators, and they’re very important to functions like erase()
• So how can we handle removing from a vector?
  – To empty out a vector, we can use clear().
  – To remove the last value of a vector, we can use pop_back()
  – We could also remove an element by making a second vector that looks right, and then use an assignment =.
  – Let’s look at a small program that exercises some of these concepts.
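The “second vector plus assignment” idea can be sketched as follows. This is one possible approach and is not taken from the homework; remove_at is a hypothetical function name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remove the element at position index without erase(): copy every other
// element into a second vector, then assign it back.
// (remove_at is a hypothetical helper name.)
void remove_at(std::vector<int>& values, std::size_t index) {
  assert(index < values.size());   // the position must exist
  std::vector<int> kept;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i != index) {
      kept.push_back(values[i]);
    }
  }
  values = kept;                   // assignment replaces the old contents
}
```

With values holding 5, 4, 3, the call remove_at(values, 1) leaves 5, 3. This copies the remaining elements, so the cost grows with the size of the vector.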
3.1 More Vector Sample Code

    #include <iostream>
    #include <vector>

    void printVector(const std::vector<int>& vec, std::ostream& out){
      // Assumed output format: elements separated by spaces, one vector per line.
      for(std::size_t i=0; i<vec.size(); i++){
        out << vec[i] << " ";
      }
      out << std::endl;
    }

    int main(){
      std::vector<int> a;
      std::vector<int> b;
      a.push_back(5);
      a.push_back(4);
      a.push_back(3);
      printVector(a, std::cout);
      printVector(b, std::cout);
      b = a;
      printVector(b, std::cout);
      b.pop_back();
      printVector(a, std::cout);
      a.clear();
      printVector(b, std::cout);
      printVector(a, std::cout);
      return 0;
    }
3.2 Exercise
What will be the output of the “More Vector Sample Code” program above?

3.3 Types and Defining New Types
• What is a type? It is a structuring of memory plus a set of operations (functions) that can be applied to that structured memory.
  – Every C++ object has a type
  – The type tells us what the data means and what operations can be performed on the data
• Examples: integers, doubles, strings, and vectors.
• In many cases, when we are using a class we don’t know how that memory is structured. Instead, what we really think about is the set of operations (functions) that can be applied.
• The basic ideas behind classes are data abstraction and encapsulation
  – Data abstraction hides details that don’t matter from a certain point of view and identifies details that do matter.
  – The user sees only the interface to the object
  – The interface is the collection of data and operations that users of a class can access
    ∗ For an int, you can access the value, perform addition etc.
    ∗ For strings, concatenate, access characters etc.
• Encapsulation is the packing of data and functions into a single component.
• Information hiding
  – Users have access to interface, but not implementation
  – No data item should be available any more than absolutely necessary
• To clarify, let’s focus on strings and vectors. These are classes. We’ll outline what we know about them:
  – The structure of memory within each class object
  – The set of operations defined
• We are now ready to start defining our own new types using classes.
3.4 Example: A Date Class
• Many programs require information about dates.
• Information stored about the date includes the month, the day and the year.
• Operations on the date include recording it, printing it, asking if two dates are equal, flipping over to the next day (incrementing), etc.

3.5 C++ Classes
• A C++ class consists of
  – a collection of member variables, usually private, and
  – a collection of member functions, usually public, which operate on these variables.
• public member functions can be accessed directly from outside the class,
• private member functions and member variables can only be accessed indirectly from outside the class, through public member functions.
• We will look at the example of the Date class declaration.
3.6 Using C++ classes
• We have been using C++ classes (from the standard library) already this semester, so studying how the Date class is used is straightforward:

    // Program: date_main.cpp
    // Purpose: Demonstrate use of the Date class.

    #include <iostream>
    #include "date.h"

    int main() {
      std::cout << "Please enter today's date.\n"
                << "Provide the month, day and year: ";
      int month, day, year;
      std::cin >> month >> day >> year;
      Date today(month, day, year);

      Date tomorrow(today.getMonth(), today.getDay(), today.getYear());
      tomorrow.increment();
      std::cout << "Tomorrow is ";
      tomorrow.print();
      std::cout << std::endl;

      Date Sallys_Birthday(2,3,1995);
      if (sameDay(tomorrow, Sallys_Birthday)) {
        std::cout << "Hey, tomorrow is Sally's birthday!\n";
      }

      std::cout << "The last day in this month is " << today.lastDayInMonth() << std::endl;
      return 0;
    }

• Important: Each object we create of type Date has its own distinct member variables.
• Calling class member functions for class objects uses the “dot” notation. For example, tomorrow.increment();
• Note: We don’t need to know the implementation details of the class member functions in order to understand this example. This is an important feature of object oriented programming and class design.
3.7 Exercise
Add code to date_main.cpp to read in another date, check if it is a leap-year, and check if it is equal to tomorrow. Output appropriate messages based on the results of the checks.

3.8 Class Declaration (date.h) & Implementation (date.cpp)
A class implementation usually consists of 2 files. First we’ll look at the header file date.h

    // File: date.h
    // Purpose: Header file with declaration of the Date class, including
    // member functions and private member variables.

    class Date {
    public:
      Date();
      Date(int aMonth, int aDay, int aYear);

      // ACCESSORS
      int getDay() const;
      int getMonth() const;
      int getYear() const;

      // MODIFIERS
      void setDay(int aDay);
      void setMonth(int aMonth);
      void setYear(int aYear);
      void increment();

      // other member functions that operate on date objects
      bool isEqual(const Date& date2) const;  // same day, month, & year?
      bool isLeapYear() const;
      int lastDayInMonth() const;
      bool isLastDayInMonth() const;
      void print() const;  // output as month/day/year

    private:  // REPRESENTATION (member variables)
      int day;
      int month;
      int year;
    };

    // prototypes for other functions that operate on class objects are often
    // included in the header file, but outside of the class declaration
    bool sameDay(const Date &date1, const Date &date2);  // same day & month?

And here is the other part of the class implementation, the implementation file date.cpp

    // File: date.cpp
    // Purpose: Implementation file for the Date class.

    #include <iostream>
    #include "date.h"

    // array to figure out the number of days, it's used by the member function lastDayInMonth
    const int DaysInMonth[13] = {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    Date::Date() {  // default constructor
      day = 1;
      month = 1;
      year = 1900;
    }

    Date::Date(int aMonth, int aDay, int aYear) {  // construct from month, day, & year
      month = aMonth;
      day = aDay;
      year = aYear;
    }

    int Date::getDay() const {
      return day;
    }

    int Date::getMonth() const {
      return month;
    }

    int Date::getYear() const {
      return year;
    }

    void Date::setDay(int d) {
      day = d;
    }

    void Date::setMonth(int m) {
      month = m;
    }

    void Date::setYear(int y) {
      year = y;
    }

    void Date::increment() {
      if (!isLastDayInMonth()) {
        day++;
      } else {
        day = 1;
        if (month == 12) {  // December
          month = 1;
          year++;
        } else {
          month++;
        }
      }
    }

    bool Date::isEqual(const Date& date2) const {
      return day == date2.day && month == date2.month && year == date2.year;
    }

    bool Date::isLeapYear() const {
      return (year%4 == 0 && year%100 != 0) || year%400 == 0;
    }

    int Date::lastDayInMonth() const {
      if (month == 2 && isLeapYear())
        return 29;
      else
        return DaysInMonth[ month ];
    }

    bool Date::isLastDayInMonth() const {
      return day == lastDayInMonth();  // uses member function
    }

    void Date::print() const {
      std::cout << month << "/" << day << "/" << year;
    }

    bool sameDay(const Date& date1, const Date& date2) {
      return date1.getDay() == date2.getDay() && date1.getMonth() == date2.getMonth();
    }
3.9 Class scope notation
• Date:: indicates that what follows is within the scope of the class.
• Within class scope, the member functions and member variables are accessible without the name of the object.
3.10 Constructors
These are special functions that initialize the values of the member variables. You have already used constructors for string and vector objects.
• The syntax of the call to the constructor mixes variable definitions and function calls. (See date_main.cpp)
• “Default constructors” have no arguments.
• Multiple constructors are allowed, just like multiple functions with the same name are allowed. The compiler determines which one to call based on the types of the arguments (just like any other function call).
• When a new object is created, EXACTLY one constructor for the object is called.
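Here is a small sketch of two constructors on one class. The Time class below is hypothetical and is not part of the course code; it only mirrors the shape of the Date class to show how the compiler picks a constructor from the arguments.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate constructor overloading.
class Time {
public:
  Time();                        // default constructor: midnight
  Time(int aHour, int aMinute);  // construct from hour & minute
  int getHour() const;
  int getMinute() const;
private:
  int hour;
  int minute;
};

Time::Time() {
  hour = 0;
  minute = 0;
}

Time::Time(int aHour, int aMinute) {
  assert(aHour >= 0 && aHour < 24 && aMinute >= 0 && aMinute < 60);
  hour = aHour;
  minute = aMinute;
}

int Time::getHour() const { return hour; }
int Time::getMinute() const { return minute; }
```

Writing Time midnight; calls the default constructor, and Time lunch(12, 30); calls the two-argument one. Note that Time midnight(); with empty parentheses does not create an object: C++ reads it as the declaration of a function.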
3.11 Member Functions
Member functions are like ordinary functions except:
• They can access and modify the object’s member variables.
• They can call the other member functions without using an object name.
• Their syntax is slightly different because they are defined within class scope.

For the Date class:
• The set and get functions access and change a day, month or year.
• The increment member function uses another member function, isLastDayInMonth.
• isEqual accepts a second Date object and then accesses its values directly using the dot notation. Since we are inside class Date scope, this is allowed. The name of the second object, date2, is required to indicate that we are interested in its member variables.
• lastDayInMonth uses the const array defined at the start of the .cpp file.

More on member functions:
• When the member variables are private, the only means of accessing them and changing them from outside the class is through member functions.
• If member variables are made public, they can be accessed directly. This is usually considered bad style and should not be used in this course.
• Functions that are not members of the Date class must interact with Date objects through the class public members (a.k.a., the “public interface” declared for the class). One example is the function sameDay which accepts two Date objects and compares them by accessing their day and month values through their public member functions.
3.12 Header Files (.h) and Implementation Files (.cpp)
The code for the Date example is in three files:
• The header file, date.h, contains the class declaration.
• The implementation file, date.cpp, contains the member function definitions. Note that date.h is #include’ed.
• date_main.cpp contains the code outside the class. Again date.h is #include’ed.
• The files date.cpp and date_main.cpp are compiled separately and then linked to form the executable program.
  – g++ -c -Wall date.cpp
  – g++ -c -Wall date_main.cpp
  – g++ -o date.exe date.o date_main.o
  – or all on one line: g++ -o date.exe date.cpp date_main.cpp
• Different organizations of the code are possible, but not preferable. In fact, we could have put all of the code from the 3 files into a single file main.cpp. In this case, we would not have to compile two separate files.
• In many large projects, programmers follow a convention with two files per class, one header file and one implementation file. This makes the code more manageable and is recommended in this course.
3.13 Constant member functions

Member functions that do not change the member variables should be declared const.
• For example:
    bool Date::isEqual(const Date &date2) const;
• This must appear consistently in both the member function declaration in the class declaration (in the .h file) and in the member function definition (in the .cpp file).
• const objects (usually passed into a function as parameters) can ONLY use const member functions. Remember, you should only pass objects by value under special circumstances. In general, pass all objects by reference so they aren’t copied, and by const reference if you don’t want/need them to change.
• While you are learning, you will probably make mistakes in determining which member functions should or should not be const. Be prepared for compile warnings & errors, and read them carefully.
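
The const rules above can be sketched with a small example. Counter, is_zero, bump_twice and demo_const_member_functions are hypothetical names invented for this illustration; they are not part of the Date example.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate const member functions.
class Counter {
public:
  Counter() : count_(0) {}
  // ACCESSOR: does not change any member variable, so it is declared const.
  int value() const { return count_; }
  // MODIFIER: changes a member variable, so it cannot be const.
  void increment() { ++count_; }
private:
  int count_;
};

// c is a const reference: only const member functions may be called on it.
bool is_zero(const Counter& c) {
  // c.increment();   // would not compile: increment() is not const
  return c.value() == 0;
}

// A non-const reference lets the function change the caller's object.
void bump_twice(Counter& c) {
  c.increment();
  c.increment();
}

// Small usage check; the asserts document the expected behavior.
void demo_const_member_functions() {
  Counter c;
  assert(is_zero(c));
  bump_twice(c);
  assert(c.value() == 2);
  assert(!is_zero(c));
}
```

Un-commenting the c.increment() line inside is_zero is a quick way to see the kind of compile error mentioned above.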
3.14 Exercise
Add a member function to the Date class to add a given number of days to the Date object. The number should be the only argument and it should be an unsigned int. Should this function be const ?
3.15 Classes vs. structs

• Technically, a struct is a class where the default protection is public, not private.
  – As mentioned above, when a member variable is public it can be accessed and changed directly using the dot notation: tomorrow.day = 52; We can see immediately why this is dangerous (and an example of bad programming style) because a day of 52 is invalid!
• The usual practice of using struct is all public members and no member functions.

Rule for the duration of the Data Structures course: You may not declare new struct types, and class member variables should not be made public. This rule will ensure you get plenty of practice writing C++ classes with good programming style.
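
A minimal sketch of why public member variables are risky. LooseDay and CheckedDay are hypothetical types made up for this comparison, and the 1 to 31 range check is a deliberate simplification. The struct appears only to show the default-access difference; the course rule above still applies to your own code.

```cpp
#include <cassert>

// Hypothetical struct: members are public by default.
struct LooseDay {
  int day;
};

// Hypothetical class: members are private by default, so every change
// has to go through a member function that can reject bad values.
class CheckedDay {
public:
  CheckedDay() : day_(1) {}
  int day() const { return day_; }
  // Returns false and leaves the object unchanged if d is out of range.
  // (Simplified check: ignores which month the day belongs to.)
  bool set_day(int d) {
    if (d < 1 || d > 31) return false;
    day_ = d;
    return true;
  }
private:
  int day_;
};

void demo_struct_vs_class() {
  LooseDay loose;
  loose.day = 52;                 // nothing prevents the invalid value
  assert(loose.day == 52);

  CheckedDay checked;
  // checked.day_ = 52;           // would not compile: day_ is private
  assert(!checked.set_day(52));   // rejected by the member function
  assert(checked.day() == 1);     // object is unchanged
  assert(checked.set_day(17));
  assert(checked.day() == 17);
}
```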
3.16 C++ vs. Java Classes

• In C++, classes have sections labeled public and private, but there can be multiple public and private sections. In Java, each individual item is tagged public or private.
• Class declarations and class definitions are separated in C++, whereas they are together in Java.
• In C++ there is a semi-colon at the very end of the class declaration (after the }).
3.17 C++ vs. Python Classes

• Python classes have a single constructor, __init__.
• Python is dynamically typed. Class attributes such as members are defined by assignment.
• Python classes do not have private members. Privacy is enforced by convention.
• Python methods have an explicit self reference variable.
3.18 Designing and implementing classes

This takes a lot of practice, but here are some ideas to start from:
• Begin by outlining what the class objects should be able to do. This gives a start on the member functions.
• Outline what data each object keeps track of, and what member variables are needed for this storage.
• Write a draft class declaration in a .h file.
• Write code that uses the member functions (e.g., the main function). Revise the class .h file as necessary.
• Write the class .cpp file that implements the member functions.

In general, don’t be afraid of major rewrites if you find a class isn’t working correctly or isn’t as easy to use as you intended. This happens frequently in practice!
CSCI-1200 Data Structures — Spring 2017
Lecture 4 — Classes II: Sort, Non-member Operators

Announcements
• Exercise solutions will be posted to the calendar.
• Submitty iClicker registration is still open. Even if you already registered on the iClicker website, submit your code on Submitty.
• Starting with HW2, when Submitty opens for the homework assignment, there may be a message at the top regarding an extra late day for earning enough autograder points by Wednesday night.
• Practice problems for Exam 1 will be posted Monday, but the solutions will not be posted until the weekend.
• We will talk more about the exam next Tuesday.
Review from Lecture 3
• C++ classes, member variables and member functions, class scope, public and private
• Nuances to remember
  – Within class scope (within the code of a member function) member variables and member functions of that class may be accessed without providing the name of the class object.
  – Within a member function, when an object of the same class type has been passed as an argument, direct access to the private member variables of that object is allowed (using the ’.’ notation).
• Classes vs. structs
• Designing classes
• Common error
Today’s Lecture
• Extended example of student grading program
• Passing comparison functions to sort
• Non-member operators
4.1 Example: Student Grades

Our goal is to write a program that calculates the grades for students in a class and outputs the students and their averages in alphabetical order. The program source code is broken into three parts:
• Re-use of statistics code from Lecture 2.
• Class Student to record information about each student, including name and grades, and to calculate averages.
• The main function controls the overall flow, including input, calculation of grades, and output.
    // File:    main_student.cpp
    // Purpose: Compute student averages and output them alphabetically.

    // The header names were lost from the handout text; these are the
    // standard headers this code needs.
    #include <algorithm>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    #include "student.h"

    int main(int argc, char* argv[]) {
      if (argc != 3) {
        std::cerr << "Usage:\n  " << argv[0] << " infile-students outfile-grades\n";
        return 1;
      }
      std::ifstream in_str(argv[1]);
      if (!in_str) {
        std::cerr << "Could not open " << argv[1] << " to read\n";
        return 1;
      }
      std::ofstream out_str(argv[2]);
      if (!out_str) {
        std::cerr << "Could not open " << argv[2] << " to write\n";
        return 1;
      }

      int num_homeworks, num_tests;
      double hw_weight;
      in_str >> num_homeworks >> num_tests >> hw_weight;

      std::vector<Student> students;
      Student one_student;

      // Read the students, one at a time.
      while (one_student.read(in_str, num_homeworks, num_tests)) {
        students.push_back(one_student);
      }

      // Compute the averages. At the same time, determine the maximum name length.
      // NOTE: the handout text is incomplete from this loop to the end of main().
      // The loop bodies, the sort call, and the header layout below are
      // reconstructed from the comments and the Student interface, and are
      // assumptions rather than the original code.
      unsigned int i;
      unsigned int max_length = 0;
      for (i=0; i<students.size(); ++i) {
        students[i].compute_averages(hw_weight);
        unsigned int tmp_length = students[i].first_name().size() + students[i].last_name().size();
        max_length = std::max(max_length, tmp_length);
      }
      max_length += 2;  // account for the ", " printed between the names

      // Sort the students alphabetically by name.
      std::sort(students.begin(), students.end(), less_names);

      // Output a header...
      const unsigned int name_width = std::max(max_length, 4u);
      const std::string header = "Name" + std::string(name_width-4, ' ') + "  HW  Test Final";
      out_str << header << "\n";

      // Output the students...
      for (i=0; i<students.size(); ++i) {
        unsigned int length = students[i].last_name().size() + students[i].first_name().size() + 2;
        students[i].output_name(out_str) << std::string(name_width - length, ' ') << "  ";
        students[i].output_averages(out_str);
      }
      return 0;  // everything fine
    }
4.2 Declaration of Class Student

• Stores names, id numbers, scores and averages. The scores are stored using a vector! Member variables of a class can be other classes!
• Functionality is relatively simple: input, compute average, provide access to names and averages, and output.
• No constructor is explicitly provided: Student objects are built through the read function. (Other code organization/designs are possible!)
• Overall, the Student class design differs substantially in style from the Date class design. We will continue to see different styles of class designs throughout the semester.
• Note the helpful convention used in this example: all member variable names end with the “_” character.
• The special pre-processor directives #ifndef __student_h_, #define __student_h_, and #endif ensure that this file is included at most once per .cpp file.
  For larger programs with multiple class files and interdependencies, these lines are essential for successful compilation. We suggest you get in the habit of adding these include guards to all your header files.
    // File:    student.h
    // Purpose: Header for declaration of student record class and associated functions.

    #ifndef __student_h_
    #define __student_h_

    // The header names were lost from the handout text; these are the
    // standard headers this declaration needs.
    #include <iostream>
    #include <string>
    #include <vector>

    class Student {
    public:
      // ACCESSORS
      const std::string& first_name() const { return first_name_; }
      const std::string& last_name() const { return last_name_; }
      const std::string& id_number() const { return id_number_; }
      double hw_avg() const { return hw_avg_; }
      double test_avg() const { return test_avg_; }
      double final_avg() const { return final_avg_; }

      bool read(std::istream& in_str, unsigned int num_homeworks, unsigned int num_tests);
      void compute_averages(double hw_weight);
      std::ostream& output_name(std::ostream& out_str) const;
      std::ostream& output_averages(std::ostream& out_str) const;

    private:
      // REPRESENTATION
      std::string first_name_;
      std::string last_name_;
      std::string id_number_;
      // The vector element type was lost from the handout text; int is
      // inferred from Student::read, which reads each score into an int.
      std::vector<int> hw_scores_;
      double hw_avg_;
      std::vector<int> test_scores_;
      double test_avg_;
      double final_avg_;
    };

    bool less_names(const Student& stu1, const Student& stu2);

    #endif
4.3 Automatic Creation of Two Constructors By the Compiler

• Two constructors are created automatically by the compiler because they are needed and used.
• The first is a default constructor which has no arguments and just calls the default constructor for each of the member variables (member variables of built-in types such as double are left uninitialized). The prototype is
    Student();
  The default constructor is called when the main() function line
    Student one_student;
  is executed. If you wish a different behavior for the default constructor, you must declare it in the .h file and provide the alternate implementation.
• The second automatically-created constructor is a “copy constructor”, whose only argument is a const reference to a Student object. The prototype is
    Student(const Student &s);
  This constructor calls the copy constructor for each member variable to copy the member variables from the passed Student object to the corresponding member variables of the Student object being created. If you wish a different behavior for the copy constructor, you must declare it and provide the alternate implementation. The copy constructor is called during the vector push_back function in copying the contents of one_student to a new Student object on the back of the vector students.
• The behavior of automatically-created default and copy constructors is often, but not always, what’s desired. When they do what’s desired, the convention is to not write them explicitly.
• Later in the semester we will see circumstances where writing our own default and copy constructors is crucial.
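
The two compiler-generated constructors can be seen at work in a short sketch. Pet and demo_generated_constructors are hypothetical names used only here; the class deliberately declares no constructors of its own, just like Student.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class with no hand-written constructors. The compiler
// supplies both a default constructor and a copy constructor.
class Pet {
public:
  const std::string& name() const { return name_; }
  void set_name(const std::string& n) { name_ = n; }
private:
  std::string name_;   // default-constructed to the empty string
};

void demo_generated_constructors() {
  Pet one_pet;                       // compiler-generated default constructor
  assert(one_pet.name().empty());   // std::string's default constructor ran

  std::vector<Pet> pets;
  one_pet.set_name("Rex");
  pets.push_back(one_pet);           // the vector stores a copy of one_pet

  one_pet.set_name("Fido");          // changing the original afterwards...
  pets.push_back(one_pet);
  assert(pets[0].name() == "Rex");   // ...does not affect the earlier copy
  assert(pets[1].name() == "Fido");

  Pet copy(pets[0]);                 // explicit use of the copy constructor
  assert(copy.name() == "Rex");
}
```

This mirrors how main() reuses the single one_student object: each push_back stores an independent copy, so overwriting one_student on the next read is safe.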
4.4 Implementation of Class Student

• The read function is fairly sophisticated and depends heavily on the expected structure of the input data. It also has a lot of error checking.
  – In many class designs, this type of input would be done by functions outside the class, with the results passed into a constructor. Generally prefer this style because it separates elegant class design from clunky I/O details.
• The accessor functions for the names are defined within the class declaration in the header file. In this course, you are allowed to do this for one-line functions only! For complex classes, including long definitions within the header file has dependency and performance implications.
• The computation of the averages uses some but not all of the functionality from stats.h and stats.cpp (not included in your handout).
• Output is split across two functions. Again, stylistically, it is sometimes preferable to do this outside the class.
    // File:    student.cpp
    // Purpose: Implementation of the class Student

    // The standard header names were lost from the handout text; these are
    // the ones this code needs.
    #include <iomanip>
    #include <iostream>
    #include <string>
    #include "student.h"
    #include "std_dev.h"

    // Read information about a student, returning true if the information was read correctly.
    bool Student::read(std::istream& in_str, unsigned int num_homeworks, unsigned int num_tests) {
      // If we don't find an id, we've reached the end of the file & silently return false.
      if (!(in_str >> id_number_)) return false;
      // Once we have an id number, any other failure in reading is treated as an error.

      // read the name
      if (! (in_str >> first_name_ >> last_name_)) {
        std::cerr << "Failed reading name for student " << id_number_ << std::endl;
        return false;
      }

      unsigned int i;
      int score;

      // Read the homework scores
      hw_scores_.clear();
      for (i=0; i<num_homeworks && (in_str >> score); ++i)
        hw_scores_.push_back(score);
      if (hw_scores_.size() != num_homeworks) {
        std::cerr << "Pre-mature end of file or invalid input reading "
                  << "hw scores for " << id_number_ << std::endl;
        return false;
      }

      // Read the test scores
      test_scores_.clear();
      for (i=0; i<num_tests && (in_str >> score); ++i)
        test_scores_.push_back(score);
      if (test_scores_.size() != num_tests) {
        std::cerr << "Pre-mature end of file or invalid input reading "
                  << "test scores for " << id_number_ << std::endl;
        return false;
      }
      return true;  // everything was fine
    }

    // Compute and store the hw, test and final average for the student.
    void Student::compute_averages(double hw_weight) {
      double dummy_stddev;
      avg_and_std_dev(hw_scores_, hw_avg_, dummy_stddev);
      avg_and_std_dev(test_scores_, test_avg_, dummy_stddev);
      final_avg_ = hw_weight * hw_avg_ + (1 - hw_weight) * test_avg_;
    }

    std::ostream& Student::output_name(std::ostream& out_str) const {
      out_str << last_name_ << ", " << first_name_;
      return out_str;
    }

    std::ostream& Student::output_averages(std::ostream& out_str) const {
      out_str << std::fixed << std::setprecision(1);
      out_str << hw_avg_ << " " << test_avg_ << " " << final_avg_ << std::endl;
      return out_str;
    }

    // Boolean function to define alphabetical ordering of names. The vector sort
    // function requires that the objects be passed by CONSTANT REFERENCE.
    bool less_names(const Student& stu1, const Student& stu2) {
      return stu1.last_name() < stu2.last_name() ||
             (stu1.last_name() == stu2.last_name() && stu1.first_name() < stu2.first_name());
    }

    /* alternative version
    bool less_names(const Student& stu1, const Student& stu2) {
      if (stu1.last_name() < stu2.last_name())
        return true;
      else if (stu1.last_name() == stu2.last_name())
        return stu1.first_name() < stu2.first_name();
      else
        return false;
    }
    */
4.5 Exercise
Add code to the end of the main() function to compute and output the average of the semester grades and to output a list of the semester grades sorted into increasing order.
4.6 Providing Comparison Functions to Sort

Consider sorting the students vector:
• If we used
    sort(students.begin(), students.end());
  the sort function would try to use the < operator on Student objects to sort the students, just as it earlier used the < operator on doubles to sort the grades. However, this doesn’t work because there is no such operator on Student objects.
• Fortunately, the sort function can be called with a third argument, a comparison function:
    sort(students.begin(), students.end(), less_names);
  less_names, defined in student.cpp, is a function that takes two const references to Student objects and returns true if and only if the first argument should be considered “less” than the second in the sorted order. less_names uses the < operator defined on string objects to determine its ordering.
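
As a sketch of the same idea on a different class, the example below sorts one vector two ways by passing two different comparison functions. City, by_name, smaller_population and the population figures are all made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class for illustration; not part of the student example.
class City {
public:
  City(const std::string& name, int population)
    : name_(name), population_(population) {}
  const std::string& name() const { return name_; }
  int population() const { return population_; }
private:
  std::string name_;
  int population_;
};

// Each comparison function returns true if and only if a should come
// before b in the sorted order.
bool by_name(const City& a, const City& b) {
  return a.name() < b.name();
}

bool smaller_population(const City& a, const City& b) {
  return a.population() < b.population();
}

void demo_sort_with_comparison() {
  std::vector<City> cities;
  cities.push_back(City("Troy", 50000));      // made-up populations
  cities.push_back(City("Albany", 98000));
  cities.push_back(City("Cohoes", 17000));

  std::sort(cities.begin(), cities.end(), by_name);
  assert(cities[0].name() == "Albany");
  assert(cities[1].name() == "Cohoes");
  assert(cities[2].name() == "Troy");

  std::sort(cities.begin(), cities.end(), smaller_population);
  assert(cities[0].name() == "Cohoes");
  assert(cities[1].name() == "Troy");
  assert(cities[2].name() == "Albany");
}
```

Note that the function name is passed without parentheses: sort receives the function itself and calls it whenever it needs to compare two elements.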
4.7 Exercise

Write a function greater_averages that could be used in place of less_names to sort the students vector so that the student with the highest semester average is first.
4.8 Operators As Non-Member Functions

• A second option for sorting is to define a function that creates a < operator for Student objects! At first, this seems a bit weird, but it is extremely useful.
• Let’s start with syntax. The expressions a < b and x + y are really function calls! Syntactically, they are equivalent to operator< (a, b) and operator+ (x, y) respectively.
• When we want to write our own operators, we write them as functions with these weird names.
• For example, if we write:
    bool operator< (const Student& stu1, const Student& stu2) {
      return stu1.last_name() < stu2.last_name() ||
             (stu1.last_name() == stu2.last_name() &&
              stu1.first_name() < stu2.first_name());
    }
  then the statement
    sort(students.begin(), students.end());
  will sort Student objects into alphabetical order.
• Really, the only weird thing about operators is their syntax.
• We will have many opportunities to write operators throughout this course. Sometimes these will be made class member functions, but more on this in a later lecture.
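
A minimal sketch of a non-member operator< on a class whose ordering is unambiguous. ClockTime and demo_operator_less are hypothetical names used only for this illustration. The asserts show that the operator syntax and the explicit function-call syntax are the same call, and that sort without a third argument picks the operator up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical class for illustration: a time of day.
class ClockTime {
public:
  ClockTime(int hour, int minute) : hour_(hour), minute_(minute) {}
  int hour() const { return hour_; }
  int minute() const { return minute_; }
private:
  int hour_;
  int minute_;
};

// Non-member operator<: it uses only the public interface of ClockTime.
bool operator< (const ClockTime& a, const ClockTime& b) {
  return a.hour() < b.hour() ||
         (a.hour() == b.hour() && a.minute() < b.minute());
}

void demo_operator_less() {
  ClockTime early(9, 30), late(14, 5);
  assert(early < late);              // operator syntax
  assert(operator<(early, late));    // the same call, written as a function call
  assert(!(late < early));

  std::vector<ClockTime> times;
  times.push_back(late);
  times.push_back(early);
  std::sort(times.begin(), times.end());   // no third argument: uses operator<
  assert(times[0].hour() == 9);
  assert(times[1].hour() == 14);
}
```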
4.9 A Word of Caution about Operators

• Operators should only be defined if their meaning is intuitively clear.
• operator< on Student objects fails the test because the natural ordering on these objects is not clear.
• By contrast, operator< on Date objects is much more natural and clear.
4.10 Exercise
Write an operator< for comparing two Date objects.
4.11 Another Class Example: Alphabetizing Names

    // name_main.cpp
    // Demonstrates another example with the use of classes, including an output stream operator

    // The standard header names were lost from the handout text; these are
    // the ones this code needs.
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>
    #include "name.h"

    int main() {
      std::vector<Name> names;
      std::string first, last;
      std::cout << "\nEnter a sequence of names (first and last) and this program will alphabetize them\n";
      while (std::cin >> first >> last) {
        names.push_back(Name(first, last));
      }
      std::sort(names.begin(), names.end());
      std::cout << "\nHere are the names, in alphabetical order.\n";
      // unsigned index avoids a signed/unsigned comparison warning with size()
      for (unsigned int i = 0; i < names.size(); ++i) {
        std::cout << names[i] << "\n";
      }
      return 0;
    }
4.12 Name Class Declaration & Implementation

    // name.h
    #ifndef __NAME__
    #define __NAME__

    // The standard header names were lost from the handout text; these are
    // the ones this declaration needs.
    #include <iostream>
    #include <string>

    class Name {
    public:
      // CONSTRUCTOR
      Name(const std::string& fst, const std::string& lst);

      // ACCESSORS
      // Providing a const reference to the string allows the string to be
      // examined and treated as an r-value without the cost of copying it.
      const std::string& first() const { return first_; }
      const std::string& last() const { return last_; }

      // MODIFIERS
      void set_first(const std::string& fst) { first_ = fst; }
      void set_last(const std::string& lst) { last_ = lst; }

    private:
      // REPRESENTATION
      std::string first_, last_;
    };

    // operator< to allow sorting
    bool operator< (const Name& left, const Name& right);

    // operator<< to allow output
    std::ostream& operator<< (std::ostream& ostr, const Name& n);

    #endif

    // name.cpp
    #include "name.h"

    // Here we use special syntax to call the string class copy constructors
    Name::Name(const std::string& fst, const std::string& lst)
      : first_(fst), last_(lst)
    {}

    // The alternative implementation below first calls the default string
    // constructor for the two variables, then performs an assignment in
    // the body of the constructor function.
    /*
    Name::Name(const std::string& fst, const std::string& lst) {
      first_ = fst;
      last_ = lst;
    }
    */

    // operator<
    // NOTE: the handout text is cut off after "return left.last()". The rest
    // of this function and the operator<< definition are reconstructed from
    // the declarations in name.h and are assumptions, not the original code.
    bool operator< (const Name& left, const Name& right) {
      return left.last() < right.last() ||
             (left.last() == right.last() && left.first() < right.first());
    }

    // operator<<
    std::ostream& operator<< (std::ostream& ostr, const Name& n) {
      ostr << n.first() << " " << n.last();
      return ostr;
    }
CSCI-1200 Data Structures — Spring 2017
Lecture 5 — Pointers, Arrays, Pointer Arithmetic

Announcements
• Submitty iClicker registration is still open. Even if you already registered on the iClicker website, submit your code on Submitty.
• Starting with HW2, when Submitty opens for the homework assignment, there may be a message at the top regarding an extra late day for earning enough autograder points by Wednesday night.
• In fact, right now it’s set for 12 autograder points. This is the number you see and is the points from visible test cases.

Announcements: Test 1 Information
• Test 1 will be held Monday, Feb 6th, 2017 from 6-7:50pm. Your seating assignment will be posted on Submitty / through the gradesheet. Details will be given out Friday.
• No make-ups will be given except for pre-approved absence or illness, and a written excuse from the Dean of Students or the Student Experience office or the RPI Health Center will be required. Contact Mrs. Eberwein by email by Friday Feb 3rd to arrange for extra time accommodations. You can alternately e-mail the ds instructors list.
• Coverage: Lectures 1-6, Labs 1-3, and Homeworks 1-2.
• Closed-book and closed-notes except for 1 sheet of notes on 8.5x11 inch paper (front & back) that may be handwritten or printed. Computers, cell-phones, calculators, PDAs, music players, etc. are not permitted and must be turned off and placed under your desk.
• All students must bring their Rensselaer photo ID card. At the start of the exam, proctors will check that you have your ID card, and if you have a sheet of notes, they will staple it to the back of your exam.
• Practice problems from previous exams are available on the course website. Solutions to the problems will be posted on Sunday.
• The best way to prepare is to completely work through and write out your solution to each problem, before looking at the answers.
• The exam will involve handwriting code on paper (and other short answer problem solving). Neat legible handwriting is appreciated. We will be somewhat forgiving about minor syntax errors – it will be graded by humans not computers :)
Review from Last Week
• C++ class syntax, designing classes, classes vs. structs;
• Passing comparison functions to sort; Non-member operators.
• More practice with const and reference (the ’&’)
Today’s Lecture — Pointers and Arrays
• Pointers store memory addresses.
• They can be used to access the values stored at their stored memory address.
• They can be incremented, decremented, added and subtracted.
• Dynamic memory is accessed through pointers.
• Pointers are also the primitive mechanism underlying vector iterators, which we have used with std::sort and will use more extensively throughout the semester.
5.1 Pointer Example

• Consider the following code segment:
    float x = 15.5;
    float *p;   /* equiv:  float* p;  or  float * p;  */
    p = &x;
    *p = 72;
    if ( x > 20 )
      cout << "Bigger\n";
    else
      cout << "Smaller\n";
  The output is Bigger because x == 72.0. What’s going on?

[Figure: computer memory before and after *p = 72. In both pictures p points to x; x holds 15.5 before the assignment and 72.0 after it.]
5.2 Pointer Variables and Memory Access

• x is an ordinary float, but p is a pointer that can hold the memory address of a float variable. The difference is explained in the picture above.
• Every variable is attached to a location in memory. This is where the value of that variable is stored. Hence, we draw a picture with the variable name next to a box that represents the memory location.
• Each memory location also has an address, which is itself just an index into the giant array that is the computer memory.
• The value stored in a pointer variable is an address in memory. The statement
    p = &x;
  takes the address of x’s memory location and stores it (the address) in the memory location associated with p.
• Since the value of this address is much less important than the fact that the address is x’s memory location, we depict the address with an arrow.
• The statement:
    *p = 72;
  causes the computer to get the memory location stored at p, then go to that memory location, and store 72 there. This writes the 72 in x’s location.
  Note: *p is an l-value in the above expression.
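
The pointer example can be turned into a small self-checking sketch. write_through_pointer and same_location are hypothetical helper names introduced here; the first repeats the x and p steps from the example, and the second shows two pointers holding the same address.

```cpp
#include <cassert>

// Repeats the steps of the example: returns the final value of x after
// writing to it through the pointer p.
float write_through_pointer() {
  float x = 15.5;
  float *p = &x;          // p stores the address of x
  assert(*p == 15.5f);    // reading *p (an r-value here) reads x
  *p = 72;                // writing *p (an l-value here) writes x
  return x;
}

// Two pointers can hold the same address; a write through either one is
// visible through the other and through the variable itself.
bool same_location() {
  int n = 1;
  int *a = &n;
  int *b = a;             // copies the address, not the int
  *b = 2;
  return a == b && n == 2 && *a == 2;
}
```

Here write_through_pointer() returns 72, matching the "Bigger" output in the example, and same_location() returns true.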
5.3 Defining Pointer Variables

• In the example below, p, s and t are all pointer variables (pointers, for short), but q is NOT. You need the * before each variable name.
    int * p, q;
    float *s, *t;
• There is no initialization of pointer variables in this two-line sequence, so the statement below is dangerous, and may cause your program to crash! (It won’t crash if the uninitialized value happens to be a legal address.)
    *p = 15;
5.4 Operations on Pointers

• The unary (single argument/operand) operator * in the expression *p is the “dereferencing operator”. It means “follow the pointer”. *p can be either an l-value or an r-value, depending on which side of the = it appears on.
• The unary operator & in the expression &x means “take the memory address of.”
• Pointers can be assigned. This just copies memory addresses as though they were values (which they are). Let’s work through the example below (and draw a picture!). What are the values of x and y at the end?
    float x=5, y=9;
    float *p = &x, *q = &y;
    *p = 17.0;
    *q = *p;
    q = p;
    *q = 13.0;
• Assignments of integers or floats to pointers and assignments mixing pointers of different types are illegal. Continuing with the above example:
    int *r;
    r = q;      // Illegal: different pointer types;
    p = 35.1;   // Illegal: float assigned to a pointer
• Comparisons between pointers of the form
    if ( p == q )   or   if ( p != q )
  are legal and very useful! Less than and greater than comparisons are also allowed. These are useful only when the pointers are to locations within an array.

5.5 Exercise

Draw a picture for the following code sequence. What is the output to the screen?
    int x = 10, y = 15;
    int *a = &x;
    cout << x << " " << y << endl;
    int *b = &y;
    *a = x * *b;
    cout << x << " " << y << endl;
    int *c = b;
    *c = 25;
    cout << x << " " << y << endl;
5.6 Null Pointers
• Like the int type, pointers are not default initialized. We should assume an uninitialized pointer holds a garbage value, left over from the previous user of that memory.
• Pointers that don’t (yet) point anywhere useful should be explicitly assigned to NULL.
– NULL is equal to the integer 0, which is a legal pointer value (you can store NULL in a pointer variable).
– But NULL is not a valid memory location you are allowed to read or write. If you try to dereference or follow a NULL pointer, your program will usually crash immediately. You may see a segmentation fault, a bus error, or something about a null pointer dereference.
– NOTE: In C++11 (the server still uses C++03), we are encouraged to switch to use nullptr, to avoid some subtle situations where NULL is incorrectly seen as an int type instead of a pointer.
– We indicate a NULL value in diagrams with a slash through the memory location box.
• Comparing a pointer to NULL is very useful. It can be used to indicate whether or not a pointer variable is pointing at a usable memory location. For example,
    if ( p != NULL ) cout << *p << endl;
tests to see if p is pointing somewhere that appears to be useful before accessing and printing the value stored at that location.
• But don’t make the mistake of assuming pointers are automatically initialized to NULL.
5.7 Arrays
• Here’s a quick example to remind you about how to use an array:
    const int n = 10;
    double a[n];
    int i;
    for ( i=0; i<n; ++i )
        a[i] = sqrt( double(i) );
• Remember: the size of array a is fixed at compile time. STL vectors act like arrays, but they can grow and shrink dynamically in response to the demands of the application.
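To make the contrast between a fixed-size array and an STL vector concrete, here is a minimal sketch. The constant N and both function names are assumptions made for this illustration, not part of the course code.

```cpp
#include <cstddef>
#include <vector>

const int N = 4;   // an array size must be known at compile time

// Fills a fixed-size array with 0, 1, ..., N-1 and returns the sum.
int sum_fixed_array() {
    int a[N];
    for (int i = 0; i < N; ++i) a[i] = i;
    int total = 0;
    for (int i = 0; i < N; ++i) total += a[i];
    return total;
}

// Builds a vector whose size is chosen while the program runs.
std::vector<int> make_counting_vector(std::size_t count) {
    std::vector<int> v;
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));   // the vector grows as needed
    }
    return v;
}
```

The vector returned by make_counting_vector can still grow with push_back or shrink with pop_back afterwards, which the array cannot do.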
5.8 Stepping through Arrays with Pointers (Array Iterators)
• The array code above that uses [] subscripting can be equivalently rewritten to use pointers:
    const int n = 10;
    double a[n];
    double *p;
    for ( p=a; p<a+n; ++p )
        *p = sqrt( p-a );
• The assignment:
    p = a;
takes the address of the start of the array and assigns it to p.
• This illustrates the important fact that the name of an array is in fact a pointer to the start of a block of memory. We will come back to this several times! We could also write this line as:
    p = &a[0];
which means “find the location of a[0] and take its address”.
• By incrementing, ++p, we make p point to the next location in the array.
– When we increment a pointer we don’t just add one byte to the address, we add the number of bytes (sizeof) used to store one object of the specific type of that pointer. Similarly, basic addition/subtraction of pointer variables is done in multiples of the sizeof the type of the pointer.
– Since p points to a double, and the size of double is 8 bytes, we are actually adding 8 bytes to the address when we execute ++p.
• The test p<a+n checks whether the address stored in p is less than n array locations beyond the start of the array. In this example, a+n is the memory location 80 bytes after the start of the array (n = 10 slots * 8 bytes per slot). We could equivalently have used the test
    p != a+n
• In the assignment:
    *p = sqrt( p-a )
p-a is the number of array locations (multiples of 8 bytes) between p and the start. This is an integer. The square root of this value is assigned to *p.
• Here’s a picture to explain this example:
[Figure: memory diagram for this example. const int n = 10; double a[10] holds 0.00, 1.00, 1.41, 1.73, 2.00, 2.23, 2.45, 2.65, 2.83, 3.00 in a[0] through a[9], with address values increasing from a[0] to a[9]; the pointer variable double* p is also shown.]
• Note that there may or may not be unused memory between your array and the other local variables. Similarly, the order that your local variables appear on the stack is not guaranteed (the compiler may rearrange things a bit in an attempt to optimize performance or memory usage). A buffer overflow (attempting to access an illegal array index) may or may not cause an immediate failure, depending on the layout of other critical program memory.
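The pointer-stepping loop from this section can be wrapped in a small function, together with a check that ++p really moves by sizeof(double) bytes. This is a sketch; fill_with_sqrt and bytes_per_step are hypothetical names introduced here.

```cpp
#include <cmath>
#include <cstddef>

// Fills a[0..n-1] with the square root of each index,
// stepping a pointer through the array instead of subscripting.
void fill_with_sqrt(double* a, int n) {
    for (double* p = a; p != a + n; ++p) {
        // p - a counts array slots, not bytes.
        *p = std::sqrt(static_cast<double>(p - a));
    }
}

// Distance in bytes between two neighboring slots of a double array.
std::size_t bytes_per_step(const double* a) {
    const char* first = reinterpret_cast<const char*>(a);
    const char* second = reinterpret_cast<const char*>(a + 1);
    return static_cast<std::size_t>(second - first);
}
```

Casting to const char* is one way to view the addresses in bytes; the result equals sizeof(double), which is 8 on the machines discussed in these notes.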
5.9 Sorting an Array
• Arrays may be sorted using std::sort, just like vectors. Pointers are used in place of iterators. For example, if a is an array of doubles and there are n values in the array, then here’s how to sort the values in the array into increasing order:
    std::sort( a, a+n );
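As a minimal sketch of the std::sort call above, the two wrappers below (hypothetical names) sort an array of doubles through pointers. The decreasing variant assumes the standard comparison object std::greater<double> from <functional>.

```cpp
#include <algorithm>
#include <functional>

// Sorts the n doubles starting at a into increasing order.
// a acts as the "begin" iterator and a+n as one past the end.
void sort_increasing(double* a, int n) {
    std::sort(a, a + n);
}

// Same idea, but with a comparison object for decreasing order.
void sort_decreasing(double* a, int n) {
    std::sort(a, a + n, std::greater<double>());
}
```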
5.10 Exercises
For each of the following problems, you may only use pointers and not subscripting:
1. Write code to print the array a backwards, using pointers.
2. Write code to print every other value of the array a, again using pointers.
3. Write a function that checks whether the contents of an array of doubles are sorted into increasing order. The function must accept two arguments: a pointer (to the start of the array), and an integer indicating the size of the array.
5.11 C Calling Convention
• We take for granted the non-trivial task of passing data to a helper function, getting data back from that function, and seamlessly continuing on with the program. How does that work?
• A calling convention is a standardized method for passing arguments between the caller and the function. Calling conventions vary between programming languages, compilers, and computer hardware.
• In C on x86 architectures here is a generalization of what happens:
1. The caller puts all the arguments on the stack, in reverse order.
2. The caller puts the address of its code on the stack (the return address).
3. Control is transferred to the callee.
4. The callee puts any local variables on the stack.
5. The callee does its work and puts the return value in a special register (storage location).
6. The callee removes its local variables from the stack.
7. Control is transferred by removing the address of the caller from the stack and going there.
8. The caller removes the arguments from the stack.
• On x86 architectures the addresses on the stack are in descending order. This is not true of all hardware.
5.12 Poking around in the Stack & Looking for the C Calling Convention
• Let’s look more closely at an example of where the compiler stores our data. Specifically, let’s print out the addresses and values of the local variables and function parameters:
    #include <iostream>

    int foo(int a, int *b) {
      int q = a+1;
      int r = *b+1;
      std::cout << "address of a = " << &a << std::endl;
      std::cout << "address of b = " << &b << std::endl;
      std::cout << "address of q = " << &q << std::endl;
      std::cout << "address of r = " << &r << std::endl;
      std::cout << "value at " << &a << " = " << a << std::endl;
      std::cout << "value at " << &b << " = " << b << std::endl;
      std::cout << "value at " << b << " = " << *b << std::endl;
      std::cout << "value at " << &q << " = " << q << std::endl;
      std::cout << "value at " << &r << " = " << r << std::endl;
      return q*r;
    }

    int main() {
      int x = 5;
      int y = 7;
      int answer = foo (x, &y);
      std::cout << "address of x = " << &x << std::endl;
      std::cout << "address of y = " << &y << std::endl;
      std::cout << "address of answer = " << &answer << std::endl;
      std::cout << "value at " << &x << " = " << x << std::endl;
      std::cout << "value at " << &y << " = " << y << std::endl;
      std::cout << "value at " << &answer << " = " << answer << std::endl;
    }
• Note that the first function parameter is a regular integer, passed by copy. The second parameter is passed in as a pointer.
• Note that we can print out data values or pointers – the address is printed as a big integer in hexadecimal format (beginning with “0x”).
• This example was compiled as a 32-bit program, so our addresses are 32 bits. A 64-bit program will have longer addresses.
• Let’s look at the program output and reverse engineer a drawing of the stack:
Program output:
    address of a = 0xbf23eef0
    address of b = 0xbf23eef4
    address of q = 0xbf23eee4
    address of r = 0xbf23eee0
    value at 0xbf23eef0 = 5
    value at 0xbf23eef4 = 0xbf23ef10
    value at 0xbf23ef10 = 7
    value at 0xbf23eee4 = 6
    value at 0xbf23eee0 = 8
    address of x = 0xbf23ef14
    address of y = 0xbf23ef10
    address of answer = 0xbf23ef0c
    value at 0xbf23ef14 = 5
    value at 0xbf23ef10 = 7
    value at 0xbf23ef0c = 48

Stack diagram (label, address, value):
              0xbf23ef18
    x=        0xbf23ef14   5
    y=        0xbf23ef10   7
    answer=   0xbf23ef0c   48
              0xbf23ef08
              0xbf23ef04
              0xbf23ef00
              0xbf23eefc
              0xbf23eef8
    b=        0xbf23eef4   0xbf23ef10
    a=        0xbf23eef0   5
              0xbf23eeec
              0xbf23eee8
    q=        0xbf23eee4   6
    r=        0xbf23eee0   8
              0xbf23eedc
              0xbf23eed8

Note: The unlabeled portions in our diagram of the stack will include the frame pointer, the return address, temp variables (complex C++ expressions turn into many smaller steps of assembly), space to save registers, and padding between variables to meet alignment requirements.
Note: Different compilers and/or different optimization levels will produce a different stack diagram.
CSCI-1200 Data Structures - Spring 2017
Lecture 6 - Pointers & Dynamic Memory
Announcements
• Exam 1 is on Monday Feb 6, at 6pm. Check Submitty for room assignments. They might be up already; if not, they should be up by the end of today (Friday). See Lecture 5’s notes for more exam-related announcements.
• The next homework will be checked for memory errors on the server. Run Dr. Memory or Valgrind on your code to detect memory errors. See http://www.cs.rpi.edu/academics/courses/spring16/csci1200/memory_debugging.php for more information on how to run Dr. Memory or Valgrind.
Review from Lecture 5
• Pointer variables, arrays, pointer arithmetic and dereferencing, character arrays, and calling conventions.
Today’s Lecture - Pointers and Dynamic Memory
• Arrays and pointers
• Different types of memory
• Dynamic allocation of arrays
• Memory Debuggers
6.1 Three Types of Memory
• Automatic memory: memory allocation inside a function when you create a variable. This allocates space for local variables in functions (on the stack) and deallocates it when variables go out of scope. For example:
    int x;
    double y;
• Static memory: variables allocated statically (with the keyword static). They are not eliminated when they go out of scope. They retain their values, but are only accessible within the scope where they are defined.
    static int counter;
• Dynamic memory: explicitly allocated (on the heap) as needed. This is our focus for today.
6.2 Dynamic Memory
• Dynamic memory is:
– created using the new operator,
– accessed through pointers, and
– removed through the delete operator.
• Here’s a simple example involving dynamic allocation of integers:
    int * p = new int;
    *p = 17;
    cout << *p << endl;
    int * q;
    q = new int;
    *q = *p;
    *p = 27;
    cout << *p << " " << *q << endl;
    int * temp = q;
    q = p;
    p = temp;
    cout << *p << " " << *q << endl;
    delete p;
    delete q;
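[Figure: the stack holds the pointer variables p, q, and temp; the heap holds the allocated integers (the value 17 is shown). The stack grows as variables are assigned sequentially and shrinks as variables go out of scope. Heap memory is allocated as needed, where space is available (not necessarily sequentially!).]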
• The expression new int asks the system for a new chunk of memory that is large enough to hold an integer and returns the address of that memory. Therefore, the statement int * p = new int; allocates memory from the heap and stores its address in the pointer variable p.
• The statement delete p; takes the integer memory pointed to by p and returns it to the system for re-use.
• This memory is allocated from and returned to a special area of memory called the heap. By contrast, local variables and function parameters are placed on the stack as discussed last lecture.
• In between the new and delete statements, the memory is treated just like memory for an ordinary variable, except the only way to access it is through pointers. Hence, the manipulation of pointer variables and values is similar to the examples covered in Lecture 5 except that there is no explicitly named variable for that memory other than the pointer variable.
• Dynamic allocation of primitives like ints and doubles is not very interesting or significant. What’s more important is dynamic allocation of arrays and objects.
6.3 Exercise
What’s the output of the following code? Be sure to draw a picture to help you figure it out.
    double * p = new double;
    *p = 35.1;
    double * q = p;
    cout << *p << " " << *q << endl;
    p = new double;
    *p = 27.1;
    cout << *p << " " << *q << endl;
    *q = 12.5;
    cout << *p << " " << *q << endl;
    delete p;
    delete q;
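The three kinds of memory from Section 6.1 can be seen side by side in a short sketch. The function names next_ticket and make_heap_int are hypothetical and exist only for this illustration.

```cpp
// Static memory: counter keeps its value between calls,
// but it is only visible inside next_ticket().
int next_ticket() {
    static int counter = 0;
    ++counter;
    return counter;
}

// Dynamic memory: the int lives on the heap until it is deleted.
// The caller owns the returned pointer and must call delete on it.
int* make_heap_int(int value) {
    int* p = new int;   // p itself is automatic memory (on the stack)
    *p = value;
    return p;           // the heap int outlives this function call
}
```

The pointer variable p disappears when make_heap_int returns, but the integer it pointed to stays allocated; that is why the address must be handed back to the caller.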
6.4 Dynamic Allocation of Arrays
• How do we allocate an array on the stack? What is the code? What memory diagram is produced by the code?
• Declaring the size of an array at compile time doesn’t offer much flexibility. Instead we can dynamically allocate an array based on data. This gets us part-way toward the behavior of the standard library vector class. Here’s an example:
    int main() {
      std::cout << "Enter the size of the array: ";
      int n,i;
      std::cin >> n;
      double *a = new double[n];
      for (i=0; i<n; ++i) { a[i] = sqrt( i ); }
      ...
      delete [] a;
      return 0;
    }
[Figure: the stack holds n, i, and a; a points to the double[n] array on the heap.]
• The expression new double[n] asks the system to dynamically allocate enough consecutive memory to hold n doubles (usually 8n bytes).
– What’s crucially important is that n is a variable. Therefore, its value and, as a result, the size of the array are not known until the program is executed and the memory must be allocated dynamically.
– The address of the start of the allocated memory is assigned to the pointer variable a.
• After this, a is treated as though it is an array. For example:
    a[i] = sqrt( i );
In fact, the expression a[i] is exactly equivalent to the pointer arithmetic and dereferencing expression *(a+i) which we have seen several times before.
• After we are done using the array, the line:
    delete [] a;
releases the memory allocated for the entire array and calls the destructor (we’ll learn about these soon!) for each slot of the array. Deleting a dynamically allocated array without the [] is an error (but it may not cause a crash or other noticeable problem, depending on the type stored in the array and the specific compiler implementation).
– Since the program is ending, releasing the memory is not a major concern. However, to demonstrate that you understand memory allocation & deallocation, you should always delete dynamically allocated memory in this course, even if the program is terminating.
– In more substantial programs it is ABSOLUTELY CRUCIAL. If we repeatedly forget to release memory, the program is said to have a memory leak. Long-running programs with memory leaks will eventually run out of memory and crash.
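Here is a minimal sketch of allocating an array whose size is only known at run time, and of the equivalence between a[i] and *(a+i). The helper names make_sqrt_table and sum_table are assumptions for this illustration.

```cpp
#include <cmath>

// Allocates n doubles on the heap and fills slot i with sqrt(i).
// The caller must release the result with delete [].
double* make_sqrt_table(int n) {
    double* a = new double[n];
    for (int i = 0; i < n; ++i) {
        a[i] = std::sqrt(static_cast<double>(i));
    }
    return a;
}

// Sums the first n slots using pointer arithmetic;
// *(a+i) names the same slot as a[i].
double sum_table(const double* a, int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) total += *(a + i);
    return total;
}
```

A caller would write double* a = make_sqrt_table(n); and later delete [] a; so that every new [] is matched by exactly one delete [].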
6.5 Exercises
1. Write code to dynamically allocate an array of n integers, point to this array using the integer pointer variable a, and then read n values into the array from the stream cin.
2. Now, suppose we wanted to write code to double the size of array a without losing the values. This requires some work: First allocate an array of size 2*n, pointed to by integer pointer variable temp (which will become a). Then copy the n values of a into the first n locations of array temp. Finally delete array a and assign temp to a.
Why don’t you need to delete temp?
Note: The code for part 2 of the exercise is very similar to what happens inside the resize member function of vectors!
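The steps listed in part 2 can be sketched as one function. This is only one possible arrangement, and double_capacity is a hypothetical name; try the exercise yourself before reading it.

```cpp
// Returns a new array of size 2*n whose first n slots copy a's values,
// and releases the old array. The caller stores the result back into a.
int* double_capacity(int* a, int n) {
    int* temp = new int[2 * n];          // step 1: allocate the larger array
    for (int i = 0; i < n; ++i) {
        temp[i] = a[i];                  // step 2: copy the old values
    }
    delete [] a;                         // step 3: release the old block
    return temp;                         // step 4: temp becomes the new a
}
```

Usage would look like a = double_capacity(a, n);. After that assignment a and temp held the same address, so a single delete [] a; at the end releases the block.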
6.6 Dynamic Allocation of Two-Dimensional Arrays
• To store a grid of data, we will need to allocate a top level array of pointers to arrays of the data. For example:
    double** a = new double*[rows];
    for (int i = 0; i < rows; i++) {
      a[i] = new double[cols];
      for (int j = 0; j < cols; j++) {
        a[i][j] = double(i+1) / double (j+1);
      }
    }
• Draw a picture of the resulting data structure. Then, write code to correctly delete all of this memory. You need to call delete or delete [] as many times as you call new or new [] respectively.
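One way to pair every new [] in the grid example with a delete [] is sketched below. The function names make_grid and free_grid are hypothetical; the allocation loop mirrors the code above.

```cpp
// Allocates a rows x cols grid as a top-level array of row pointers.
double** make_grid(int rows, int cols) {
    double** a = new double*[rows];
    for (int i = 0; i < rows; i++) {
        a[i] = new double[cols];
        for (int j = 0; j < cols; j++) {
            a[i][j] = double(i + 1) / double(j + 1);
        }
    }
    return a;
}

// Releases the grid: one delete [] per row, then one for the top level.
void free_grid(double** a, int rows) {
    for (int i = 0; i < rows; i++) {
        delete [] a[i];
    }
    delete [] a;
}
```

The rows must be released before the top-level array, because the row addresses are stored inside it.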
6.7 Dynamic Allocation: Arrays of Class Objects
• We can dynamically allocate arrays of class objects. The default constructor (the constructor that takes no arguments) must be defined in order to allocate an array of objects.
    class Foo {
    public:
      Foo();
      double value() const { return a*b; }
    private:
      int a;
      double b;
    };
    Foo::Foo() {
      static int counter = 1;
      a = counter;
      b = 100.0;
      counter++;
    }
    int main() {
      int n;
      std::cin >> n;
      Foo *things = new Foo[n];
      std::cout << "size of int: " << sizeof(int) << std::endl;
      std::cout << "size of double: " << sizeof(double) << std::endl;
      std::cout << "size of foo object: " << sizeof(Foo) << std::endl;
      for (Foo* i = things; i < things+n; i++)
        std::cout << "Foo stored at: " << i << " has value " << i->value() << std::endl;
      delete [] things;
    }
Sample output:
    size of int: 4
    size of double: 8
    size of foo object: 16
    Foo stored at: 0x104800890 has value 100
    Foo stored at: 0x1048008a0 has value 200
    Foo stored at: 0x1048008b0 has value 300
    Foo stored at: 0x1048008c0 has value 400
    ...
• What does “->” do? It is the member access operator used through a pointer to an object (here, objects created on the heap). We could also use (*i).value(). Why?
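A small sketch of the two spellings of member access through a pointer follows. The Point class and the function name are hypothetical and used only for this comparison.

```cpp
// Hypothetical class used only for this illustration.
class Point {
public:
    Point() : x(2), y(5) {}
    int sum() const { return x + y; }
private:
    int x;
    int y;
};

// Calls the same member function two ways through a pointer:
// p->sum() is shorthand for (*p).sum().
bool arrow_matches_star_dot(const Point* p) {
    return p->sum() == (*p).sum();
}
```

The parentheses in (*p).sum() are needed because the member access operator . binds more tightly than the dereference operator *.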
6.8 Memory Debugging
In addition to the step-by-step debuggers like gdb, lldb, or the debugger in your IDE, we recommend using a memory debugger like “Dr. Memory” (Windows, Linux, and MacOSX) or “Valgrind” (Linux and MacOSX). These tools can detect the following problems:
• Use of uninitialized memory
• Reading/writing memory after it has been free’d (NOTE: delete calls free)
• Reading/writing off the end of malloc’d blocks (NOTE: new calls malloc)
• Reading/writing inappropriate areas on the stack
• Memory leaks - where pointers to malloc’d blocks are lost forever
• Mismatched use of malloc/new/new [] vs free/delete/delete []
• Overlapping src and dst pointers in memcpy() and related functions
6.9 Sample Buggy Program
Can you see the errors in this program?
     1  #include <iostream>
     2
     3  int main() {
     4
     5    int *p = new int;
     6    if (*p != 10) std::cout << "hi" << std::endl;
     7
     8    int *a = new int[3];
     9    a[3] = 12;
    10    delete a;
    11
    12  }
6.10 Using Dr. Memory
http://www.drmemory.org
Here’s how Dr. Memory reports the errors in the above program:
    ~~Dr.M~~ Dr. Memory version 1.8.0
    ~~Dr.M~~
    ~~Dr.M~~ Error #1: UNINITIALIZED READ: reading 4 byte(s)
    ~~Dr.M~~ # 0 main [memory_debugger_test.cpp:6]
    hi
    ~~Dr.M~~
    ~~Dr.M~~ Error #2: UNADDRESSABLE ACCESS beyond heap bounds: writing 4 byte(s)
    ~~Dr.M~~ # 0 main [memory_debugger_test.cpp:9]
    ~~Dr.M~~ Note: refers to 0 byte(s) beyond last valid byte in prior malloc
    ~~Dr.M~~
    ~~Dr.M~~ Error #3: INVALID HEAP ARGUMENT: allocated with operator new[], freed with operator delete
    ~~Dr.M~~ # 0 replace_operator_delete [/drmemory_package/common/alloc_replace.c:2684]
    ~~Dr.M~~ # 1 main [memory_debugger_test.cpp:10]
    ~~Dr.M~~ Note: memory was allocated here:
    ~~Dr.M~~ Note: # 0 replace_operator_new_array [/drmemory_package/common/alloc_replace.c:2638]
    ~~Dr.M~~ Note: # 1 main [memory_debugger_test.cpp:8]
    ~~Dr.M~~
    ~~Dr.M~~ Error #4: LEAK 4 bytes
    ~~Dr.M~~ # 0 replace_operator_new [/drmemory_package/common/alloc_replace.c:2609]
    ~~Dr.M~~ # 1 main [memory_debugger_test.cpp:5]
    ~~Dr.M~~
    ~~Dr.M~~ ERRORS FOUND:
    ~~Dr.M~~ 1 unique, 1 total unaddressable access(es)
    ~~Dr.M~~ 1 unique, 1 total uninitialized access(es)
    ~~Dr.M~~ 1 unique, 1 total invalid heap argument(s)
    ~~Dr.M~~ 0 unique, 0 total warning(s)
    ~~Dr.M~~ 1 unique, 1 total, 4 byte(s) of leak(s)
    ~~Dr.M~~ 0 unique, 0 total, 0 byte(s) of possible leak(s)
    ~~Dr.M~~ Details: /DrMemory-MacOS-1.8.0-8/drmemory/logs/DrMemory-a.out.7726.000/results.txt
And the fixed version:
    ~~Dr.M~~ Dr. Memory version 1.8.0
    hi
    ~~Dr.M~~
    ~~Dr.M~~ NO ERRORS FOUND:
    ~~Dr.M~~ 0 unique, 0 total unaddressable access(es)
    ~~Dr.M~~ 0 unique, 0 total uninitialized access(es)
    ~~Dr.M~~ 0 unique, 0 total invalid heap argument(s)
    ~~Dr.M~~ 0 unique, 0 total warning(s)
    ~~Dr.M~~ 0 unique, 0 total, 0 byte(s) of leak(s)
    ~~Dr.M~~ 0 unique, 0 total, 0 byte(s) of possible leak(s)
    ~~Dr.M~~ Details: /DrMemory-MacOS-1.8.0-8/drmemory/logs/DrMemory-a.out.7762.000/results.txt
Note: Dr. Memory on Windows with the Visual Studio compiler may not report a mismatched free() / delete / delete [] error (e.g., line 10 of the sample code above). This may happen if optimizations are enabled and the objects stored in the array are simple and do not have their own dynamically-allocated memory that lead to their own indirect memory leaks.
6.11 Using Valgrind
http://valgrind.org/
And this is how Valgrind reports the same errors:
    ==31226== Memcheck, a memory error detector
    ==31226== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
    ==31226== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
    ==31226== Command: ./a.out
    ==31226==
    ==31226== Conditional jump or move depends on uninitialised value(s)
    ==31226==    at 0x40096F: main (memory_debugger_test.cpp:6)
    ==31226==
    hi
    ==31226== Invalid write of size 4
    ==31226==    at 0x4009A3: main (memory_debugger_test.cpp:9)
    ==31226==  Address 0x4c3f09c is 0 bytes after a block of size 12 alloc'd
    ==31226==    at 0x4A0700A: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31226==    by 0x400996: main (memory_debugger_test.cpp:8)
    ==31226==
    ==31226== Mismatched free() / delete / delete []
    ==31226==    at 0x4A07991: operator delete(void*) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31226==    by 0x4009B4: main (memory_debugger_test.cpp:10)
    ==31226==  Address 0x4c3f090 is 0 bytes inside a block of size 12 alloc'd
    ==31226==    at 0x4A0700A: operator new[](unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31226==    by 0x400996: main (memory_debugger_test.cpp:8)
    ==31226==
    ==31226== HEAP SUMMARY:
    ==31226==     in use at exit: 4 bytes in 1 blocks
    ==31226==   total heap usage: 2 allocs, 1 frees, 16 bytes allocated
    ==31226==
    ==31226== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
    ==31226==    at 0x4A06965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31226==    by 0x400961: main (memory_debugger_test.cpp:5)
    ==31226==
    ==31226== LEAK SUMMARY:
    ==31226==    definitely lost: 4 bytes in 1 blocks
    ==31226==    indirectly lost: 0 bytes in 0 blocks
    ==31226==      possibly lost: 0 bytes in 0 blocks
    ==31226==    still reachable: 0 bytes in 0 blocks
    ==31226==         suppressed: 0 bytes in 0 blocks
    ==31226==
    ==31226== For counts of detected and suppressed errors, rerun with: -v
    ==31226== Use --track-origins=yes to see where uninitialised values come from
    ==31226== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 2 from 2)
And here’s what it looks like after fixing those bugs:
    ==31252== Memcheck, a memory error detector
    ==31252== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
    ==31252== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
    ==31252== Command: ./a.out
    ==31252==
    hi
    ==31252==
    ==31252== HEAP SUMMARY:
    ==31252==     in use at exit: 0 bytes in 0 blocks
    ==31252==   total heap usage: 2 allocs, 2 frees, 16 bytes allocated
    ==31252==
    ==31252== All heap blocks were freed -- no leaks are possible
    ==31252==
    ==31252== For counts of detected and suppressed errors, rerun with: -v
    ==31252== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
6.12 How to use a memory debugger
• Detailed instructions on installation & use of these tools are available here: http://www.cs.rpi.edu/academics/courses/spring17/ds/memory_debugging.php
• Memory errors (uninitialized memory, out-of-bounds read/write, use after free) may cause seg faults, crashes, or strange output.
• Memory leaks on the other hand will never cause incorrect output, but your program will be inefficient and hog system resources. A program with a memory leak may waste so much memory it causes all programs on the system to slow down significantly or it may crash the program or the whole operating system if the system runs out of memory (this takes a while on modern computers with lots of RAM & harddrive space).
• For HW3, the homework submission server will be configured to run your code with Dr. Memory to search for memory problems and present the output with the submission results. For full credit your program must be memory error and memory leak free!
• A program that seems to run perfectly fine on one computer may still have significant memory errors. Running a memory debugger will help find issues that might break your homework on another computer or when submitted to the homework server.
• Important Note: When these tools find a memory leak, they point to the line of code where this memory was allocated. These tools do not understand the program logic and thus obviously cannot tell us where it should have been deleted.
• A final note: STL and other 3rd party libraries are highly optimized and sometimes do sneaky but correct and bug-free tricks for efficiency that confuse the memory debugger. For example, because the STL string class uses its own allocator, there may be a warning about memory that is “still reachable” even though you’ve deleted all your dynamically allocated memory. The memory debuggers have automatic suppressions for some of these known “false positives”, so you will see this listed as a “suppressed leak”. So don’t worry if you see those messages.
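The notes show the debugger output for a fixed version of the sample buggy program from Section 6.9 but not the fixed code itself. The sketch below is one possible repair, not necessarily the one used in lecture; it is wrapped in a function with the hypothetical name run_fixed_sample, and the choice to initialize *p to 0 and to write to a[2] is an assumption.

```cpp
#include <iostream>

// One possible repair of the four reported problems.
int run_fixed_sample() {
    int *p = new int;
    *p = 0;                       // initialize before reading (uninitialized read)
    if (*p != 10) std::cout << "hi" << std::endl;

    int *a = new int[3];
    a[2] = 12;                    // valid indices are 0..2 (write beyond heap bounds)
    int last = a[2];

    delete [] a;                  // new [] pairs with delete [] (mismatched delete)
    delete p;                     // release the single int (leak of 4 bytes)
    return last;
}
```

This version makes 2 allocations and 2 frees, which is consistent with the heap summary that Valgrind prints for the fixed program.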
6.13 Diagramming Memory Exercises
• Draw a diagram of the heap and stack memory for each segment of code below. Use a “?” to indicate that the value of the memory is uninitialized. Indicate whether there are any errors or memory leaks during execution of this code.
    class Foo {
    public:
      double x;
      int* y;
    };
    Foo a;
    a.x = 3.14159;
    Foo *b = new Foo;
    (*b).y = new int[2];
    Foo *c = b;
    a.y = b->y;
    c->y[1] = 7;
    b = NULL;

    int a[5] = { 10, 11, 12, 13, 14 };
    int *b = a + 2;
    *b = 7;
    int *c = new int[3];
    c[0] = b[0];
    c[1] = b[1];
    c = &(a[3]);
• Write code to produce this diagram:

    stack                        heap
    a: 4.2  8.6  2.9
    b: --------------------->    6.5  5.1  3.4

6.14
Solutions to Diagramming Memory Exercises
• There is a memory leak of 3 ints in this program.
• [Diagram: on the stack, the array a holds 10, 11, 7 (overwriting 12), 13, 14, int *b points at a[2], and int *c points at a[3]; on the heap, an array of three ints holds 7, 13, ?.]
• Code to produce the diagram:

    double a[3];
    double *b = new double[3];
    a[0] = 4.2; a[1] = 8.6; a[2] = 2.9;
    b[0] = 6.5; b[1] = 5.1; b[2] = 3.4;
CSCI-1200 Data Structures — Spring 2017
Lecture 7 — Order Notation & Basic Recursion
Review from Lectures 5 & 6
• Arrays and pointers, Pointer arithmetic and dereferencing
• Different types of memory (“automatic”, static, dynamic)
• Dynamic allocation of arrays
• Drawing pictures to explain stack vs. heap memory allocation
• Memory debugging
Today’s Lecture
• Algorithm Analysis
• Formal Definition of Order Notation
• Simple recursion
• Visualization of recursion
• Iteration vs. Recursion
• “Rules” for writing recursive functions.
• Lots of examples!
7.1
Algorithm Analysis
Why should we bother?
• We want to do better than just implementing and testing every idea we have.
• We want to know why one algorithm is better than another.
• We want to know the best we can do. (This is often quite hard.)
How do we do it? There are several options, including:
• Don’t do any analysis; just use the first algorithm you can think of that works.
• Implement and time algorithms to choose the best.
• Analyze algorithms by counting operations while assigning different weights to different types of operations based on how long each takes.
• Analyze algorithms by assuming each operation requires the same amount of time. Count the total number of operations, and then multiply this count by the average cost of an operation.
7.2
Exercise: Counting Example
• Suppose arr is an array of n doubles. Here is a simple fragment of code to sum the values in the array:

    double sum = 0;
    for (int i=0; i<n; ++i)
      sum += arr[i];

• What is the total number of operations performed in executing this fragment? Come up with a function describing the number of operations in terms of n.
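One way to approach the counting question is to instrument the loop and tally each operation as it happens. The sketch below does this under an assumed convention: every assignment, comparison, and increment costs one operation. The names OpCount and count_sum_ops are hypothetical, and a different convention would change the constants but not the dependence on n.

```cpp
#include <cstddef>
#include <vector>

// Tallies of the individual operations executed by the summing loop.
struct OpCount {
  long assignments = 0;  // sum = 0, i = 0, and each sum += arr[i]
  long comparisons = 0;  // each evaluation of i < n
  long increments = 0;   // each ++i
  long total() const { return assignments + comparisons + increments; }
};

// Runs the summing loop over arr while counting its operations.
OpCount count_sum_ops(const std::vector<double>& arr) {
  OpCount ops;
  const std::size_t n = arr.size();
  double sum = 0;     ++ops.assignments;
  std::size_t i = 0;  ++ops.assignments;
  while (true) {
    ++ops.comparisons;            // the loop test i < n
    if (!(i < n)) break;
    sum += arr[i];  ++ops.assignments;
    ++i;            ++ops.increments;
  }
  (void)sum;  // the sum itself is not needed here, only the counts
  return ops;
}
```

Under this convention an array of 10 values gives 33 operations, which matches the function 3n + 3.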
7.3
Exercise: Which Algorithm is Best?
A venture capitalist is trying to decide which of 3 startup companies to invest in and has asked for your help. Here’s the timing data for their prototype software on some different size test cases:

    n       foo-a       foo-b       foo-c
    10      10 u-sec    5 u-sec     1 u-sec
    20      13 u-sec    10 u-sec    8 u-sec
    30      15 u-sec    15 u-sec    27 u-sec
    100     20 u-sec    50 u-sec    1000 u-sec
    1000    ?           ?           ?

Which company has the “best” algorithm?
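One way to reason about the missing row is to guess a growth function for each column and extrapolate. The three models below are assumptions fitted by eye to the table (they are not given in the handout, and other fits are possible); the function names are hypothetical.

```cpp
#include <cmath>

// Hypothetical models for the timing table, all in u-sec.
// foo-a: roughly logarithmic growth (10, 13, 15, 20 for n = 10, 20, 30, 100).
double model_foo_a(double n) { return 10.0 * std::log10(n); }
// foo-b: linear growth, n / 2.
double model_foo_b(double n) { return n / 2.0; }
// foo-c: cubic growth, n^3 / 1000.
double model_foo_c(double n) { return n * n * n / 1000.0; }
```

If these guesses are right, the n = 1000 row would be about 30, 500, and 1,000,000 u-sec, so the algorithm that is slowest on the smallest test case scales best.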
7.4
Order Notation Definition
In this course we will focus on the intuition of order notation. This topic will be covered again, in more depth, in later computer science courses.
• Definition: Algorithm $A$ is order $f(n)$, denoted $O(f(n))$, if constants $k$ and $n_0$ exist such that $A$ requires no more than $k \cdot f(n)$ time units (operations) to solve a problem of size $n \geq n_0$.
• For example, algorithms requiring $3n + 2$, $5n - 3$, and $14 + 17n$ operations are all $O(n)$. This is because we can select values for $k$ and $n_0$ such that the definition above holds. (What values?) Likewise, algorithms requiring $n^2/10 + 15n - 3$ and $10000 + 35n^2$ are all $O(n^2)$.
• Intuitively, we determine the order by finding the asymptotically dominant term (function of $n$) and throwing out the leading constant. This term could involve logarithmic or exponential functions of $n$. Implications for analysis:
  – We don’t need to quibble about small differences in the numbers of operations.
  – We also do not need to worry about the different costs of different types of operations.
  – We don’t produce an actual time. We just obtain a rough count of the number of operations. This count is used for comparison purposes.
• In practice, this makes analysis relatively simple, quick and (sometimes unfortunately) rough.
7.5
Common Orders of Magnitude
• $O(1)$, a.k.a. CONSTANT: The number of operations is independent of the size of the problem. e.g., compute quadratic root.
• $O(\log n)$, a.k.a. LOGARITHMIC. e.g., dictionary lookup, binary search.
• $O(n)$, a.k.a. LINEAR. e.g., sum up a list.
• $O(n \log n)$, e.g., sorting.
• $O(n^2)$, $O(n^3)$, $O(n^k)$, a.k.a. POLYNOMIAL. e.g., find closest pair of points.
• $O(2^n)$, $O(k^n)$, a.k.a. EXPONENTIAL. e.g., Fibonacci, playing chess.
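The definition in 7.4 asks for constants $k$ and $n_0$. A small sketch can test a candidate pair over a finite range: it cannot prove the bound for all $n$, but it quickly rejects a bad choice. The helper bound_holds and the particular constants tried below are assumptions made for illustration; many other choices also work.

```cpp
// Returns true if ops(n) <= k * f(n) for every n in [n0, n_max].
// A finite check like this can support a choice of k and n0 but not prove it.
bool bound_holds(long (*ops)(long), long (*f)(long), long k, long n0, long n_max) {
  for (long n = n0; n <= n_max; ++n) {
    if (ops(n) > k * f(n)) return false;
  }
  return true;
}

long ops_a(long n) { return 3 * n + 2; }    // an algorithm needing 3n + 2 operations
long ops_b(long n) { return 14 + 17 * n; }  // an algorithm needing 14 + 17n operations
long linear(long n) { return n; }           // f(n) = n
```

For example, bound_holds(ops_a, linear, 4, 2, 100000) is true because $3n + 2 \leq 4n$ exactly when $n \geq 2$, while starting the same check at $n_0 = 1$ fails at $n = 1$ (5 operations versus a bound of 4).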
7.6
Exercise: A Slightly Harder Example
• Here’s an algorithm to determine if the value stored in variable x is also in an array called foo. Can you analyze it? What did you do about the if statement? What did you assume about where the value stored in x occurs in the array (if at all)?

    int loc=0;
    bool found = false;
    while (!found && loc < n) {
      if (x == foo[loc]) found = true;
      else loc++;
    }
    if (found) cout << "It is there!\n";
7.7
Best-Case, Average-Case and Worst-Case Analysis
• For a given fixed size array, we might want to know:
  – The fewest number of operations (best case) that might occur.
  – The average number of operations (average case) that will occur.
  – The maximum number of operations (worst case) that can occur.
• The last is the most common. The first is rarely used.
• On the previous algorithm, the best case is $O(1)$, but the average case and worst case are both $O(n)$.
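To make the best and worst cases concrete, the search loop from 7.6 can be instrumented to count its comparisons. This is a sketch under assumptions: it uses a std::vector of ints in place of the array foo, and count_comparisons is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// The linear search from 7.6, counting the x == foo[loc] comparisons.
// Returns the number of comparisons; found reports whether x was present.
int count_comparisons(const std::vector<int>& foo, int x, bool& found) {
  int comparisons = 0;
  std::size_t loc = 0;
  found = false;
  while (!found && loc < foo.size()) {
    ++comparisons;
    if (x == foo[loc]) found = true;
    else loc++;
  }
  return comparisons;
}
```

With six values stored, searching for the first value takes 1 comparison (best case), while searching for the last value or for a value that is absent takes 6 (worst case).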
7.8
Approaching An Analysis Problem
• Decide the important variable (or variables) that determine the “size” of the problem. For arrays and other “container classes” this will generally be the number of values stored.
• Decide what to count. The order notation helps us here.
  – If each loop iteration does a fixed (or bounded) amount of work, then we only need to count the number of loop iterations.
  – We might also count specific operations. For example, in the previous exercise, we could count the number of comparisons.
• Do the count and use order notation to describe the result.
7.9
Exercise: Order Notation
For each version below, give an order notation estimate of the number of operations as a function of n:

1.  int count=0;
    for (int i=0; i

2.  int count=0;
    for (int i=0; i

3.  int count=0;
    for (int i=0; i

7.10
Recursive Definitions of Factorials and Integer Exponentiation
• Factorial is defined for non-negative integers as

  $$ n! = \begin{cases} n \cdot (n-1)! & n > 0 \\ 1 & n = 0 \end{cases} $$

• Computing integer powers is defined as:

  $$ n^p = \begin{cases} n \cdot n^{p-1} & p > 0 \\ 1 & p = 0 \end{cases} $$

• These are both examples of recursive definitions.
7.11
Recursive C++ Functions
C++, like other modern programming languages, allows functions to call themselves. This gives a direct method of implementing recursive functions. Here are the recursive implementations of factorial and integer power:

    int fact(int n) {
      if (n == 0) {
        return 1;
      } else {
        int result = fact(n-1);
        return n * result;
      }
    }

    int intpow(int n, int p) {
      if (p == 0) {
        return 1;
      } else {
        return n * intpow( n, p-1 );
      }
    }
7.12
The Mechanism of Recursive Function Calls
• For each recursive call (or any function call), a program creates an activation record to keep track of:
  – Completely separate instances of the parameters and local variables for the newly-called function.
  – The location in the calling function code to return to when the newly-called function is complete. (Who asked for this function to be called? Who wants the answer?)
  – Which activation record to return to when the function is done. For recursive functions this can be confusing since there are multiple activation records waiting for an answer from the same function.
• This is illustrated in the following diagram of the call fact(4). Each box is an activation record, the solid lines indicate the function calls, and the dashed lines indicate the returns. Inside of each box we list the parameters and local variables and make notes about the computation.

    tmp = fact(4)                                          (receives 24)
      fact(4):  n=4   result = fact(3)   return 4*6        (returns 24)
        fact(3):  n=3   result = fact(2)   return 3*2      (returns 6)
          fact(2):  n=2   result = fact(1)   return 2*1    (returns 2)
            fact(1):  n=1   result = fact(0)   return 1*1  (returns 1)
              fact(0):  n=0   return 1                     (returns 1)

• This chain of activation records is stored in a special part of program memory called the stack.
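The growth and shrinking of the stack can also be observed by logging each call and return. The sketch below wraps the fact logic from 7.11; the log and depth parameters are hypothetical additions used only to record and indent the trace.

```cpp
#include <string>
#include <vector>

// fact from 7.11, logging when each activation record is created (call)
// and when it finishes (return). depth only controls the indentation.
int traced_fact(int n, std::vector<std::string>& log, int depth = 0) {
  const std::string indent(2 * depth, ' ');
  log.push_back(indent + "call fact(" + std::to_string(n) + ")");
  int answer;
  if (n == 0) {
    answer = 1;
  } else {
    int result = traced_fact(n - 1, log, depth + 1);
    answer = n * result;
  }
  log.push_back(indent + "return " + std::to_string(answer));
  return answer;
}
```

For traced_fact(4, log) the log holds five "call" lines with increasing indentation followed by five "return" lines with decreasing indentation, mirroring the nesting of the activation records.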
7.13
Iteration vs. Recursion
• Each of the above functions could also have been written using a for or while loop, i.e. iteratively. For example, here is an iterative version of factorial:

    int ifact(int n) {
      int result = 1;
      for (int i=1; i<=n; ++i)
        result = result * i;
      return result;
    }

• Often writing recursive functions is more natural than writing iterative functions, especially for a first draft of a problem implementation.
• You should learn how to recognize whether an implementation is recursive or iterative, and practice rewriting one version as the other.
• Note: We’ll see that not all recursive functions can be easily rewritten in iterative form!
• Note: The order notation for the number of operations for the recursive and iterative versions of an algorithm is usually the same. However in C, C++, Java, and some other languages, iterative functions are generally faster than their corresponding recursive functions. This is due to the overhead of the function call mechanism. Compiler optimizations will sometimes (but not always!) reduce the performance hit by automatically eliminating the recursive function calls. This is called tail call optimization.
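Tail call optimization applies when the recursive call is the very last thing a function does. The fact function from 7.11 multiplies after the call returns, so it is not in that form. The sketch below shows one common rewrite using an accumulator parameter, next to the loop it corresponds to; the names fact_tail and fact_loop are hypothetical, and whether a given compiler actually performs the optimization is not guaranteed by the C++ language.

```cpp
// Tail-recursive helper: the recursive call is the last action performed,
// so nothing remains to be done in this activation record after it returns.
int fact_tail(int n, int acc) {
  if (n == 0) return acc;
  return fact_tail(n - 1, acc * n);
}

// Driver that supplies the initial accumulator value.
int fact_tail(int n) { return fact_tail(n, 1); }

// The loop that the tail-recursive version corresponds to.
int fact_loop(int n) {
  int acc = 1;
  while (n != 0) {
    acc = acc * n;
    n = n - 1;
  }
  return acc;
}
```

Each recursive call in fact_tail updates n and acc in the same way one trip through the while loop does, which is why the call can in principle be replaced by a jump back to the top of the function.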
7.14
Exercises
1. Draw a picture to illustrate the activation records for the function call

    cout << intpow(4, 4) << endl;

2. Write an iterative version of intpow.
7.15
Rules for Writing Recursive Functions
Here is an outline of five steps that are useful in writing and debugging recursive functions. Note: You don’t have to do them in exactly this order...
1. Handle the base case(s).
2. Define the problem solution in terms of smaller instances of the problem. Use wishful thinking, i.e., if someone else solves the problem of fact(4) I can extend that solution to solve fact(5). This defines the necessary recursive calls. It is also the hardest part!
3. Figure out what work needs to be done before making the recursive call(s).
4. Figure out what work needs to be done after the recursive call(s) complete(s) to finish the computation. (What are you going to do with the result of the recursive call?)
5. Assume the recursive calls work correctly, but make sure they are progressing toward the base case(s)!
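As an illustration of the five steps, here is a sketch of a recursive function that sums the values in a vector, with each step marked in the comments. The functions sum_from and sum are hypothetical examples chosen for this note, not code from the handout.

```cpp
#include <cstddef>
#include <vector>

// Recursive helper: sums v[i], v[i+1], ..., up to the end of the vector.
int sum_from(const std::vector<int>& v, std::size_t i) {
  // Step 1: base case, nothing left to add.
  if (i >= v.size()) return 0;
  // Step 3: work before the recursive call (remember the current value).
  int current = v[i];
  // Step 2: wishful thinking, assume the rest of the vector gets summed for us.
  // Step 5: i+1 is closer to v.size(), so each call moves toward the base case.
  int rest = sum_from(v, i + 1);
  // Step 4: work after the recursive call (combine the two pieces).
  return current + rest;
}

// Driver function that starts the recursion at index 0.
int sum(const std::vector<int>& v) { return sum_from(v, 0); }
```

For a vector holding 3, 5, 11, 17 the call sum(v) returns 36, and for an empty vector the base case alone produces 0.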
7.16
Recursion Example: Printing the Contents of a Vector
• Here is a function to print the contents of a vector. Actually, it’s two functions: a driver function, and a true recursive function. It is common to have a driver function that just initializes the first recursive function call.

    void print_vec(std::vector<int>& v, unsigned int i) {
      if (i < v.size()) {
        cout << i << ": " << v[i] << endl;
        print_vec(v, i+1);
      }
    }
    void print_vec(std::vector<int>& v) {
      print_vec(v, 0);
    }

• Exercise: What will this print when called in the following code?

    int main() {
      std::vector<int> a;
      a.push_back(3); a.push_back(5); a.push_back(11); a.push_back(17);
      print_vec(a);
    }

• Exercise: How can you change the second print_vec function as little as possible so that this code prints the contents of the vector in reverse order?
7.17
Binary Search
• Suppose you have a std::vector<T> v (for a placeholder type T), sorted so that:

    v[0] <= v[1] <= v[2] <= ...

• Now suppose that you want to find if a particular value x is in the vector somewhere. How can you do this without looking at every value in the vector?
• The solution is a recursive algorithm called binary search, based on the idea of checking the middle item of the search interval within the vector and then looking either in the lower half or the upper half of the vector, depending on the result of the comparison.

    template <class T>
    bool binsearch(const std::vector<T> &v, int low, int high, const T &x) {
      if (high == low) return x == v[low];
      int mid = (low+high) / 2;
      if (x <= v[mid])
        return binsearch(v, low, mid, x);
      else
        return binsearch(v, mid+1, high, x);
    }
    template <class T>
    bool binsearch(const std::vector<T> &v, const T &x) {
      return binsearch(v, 0, v.size()-1, x);
    }
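The following self-contained sketch repeats the two binsearch functions so they can be compiled and tried on a small sorted vector. One change is an assumption of this sketch rather than part of the handout: the driver returns false for an empty vector, since v.size()-1 is not a valid index in that case.

```cpp
#include <vector>

// Recursive search over the interval v[low..high].
template <class T>
bool binsearch(const std::vector<T>& v, int low, int high, const T& x) {
  if (high == low) return x == v[low];
  int mid = (low + high) / 2;
  if (x <= v[mid])
    return binsearch(v, low, mid, x);       // x can only be in the lower half
  else
    return binsearch(v, mid + 1, high, x);  // x can only be in the upper half
}

// Driver. The empty() check is an addition made for this sketch.
template <class T>
bool binsearch(const std::vector<T>& v, const T& x) {
  if (v.empty()) return false;
  return binsearch(v, 0, static_cast<int>(v.size()) - 1, x);
}
```

With v holding 3, 5, 11, 17, 23, the call binsearch(v, 11) is true and binsearch(v, 4) is false. Searching for 4 narrows the interval from [0,4] to [0,2], [0,1], and finally [1,1], so only a handful of values are examined.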
7.18
Exercises
1. Write a non-recursive version of binary search.
2. If we replaced the if-else structure inside the recursive binsearch function (above) with

    if ( x < v[mid] )
      return binsearch( v, low, mid-1, x );
    else
      return binsearch( v, mid, high, x );

   would the function still work correctly?
CSCI-1200 Data Structures — Spring 2017
Lecture 8 — Templated Classes & Vector Implementation
Review from Lecture 7
• Algorithm Analysis, Formal Definition of Order Notation
• Simple recursion, Visualization of recursion, Iteration vs. Recursion, “Rules” for writing recursive functions.
• Lots of examples!
8.1
Today’s Lecture
• Designing our own container classes:
  – Mimic the interface of standard library (STL) containers
  – Study the design of memory management.
  – Move toward eventually designing our own, more sophisticated classes.
• Vector implementation
• Templated classes (including compilation and instantiation of templated classes)
• Copy constructors, assignment operators, and destructors
Optional Reading: Ford & Topp, Sections 5.3-5.5; Koenig & Moo Chapter 11
8.2
Vector Public Interface
• In creating our own version of the STL vector class, we will start by considering the public interface:

    public:
      // MEMBER FUNCTIONS AND OTHER OPERATORS
      T& operator[] (size_type i);
      const T& operator[] (size_type i) const;
      void push_back(const T& t);
      void resize(size_type n, const T& fill_in_value = T());
      void clear();
      bool empty() const;
      size_type size() const;

• To implement our own generic (a.k.a. templated) vector class, we will implement all of these operations, manipulate the underlying representation, and discuss memory management.
8.3
Templated Class Declarations and Member Function Definitions
• In terms of the layout of the code in vec.h (pages 5 & 6 of the handout), the biggest difference is that this is a templated class. The keyword template and the template type name must appear before the class declaration:

    template <class T> class Vec

• Within the class declaration, T is used as a type and all member functions are said to be “templated over type T”. In the actual text of the code files, templated member functions are often defined (written) inside the class declaration.
• The templated functions defined outside the template class declaration must be preceded by the phrase template <class T>, and then when Vec is referred to it must be as Vec<T>. For example, for member function create (two versions), we write:

    template <class T> void Vec<T>::create(...

8.4
Syntax and Compilation
• Templated classes and templated member functions are not created/compiled/instantiated until they are needed. Compilation of the class declaration is triggered by a line of the form:

    Vec<int> v1;

  with int replacing T. This also compiles the default constructor for Vec<int> because it is used here. Other member functions are not compiled unless they are used.
• When a different type is used with Vec, for example in the declaration:

    Vec<double> z;

  the template declaration is compiled again, this time with double replacing T instead of int. Again, however, only the member functions used are compiled.
• This is very different from ordinary classes, which are usually compiled separately and all functions are compiled regardless of whether or not they are needed.
• The templated class declaration and the code for all used member functions must be provided where they are used. As a result, member function definitions are often included within the class declaration or defined outside of the class declaration but still in the .h file. If member function definitions are placed in a separate .cpp file, this file must be #include-d, just like the .h file, because the compiler needs to see it in order to generate code. (Normally we don’t #include .cpp files!) See also diagram on page 7 of this handout.
• Note: Including function definitions in the .h for ordinary non-templated classes may lead to compilation errors about functions being “multiply defined”. Some of you have already seen these errors.
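The point that only the member functions actually used are compiled can be seen with a small experiment. Box below is a hypothetical templated class written for this note (it is not part of vec.h): its length() member only makes sense when T has a size() member function, yet Box<int> still compiles as long as length() is never called on it.

```cpp
#include <string>

// Hypothetical templated class for illustration.
template <class T>
class Box {
public:
  Box(const T& v) : value_(v) {}
  const T& get() const { return value_; }
  // Only valid when T has a size() member function (e.g. std::string).
  // For Box<int> this body would be an error, but it is not compiled
  // unless some code actually calls length() on a Box<int>.
  unsigned int length() const { return static_cast<unsigned int>(value_.size()); }
private:
  T value_;
};

int use_boxes() {
  Box<int> a(7);                 // compiles Box<int> and its constructor
  Box<std::string> b("hello");   // compiles the class again with std::string
  // a.length();                 // uncommenting this line is a compile error
  return a.get() + static_cast<int>(b.length());  // 7 + 5
}
```

An ordinary (non-templated) class with a member function body that does not compile would be rejected immediately, whether or not the function is ever called.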
8.5
Member Variables
Now, looking inside the Vec class at the member variables:
• m_data is a pointer to the start of the array (after it has been allocated). Recall the close relationship between pointers and arrays.
• m_size indicates the number of locations currently in use in the vector. This is exactly what the size() member function should return.
• m_alloc is the total number of slots in the dynamically allocated block of memory.
Drawing pictures, which we will do in class, will help clarify this, especially the distinction between m_size and m_alloc.
8.6
Typedefs
• Several types are created through typedef statements in the first public area of Vec. Once created the names are used as ordinary type names. For example Vec<T>::size_type is the return type of the size() function, defined here as an unsigned int.
8.7
operator[]
• Access to the individual locations of a Vec is provided through operator[]. Syntactically, use of this operator is translated by the compiler into a call to a function called operator[]. For example, if v is a Vec, then:

    v[i] = 5;

  translates into:

    v.operator[](i) = 5;

• In most classes there are two versions of operator[]:
  – A non-const version returns a reference to m_data[i]. This is applied to non-const Vec objects.
  – A const version is the one called for const Vec objects. This also returns m_data[i], but as a const reference, so it can not be modified.
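The two versions of operator[] can be tried out on a much smaller class. Triple below is a hypothetical fixed-size container written for this note (not the Vec class); the functions read_middle and demo are likewise only for illustration.

```cpp
// Hypothetical fixed-size container for illustration.
class Triple {
public:
  Triple() { for (int i = 0; i < 3; ++i) m_data[i] = 0; }
  // Non-const version: returns a reference that can be assigned through.
  int& operator[](unsigned int i) { return m_data[i]; }
  // Const version: chosen for const objects; the reference is read-only.
  const int& operator[](unsigned int i) const { return m_data[i]; }
private:
  int m_data[3];
};

int read_middle(const Triple& t) {
  // t[1] = 9;      // would not compile: t is const
  return t[1];      // calls the const version
}

int demo() {
  Triple t;
  t[1] = 5;                      // same as t.operator[](1) = 5;
  t.operator[](2) = 7;           // the explicit call syntax also works
  return read_middle(t) + t[2];  // 5 + 7
}
```

The compiler picks between the two versions based only on whether the object is const, which is why both can share the same name and parameter list.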
8.8
Default Versions of Assignment Operator and Copy Constructor Are Dangerous!
• Before we write the copy constructor and the assignment operator, we consider what would happen if we didn’t write them.
• C++ compilers provide default versions of these if they are not provided. These defaults just copy the values of the member variables, one-by-one. For example, the default copy constructor would look like this:

    template <class T>
    Vec<T>::Vec(const Vec<T>& v)
      : m_data(v.m_data), m_size(v.m_size), m_alloc(v.m_alloc) {}

In other words, it would construct each member variable from the corresponding member variable of v. This is dangerous and incorrect behavior for the Vec class. We don’t want to just copy the m_data pointer. We really want to create a copy of the entire array! Let’s look at this more closely...
8.9
Exercise
Suppose we used the default version of the assignment operator and copy constructor in our Vec class. What would be the output of the following program? Assume all of the operations except the copy constructor behave as they would with a std::vector<double>.

    Vec<double> v(4, 0.0);
    v[0] = 13.1;
    v[2] = 3.14;
    Vec<double> u(v);
    u[2] = 6.5;
    u[3] = -4.8;
    for (unsigned int i=0; i<4; ++i)
      cout << u[i] << " " << v[i] << endl;

Explain what happens by drawing a picture of the memory of both u and v.
8.10
Classes With Dynamically Allocated Memory
• For Vec (and other classes with dynamically-allocated memory) to work correctly, each object must do its own dynamic memory allocation and deallocation. We must be careful to keep the memory of each object instance separate from all others.
• All dynamically-allocated memory for an object should be released when the object is finished with it or when the object itself goes out of scope (through what’s called a destructor).
• To prevent the creation and use of default versions of these operations, we must write our own:
  – Copy constructor
  – Assignment operator
  – Destructor
8.11
The “this” pointer
• All class objects have a special pointer defined called this which simply points to the current class object, and it may not be changed.
• The expression *this is a reference to the class object.
• The this pointer is used in several ways:
  – Make it clear when member variables of the current object are being used.
  – Check to see when an assignment is self-referencing.
  – Return a reference to the current object.
8.12
Copy Constructor
• This constructor must dynamically allocate any memory needed for the object being constructed, copy the contents of the memory of the passed object to this new memory, and set the values of the various member variables appropriately.
• Exercise: In our Vec class, the actual copying is done in a private member function called copy. Write the private member function copy.
8.13
Assignment Operator
• Assignment operators of the form:

    v1 = v2;

  are translated by the compiler as:

    v1.operator=(v2);

• Cascaded assignment operators of the form:

    v1 = v2 = v3;

  are translated by the compiler as:

    v1.operator=(v2.operator=(v3));

• Therefore, the value of the assignment operator (v2 = v3) must be suitable for input to a second assignment operator. This in turn means the result of an assignment operator ought to be a reference to an object.
• The implementation of an assignment operator usually takes on the same form for every class:
  – Do no real work if there is a self-assignment.
  – Otherwise, destroy the contents of the current object then copy the passed object, just as done by the copy constructor. In fact, it often makes sense to write a private helper function used by both the copy constructor and the assignment operator.
  – Return a reference to the (copied) current object, using the this pointer.
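The usual form of the assignment operator described above, together with a copy constructor and destructor that keep each object's memory separate, can be sketched on a simpler class. IntBuffer is a hypothetical class written for this note; it has no spare capacity (no m_alloc), so it is not a solution for the Vec class itself.

```cpp
// Hypothetical class for illustration: owns a dynamically allocated int array.
class IntBuffer {
public:
  IntBuffer(unsigned int n, int val) { create(n, val); }
  IntBuffer(const IntBuffer& other) { copy(other); }   // copy constructor
  IntBuffer& operator=(const IntBuffer& other) {       // assignment operator
    if (this != &other) {   // do no real work on self-assignment
      delete [] m_data;     // destroy the current contents
      copy(other);          // then copy, just as the copy constructor does
    }
    return *this;           // reference to the (copied) current object
  }
  ~IntBuffer() { delete [] m_data; }                   // destructor
  int& operator[](unsigned int i) { return m_data[i]; }
  unsigned int size() const { return m_size; }
private:
  void create(unsigned int n, int val) {
    m_size = n;
    m_data = new int[n];
    for (unsigned int i = 0; i < n; ++i) m_data[i] = val;
  }
  // Shared helper: allocate separate memory and copy the values across.
  void copy(const IntBuffer& other) {
    m_size = other.m_size;
    m_data = new int[m_size];
    for (unsigned int i = 0; i < m_size; ++i) m_data[i] = other.m_data[i];
  }
  int* m_data;
  unsigned int m_size;
};
```

Because operator= returns a reference to the current object, a cascaded assignment such as c = b = a works, and because each copy allocates its own array, changing b[0] afterwards leaves a[0] untouched.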
8.14
Destructor (the “constructor with a tilde/twiddle”)
• The destructor is called implicitly when an automatically-allocated object goes out of scope or a dynamically-allocated object is deleted. It should never be called explicitly!
• The destructor is responsible for deleting the dynamic memory “owned” by the class.
• The syntax of the function definition is a bit weird. The ~ has been used as a logic negation in other contexts.
8.15
Increasing the Size of the Vec
• push_back(T const& x) adds to the end of the array, increasing m_size by one T location. But what if the allocated array is full (m_size == m_alloc)?
  1. Allocate a new, larger array. The best strategy is generally to double the size of the current array. Why?
  2. If the array size was originally 0, doubling does nothing. We must be sure that the resulting size is at least 1.
  3. Then we need to copy the contents of the current array.
  4. Finally, we must delete the current array, make the m_data pointer point to the start of the new array, and adjust the m_size and m_alloc variables appropriately.
• Only when we are sure there is enough room in the array should we actually add the new object to the back of the array.
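One way to explore the "Why?" in step 1 is to count how many element copies the reallocations cause under different growth policies. The sketch below simulates only the bookkeeping (the two counters, no actual memory), comparing doubling against growing by one slot at a time; copies_when_growing is a hypothetical function, and the grow-by-one policy is included purely as a point of comparison.

```cpp
// Counts the element copies caused by reallocation while n values are
// pushed back one at a time, using either doubling or grow-by-one.
long copies_when_growing(long n, bool doubling) {
  long m_size = 0, m_alloc = 0, copies = 0;
  for (long i = 0; i < n; ++i) {
    if (m_size == m_alloc) {             // array is full: must reallocate
      long new_alloc = doubling ? 2 * m_alloc : m_alloc + 1;
      if (new_alloc < 1) new_alloc = 1;  // doubling 0 would stay at 0
      copies += m_size;                  // old contents move to the new array
      m_alloc = new_alloc;
    }
    ++m_size;                            // now there is room for the new value
  }
  return copies;
}
```

For 1024 push_back calls, doubling causes 1023 copies in total (about one per element), while growing by one slot causes 523,776 copies, which grows with the square of n.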
8.16
Exercises
• Finish the definition of Vec::push_back.
• Write the Vec::resize function.
8.17
Vec Declaration & Implementation (vec.h)

    #ifndef Vec_h_
    #define Vec_h_
    // Simple implementation of the vector class, revised from Koenig and Moo.  This
    // class is implemented using a dynamically allocated array (of templated type T).
    // We ensure that m_size is always <= m_alloc and when a push_back or resize
    // call would violate this condition, the data is copied to a larger array.

    template <class T> class Vec {
    public:
      // TYPEDEFS
      typedef unsigned int size_type;

      // CONSTRUCTORS, ASSIGNMENT OPERATOR, & DESTRUCTOR
      Vec() { this->create(); }
      Vec(size_type n, const T& t = T()) { this->create(n, t); }
      Vec(const Vec& v) { copy(v); }
      Vec& operator=(const Vec& v);
      ~Vec() { delete [] m_data; }

      // MEMBER FUNCTIONS AND OTHER OPERATORS
      T& operator[] (size_type i) { return m_data[i]; }
      const T& operator[] (size_type i) const { return m_data[i]; }
      void push_back(const T& t);
      void resize(size_type n, const T& fill_in_value = T());
      void clear() { delete [] m_data; create(); }
      bool empty() const { return m_size == 0; }
      size_type size() const { return m_size; }

    private:
      // PRIVATE MEMBER FUNCTIONS
      void create();
      void create(size_type n, const T& val);
      void copy(const Vec<T>& v);

      // REPRESENTATION
      T* m_data;          // Pointer to first location in the allocated array
      size_type m_size;   // Number of elements stored in the vector
      size_type m_alloc;  // Number of array locations allocated, m_size <= m_alloc
    };

    // Create an empty vector (null pointers everywhere).
    template <class T> void Vec<T>::create() {
      m_data = NULL;
      m_size = m_alloc = 0;  // No memory allocated yet
    }

    // Create a vector with size n, each location having the given value
    template <class T> void Vec<T>::create(size_type n, const T& val) {
      m_data = new T[n];
      m_size = m_alloc = n;
      for (size_type i = 0; i < m_size; i++) {
        m_data[i] = val;
      }
    }

    // Assign one vector to another, avoiding duplicate copying.
    template <class T> Vec<T>& Vec<T>::operator=(const Vec<T>& v) {
      if (this != &v) {
        delete [] m_data;
        this -> copy(v);
      }
      return *this;
    }

    // Create the vector as a copy of the given vector.
    template <class T> void Vec<T>::copy(const Vec<T>& v) {

    }

    // Add an element to the end, resize if necessary.
    template <class T> void Vec<T>::push_back(const T& val) {
      if (m_size == m_alloc) {
        // Allocate a larger array, and copy the old values

      }
      // Add the value at the last location and increment the bound
      m_data[m_size] = val;
      ++ m_size;
    }

    // If n is less than or equal to the current size, just change the size.  If n is
    // greater than the current size, the new slots must be filled in with the given value.
    // Re-allocation should occur only if necessary.  push_back should not be used.
    template <class T> void Vec<T>::resize(size_type n, const T& fill_in_value) {

    }

    #endif
8.18 File Organization & Compilation of Templated Classes
The diagram below shows the typical and suggested file organization for non-templated vs. templated classes. Common mistakes and the resulting compilation errors are noted.
[Figure: file organization for non-templated vs. templated classes. Non-templated: headers foo.h, bar.h, and baz.h; source files foo.cpp, bar.cpp, and main.cpp, each compiled to foo.o, bar.o, and main.o, then linked into the executable myprog.exe. Templated: vec.h, and lst.h with lst.hpp; naming conventions for template implementation files vary. Annotations mark common mistakes and the resulting compile and link errors.]
CSCI-1200 Data Structures — Spring 2017
Lecture 9 — Iterators & STL Lists

Review from Lecture 8
• Designing our own container classes
• Dynamically allocated memory in classes
• Copy constructors, assignment operators, and destructors
• Templated classes, Implementation of the DS Vec class, mimicking the STL vector class
HW3 Tips
• You must write the assignment operator, Matrix::operator=(const Matrix& other_matrix)
• When writing copy constructors and assignment operators, if there is dynamic memory involved, you must copy the values, not the pointers.
• Draw memory diagrams! Use small matrices (the SimpleTest() matrices are all small) so that you can draw out the details. Follow your code line by line.
• The homework assignment shows how the matrix data is organized in a double**. Which part(s) are on the stack and which are on the heap?
• If an assertion fails, your code will crash. This is by design. Find the line number of the assertion, and see what the assert was testing. Read the lines above it too.
• Use Dr. Memory or Valgrind to catch leaks and memory errors. Not fixing these can lead to problems all over.
• Let's consider quarter() of a 1x1 and of a 0x0 together.
Today
• Another vector operation: pop_back
• Erasing items from vectors is inefficient!
• Iterators and iterator operations
• STL lists are a different sequential container class.
• Returning references to member variables from member functions
• Vec iterator implementation

Optional Reading: Ford & Topp Ch 6; Koenig & Moo, Sections 5.1-5.5
9.1 Review: Constructors, Assignment Operator, and Destructor
From an old test: Match up the line of code with the function that is called. Each letter is used exactly once.

    Foo f1;               a) assignment operator
    Foo* f2;              b) destructor
    f2 = new Foo(f1);     c) copy constructor
    f1 = *f2;             d) default constructor
    delete f2;            e) none of the above
9.2 Another STL vector operation: pop_back
• We have seen how push_back adds a value to the end of a vector, increasing the size of the vector by 1. There is a corresponding function called pop_back, which removes the last item in a vector, reducing the size by 1.
• There are also vector functions called front and back which denote (and thereby provide access to) the first and last item in the vector, allowing them to be changed. For example:
    vector<int> a(5,1);   // a has 5 values, all 1
    a.pop_back();         // a now has 4 values
    a.front() = 3;        // equivalent to the statement, a[0] = 3;
    a.back() = -2;        // equivalent to the statement, a[a.size()-1] = -2;
9.3 Motivating Example: Course Enrollment and Waiting List
• This program maintains the class list and the waiting list for a single course. The program is structured to handle interactive input. Error checking ensures that the input is valid.
• Vectors store the enrolled students and the waiting students. The main work is done in the two functions enroll_student and remove_student.
• The invariant on the loop in the main function determines how these functions must behave.
9.4 Exercises
1. Write erase_from_vector. This function removes the value at index location i from a vector of strings. The size of the vector should be reduced by one when the function is finished.
    // Remove the value at index location i from a vector of strings. The
    // size of the vector should be reduced by one when the function is finished.
    void erase_from_vector(unsigned int i, vector<string>& v) {



    }
2. Give an order notation estimate of the average cost of erase_from_vector, pop_back, and push_back.
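One possible approach to both exercises, offered as a sketch rather than the official solution: shift every later element one slot toward the front, then drop the duplicated last slot with pop_back. The shifting loop is what makes the average cost of erase_from_vector O(n), while pop_back is O(1) and push_back is O(1) on average (it only occasionally pays for a reallocation). The sketch assumes i is a valid index.

```cpp
#include <string>
#include <vector>

// Sketch: remove the value at index i by shifting later values left.
// Assumes i < v.size(); an out-of-range i is not handled here.
void erase_from_vector(unsigned int i, std::vector<std::string>& v) {
  for (unsigned int j = i; j + 1 < v.size(); ++j) {
    v[j] = v[j + 1];   // each later element moves down one slot: O(n) copies
  }
  v.pop_back();        // discard the duplicated last slot: O(1)
}
```

Erasing from the front of a long vector therefore touches every element, which is the expense the next section sets out to avoid.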
9.5 What To Do About the Expense of Erasing From a Vector?
• When items are continually being inserted and removed, vectors are not a good choice for the container.
• Instead we need a different sequential container, called a list.
  – This has a "linked" structure that makes the cost of erasing independent of the size.
• We will move toward a list-based implementation of the program in two steps:
  – Rewriting our classlist_vec.cpp code in terms of iterator operations.
  – Replacing vectors with lists
9.6 Iterators
• Here's the definition (from Koenig & Moo). An iterator:
  – identifies a container and a specific element stored in the container,
  – lets us examine (and change, except for const iterators) the value stored at that element of the container,
  – provides operations for moving (the iterators) between elements in the container,
  – restricts the available operations in ways that correspond to what the container can handle efficiently.
• As we will see, iterators for different container classes have many operations in common. This often makes the switch between containers fairly straightforward from the programmer's viewpoint.
• Iterators in many ways are generalizations of pointers: many operators / operations defined for pointers are defined for iterators. You should use this to guide your beginning understanding and use of iterators.

9.7 Iterator Declarations and Operations
• Iterator types are declared by the container class. For example,
    vector<string>::iterator p;
    vector<string>::const_iterator q;
  defines two (uninitialized) iterator variables.
• The dereference operator is used to access the value stored at an element of the container. The code:
    p = enrolled.begin();
    *p = "012312";
  changes the first entry in the enrolled vector.
• The dereference operator is combined with the dot operator for accessing the member variables and member functions of elements stored in containers. Here's an example using the Student class and students vector from Lecture 4:
    vector<Student>::iterator i = students.begin();
    (*i).compute_averages(0.45);
  Notes:
  – This operation would be illegal if i had been defined as a const_iterator because compute_averages is a non-const member function.
  – The parentheses on the *i are required (because of operator precedence).
• There is a "syntactic sugar" for the combination of the dereference operator and the dot operator, which is exactly equivalent:
    vector<Student>::iterator i = students.begin();
    i->compute_averages(0.45);
• Just like pointers, iterators can be incremented and decremented using the ++ and -- operators to move to the next or previous element of any container.
• Iterators can be compared using the == and != operators.
• Iterators can be assigned, just like any other variable.
• Vector iterators have several additional operations:
  – Integer values may be added to them or subtracted from them. This leads to statements like
      enrolled.erase(enrolled.begin() + 5);
  – Vector iterators may be compared using operators like <, <=, etc.
  – For most containers (other than vectors), these "random access" iterator operations are not legal and therefore prevented by the compiler. The reasons will become clear as we look at their implementations.
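To make these operations concrete, here is a small sketch that puts several of them together. The two helper functions and their names are hypothetical, not part of the class list program; they only assume a vector of student id strings like the enrolled vector.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: count the ids that start with the character c,
// visiting each element through a const_iterator (read-only access).
int count_starting_with(const std::vector<std::string>& ids, char c) {
  int count = 0;
  for (std::vector<std::string>::const_iterator p = ids.begin(); p != ids.end(); ++p) {
    // p->empty() is shorthand for (*p).empty()
    if (!p->empty() && (*p)[0] == c) {
      ++count;
    }
  }
  return count;
}

// Hypothetical helper: overwrite the entry k positions past the start.
// Adding an integer to an iterator is a vector-only ("random access") operation.
// Assumes k < ids.size().
void replace_at(std::vector<std::string>& ids, unsigned int k, const std::string& id) {
  std::vector<std::string>::iterator p = ids.begin() + k;
  *p = id;   // dereference to change the stored value
}
```

The first function would still compile if the vector were swapped for a list; the second would not, because of the begin() + k arithmetic.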
9.8 Exercise: Revising the Class List Program to Use Iterators
• Now let's modify the class list program to use iterators. First rewrite the erase_from_vector to use iterators.
    void erase_from_vector(vector<string>::iterator itr,
                           vector<string>& v) {



    }
  Note: the STL vector class has a function that does just this... called erase!
• Now, edit the rest of the file to remove all use of the vector subscripting operator.

9.9 A New Datatype: The list Standard Library Container Class
• Lists are our second standard-library container class. (Vectors were the first.) Both lists & vectors store sequential data that can shrink or grow.
• However, the use of memory is fundamentally different. Vectors are formed as a single contiguous array-like block of memory. Lists are formed as a sequentially linked structure instead.
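[Figure: the values 7, 5, 8, 1, 9 stored two ways. array/vector: one contiguous block with slots indexed 0 through 4. list: a chain of separate nodes, one per value, linked together.]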
• Although the interface (functions called) of lists and vectors and their iterators are quite similar, their implementations are VERY different. Clues to these differences can be seen in the operations that are NOT in common, such as:
  – STL vectors / arrays allow "random-access" / indexing / [] subscripting. We can immediately jump to an arbitrary location within the vector / array.
  – STL lists have no subscripting operation (we can't use [] to access data). The only way to get to the middle of a list is to follow pointers one link at a time.
  – Lists have push_front and pop_front functions in addition to the push_back and pop_back functions of vectors.
  – erase and insert in the middle of the STL list is very efficient, independent of the size of the list. Both are implemented by rearranging pointers between the small blocks of memory. (We'll see this when we discuss the implementation details next week).
  – We can't use the same STL sort function we used for vector; we must use a special sort function defined by the STL list type.
      std::vector<int> my_vec;
      std::list<int> my_lst;
      // ... put some data in my_vec & my_lst
      std::sort(my_vec.begin(),my_vec.end(),optional_compare_function);
      my_lst.sort(optional_compare_function);
    Note: the STL list sort member function is just as efficient, O(n log n), and will also take the same optional compare function as STL vector.
  – Several operations invalidate the values of vector iterators, but not list iterators:
    ∗ erase invalidates all iterators at or after the point of erasure in vectors;
    ∗ push_back and resize can invalidate ALL iterators in a vector (this happens whenever the vector has to reallocate its array).
    The value of any associated vector iterator must be re-assigned / re-initialized after these operations.
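A short sketch of two of the differences listed above: the two sort calls, and the fact that a list iterator survives insertions elsewhere in the list. The function names (larger_first, sort_descending, value_after_edits) are made up for this illustration.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Comparison function usable by both std::sort and std::list::sort:
// orders integers from largest to smallest.
bool larger_first(int a, int b) { return a > b; }

// Sort a vector with the generic algorithm (needs random-access iterators).
void sort_descending(std::vector<int>& v) {
  std::sort(v.begin(), v.end(), larger_first);
}

// Sort a list with its own member function.
void sort_descending(std::list<int>& l) {
  l.sort(larger_first);
}

// A list iterator keeps referring to the same element while other
// elements are added around it. Assumes the list is not empty.
int value_after_edits(std::list<int>& l) {
  std::list<int>::iterator p = l.begin();  // refers to the first element
  int first = *p;
  l.push_front(first - 1);                 // new element goes in front of p's element
  l.push_back(first + 1);                  // p is still valid
  return *p;                               // same value as before the edits
}
```

Doing the same with a vector iterator would not be safe, since either insertion could force the vector to reallocate.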
9.10 Exercise: Revising the Class List Program to Use Lists (& Iterators)
Now let's further modify the program to use lists instead of vectors. Because we've already switched to iterators, this change will be relatively easy. And now the program will be more efficient!
9.11 Erase & Iterators
• STL lists and vectors each have a special member function called erase. In particular, given a list of ints s, consider the example:
    std::list<int>::iterator p = s.begin();
    ++p;
    std::list<int>::iterator q = s.erase(p);
• After the code above is executed:
  – The integer stored in the second entry of the list has been removed.
  – The size of the list has shrunk by one.
  – The iterator p does not refer to a valid entry.
  – The iterator q refers to the item that was the third entry and is now the second.

[Figure: the list 7, 5, 8, 1, 9 before the erase, with p at the 5; and after the erase, holding 7, 8, 1, 9, with p no longer referring to a valid entry and q at the 8.]

• To reuse the iterator p and make it a valid entry, you will often see the code written:
    std::list<int>::iterator p = s.begin();
    ++p;
    p = s.erase(p);
• Even though the erase function has the same syntax for vectors and for list, the vector version is O(n), whereas the list version is O(1).
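The p = s.erase(p) idiom matters most inside a loop. The sketch below uses a hypothetical helper, remove_all (not part of the course code), to erase every occurrence of one value while stepping through a list.

```cpp
#include <list>

// Hypothetical helper: erase every occurrence of target from the list.
void remove_all(std::list<int>& s, int target) {
  std::list<int>::iterator p = s.begin();
  while (p != s.end()) {
    if (*p == target) {
      p = s.erase(p);   // p now refers to the element after the erased one
    } else {
      ++p;              // only advance when nothing was erased
    }
  }
}
```

Advancing p unconditionally would either skip the element that follows each erased one or, worse, increment an iterator that no longer refers to a valid entry.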
9.12 Insert
• Similarly, there is an insert function for STL lists that takes an iterator and a value and adds a link in the chain with the new value immediately before the item pointed to by the iterator. The call returns an iterator that points to the newly added element.
• Variants on the basic insert function are also defined.
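A minimal sketch of the basic insert call and its return value. The helper name insert_before_second is hypothetical and exists only for this illustration.

```cpp
#include <list>

// Hypothetical helper: insert value just before the second element and
// return what the returned iterator refers to.
// Assumes the list has at least one element.
int insert_before_second(std::list<int>& lst, int value) {
  std::list<int>::iterator p = lst.begin();
  ++p;                                       // second element, or end() if size is 1
  std::list<int>::iterator added = lst.insert(p, value);
  return *added;                             // the newly added element
}
```

Inserting before end() is allowed, so the same call also appends when the list holds a single element.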
9.13 Exercise: Using STL list Erase & Insert
Write a function that takes an STL list of integers, lst, and an integer, x. The function should 1) remove all negative numbers from the list, 2) verify that the remaining elements in the list are sorted in increasing order, and 3) insert x into the list such that the order is maintained.
9.14 Implementing Vec<T> Iterators
• Let's add iterators to our Vec<T> class declaration from last lecture:
    public:
      // TYPEDEFS
      typedef T* iterator;
      typedef const T* const_iterator;
      // MODIFIERS
      iterator erase(iterator p);
      // ITERATOR OPERATIONS
      iterator begin() { return m_data; }
      const_iterator begin() const { return m_data; }
      iterator end() { return m_data + m_size; }
      const_iterator end() const { return m_data + m_size; }
• First, remember that typedef statements create custom, alternate names for existing types. Vec<int>::iterator is an iterator type defined by the Vec<int> class. It is just a T* (an int*). Thus, internal to the declarations and member functions, T* and iterator may be used interchangeably.
• Because the underlying implementation of Vec uses an array, and because pointers are the "iterator"s of arrays, the implementation of vector iterators is quite simple. Note: the implementation of iterators for other STL containers is more involved!
• Thus, begin() returns a pointer to the first slot in the m_data array. And end() returns a pointer to the "slot" just beyond the last legal element in the m_data array (as prescribed in the STL standard).
• Furthermore, dereferencing a Vec<T>::iterator (dereferencing a pointer to type T) correctly returns one of the objects in the m_data, an object with type T.
• And similarly, the ++, --, <, ==, !=, >=, etc. operators on pointers automatically apply to Vec iterators.
• The erase function requires a bit more attention. We've implemented the core of this function above. The STL standard further specifies that the return value of erase is an iterator pointing to the new location of the element just after the one that was deleted.
• Finally, note that after a push_back or erase or resize call some or all iterators referring to elements in that vector may be invalidated. Why? You must take care when designing your program logic to avoid invalid iterator bugs!
CSCI-1200 Data Structures — Spring 2017
Lecture 10 — Vector Iterators & Linked Lists

Review from Lecture 9
• Explored a program to maintain a class enrollment list and an associated waiting list.
• Unfortunately, erasing items from the front or middle of vectors is inefficient.
• Iterators can be used to access elements of a vector
• Iterators and iterator operations (increment, decrement, erase, & insert)
• STL's list class
• Differences between indices and iterators, differences between STL list and STL vector.

Today's Class
• Quick review of iterators
• Implementation of iterators in our homemade Vec class (from Lecture 8)
• const and reference on return values
• Building our own basic linked lists:
  – Stepping through a list
  – Push back
  – ... & even more in the next couple lectures!
10.1 Review: Iterators and Iterator Operations
• An iterator type is defined by each STL container class. For example:
    std::vector<double>::iterator v_itr;
    std::list<std::string>::iterator l_itr;
    std::string::iterator s_itr;
• An iterator is assigned to a specific location in a container. For example:
    v_itr = vec.begin() + i;   // i-th location in a vector
    l_itr = lst.begin();       // first entry in a list
    s_itr = str.begin();       // first char of a string
  Note: We can add an integer to vector and string iterators, but not to list iterators.
• The contents of the specific entry referred to by an iterator are accessed using the * dereference operator: In the first and third lines, *v_itr and *l_itr are l-values. In the second, *s_itr is an r-value.
    *v_itr = 3.14;
    cout << *s_itr << endl;
    *l_itr = "Hello";
• Stepping through a container, either forward or backward, is done using increment (++) and decrement (--) operators:
    ++itr;    itr++;    --itr;    itr--;
  These operations move the iterator to the next and previous locations in the vector, list, or string. The operations do not change the contents of the container!
• Finally, we can change the container that a specific iterator is attached to as long as the types match. Thus, if v and w are both std::vector<double>, then the code:
    v_itr = v.begin();
    *v_itr = 3.14;          // changes 1st entry in v
    v_itr = w.begin() + 2;
    *v_itr = 2.78;          // changes 3rd entry in w
  works fine because v_itr is a std::vector<double>::iterator, but if a is a std::vector<std::string> then
    v_itr = a.begin();
  is a compile-time error because of a type clash!
10.2 Additional Iterator Operations for Vector (& String) Iterators
• Initialization at a random spot in the vector:
    v_itr = v.begin() + i;
• Jumping around inside the vector through addition and subtraction of location counts:
    v_itr = v_itr + 5;
  moves v_itr 5 locations further in the vector. These operations are constant time, O(1), for vectors.
• These operations are not allowed for list iterators (and most other iterators, for that matter) because of the way the corresponding containers are built. These operations would be linear time, O(n), for lists, where n is the number of slots jumped forward/backward. Thus, they are not provided by STL for lists.
• Students are often confused by the difference between iterators and indices for vectors. Consider the following declarations:
    std::vector<double> a(10, 2.5);
    std::vector<double>::iterator p = a.begin() + 5;
    unsigned int i=5;
• Iterator p refers to location 5 in vector a. The value stored there is directly accessed through the * operator:
    *p = 6.0;
    cout << *p << endl;
• The above code has changed the contents of vector a. Here's the equivalent code using subscripting:
    a[i] = 6.0;
    cout << a[i] << endl;
• Here's another common confusion:
    std::list<int> lst; lst.push_back(100); lst.push_back(200);
    lst.push_back(300); lst.push_back(400); lst.push_back(500);
    std::list<int>::iterator itr,itr2,itr3;
    itr = lst.begin();   // itr is pointing at the 100
    ++itr;               // itr is now pointing at 200
    *itr += 1;           // 200 becomes 201
    // itr += 1;         // does not compile! can't advance list iterator like this
    itr = lst.end();     // itr is pointing "one past the last legal value" of lst
    itr--;               // itr is now pointing at 500;
    itr2 = itr--;        // itr is now pointing at 400, itr2 is still pointing at 500
    itr3 = --itr;        // itr is now pointing at 300, itr3 is also pointing at 300

    // dangerous: decrementing the begin iterator is "undefined behavior"
    // (similarly, incrementing the end iterator is also undefined)
    // it may seem to work, but break later on this machine or on another machine!
    itr = lst.begin();
    itr--;               // dangerous!
    itr++;
    assert (*itr == 100);   // might seem ok... but rewrite the code to avoid this!
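When a list iterator really does need to move several positions, the standard library function std::advance (from the <iterator> header, not covered in these notes) does the stepping for you. The sketch below uses a hypothetical helper, value_at, to show it. For list iterators std::advance still follows one link at a time, so the cost is O(k), not O(1).

```cpp
#include <iterator>
#include <list>

// Hypothetical helper: return the value k positions past the start of a list.
// Assumes 0 <= k < lst.size(). std::advance walks one link at a time for list
// iterators, so this costs O(k), unlike the O(1) v.begin() + k for vectors.
int value_at(const std::list<int>& lst, int k) {
  std::list<int>::const_iterator itr = lst.begin();
  std::advance(itr, k);
  return *itr;
}
```

Calling value_at inside a loop over every position would cost O(n^2) overall, which is why stepping with ++ is the usual way to traverse a list.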
10.3 STL List: Erase (review) & Insert (skipped last time)
• The erase member function (for STL vector and STL list) takes in a single argument, an iterator pointing at an element in the container. It removes that item, and the function returns an iterator pointing at the element after the removed item.
• Similarly, there is an insert function for STL vector and STL list that takes in 2 arguments, an iterator and a new element, and adds that element immediately before the item pointed to by the iterator. The function returns an iterator pointing at the newly added element.
• Even though the erase and insert functions have the same syntax for vector and for list, the vector versions are O(n), whereas the list versions are O(1).
• Iterators positioned on an STL vector, at or after the point of an erase operation, are invalidated. Iterators positioned anywhere on an STL vector may be invalid after an insert (or push_back or resize) operation.
• Iterators attached to an STL list are not invalidated after an insert or erase (except iterators attached to the erased element!) or push_back/push_front.
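Two habits that sidestep invalid vector iterators, sketched with hypothetical helper functions (neither name comes from the course code): remember a position as an index when the vector may grow, and always continue from the iterator that erase returns.

```cpp
#include <vector>

// Hypothetical helper: append a copy of the largest value to the end of v.
// Assumes v is not empty.
void append_copy_of_largest(std::vector<int>& v) {
  // Remember the position as an index, not an iterator: the push_back
  // below may reallocate the array and invalidate every iterator into v.
  std::vector<int>::size_type best = 0;
  for (std::vector<int>::size_type i = 1; i < v.size(); ++i) {
    if (v[i] > v[best]) best = i;
  }
  int largest = v[best];   // copy the value out before modifying v
  v.push_back(largest);
}

// Hypothetical helper: erase the first element equal to target, if any, and
// report the value that now follows it (-1 if nothing follows or not found).
int erase_first(std::vector<int>& v, int target) {
  for (std::vector<int>::iterator p = v.begin(); p != v.end(); ++p) {
    if (*p == target) {
      p = v.erase(p);      // the old p is invalid; use the returned iterator
      return (p != v.end()) ? *p : -1;
    }
  }
  return -1;
}
```

With a list, neither precaution is needed for iterators that refer to other elements, which is one of the practical payoffs of the linked structure.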
10.4 Exercise: Using STL list Erase & Insert
Write a function that takes an STL list of integers, lst, and an integer, x. The function should 1) remove all negative numbers from the list, 2) verify that the remaining elements in the list are sorted in increasing order, and 3) insert x into the list such that the order is maintained.
10.5 Implementing Vec<T> Iterators
• Let's add iterators to our Vec<T> class declaration from Lecture 8:
    public:
      // TYPEDEFS
      typedef T* iterator;
      typedef const T* const_iterator;
      // MODIFIERS
      iterator erase(iterator p);
      // ITERATOR OPERATIONS
      iterator begin() { return m_data; }
      const_iterator begin() const { return m_data; }
      iterator end() { return m_data + m_size; }
      const_iterator end() const { return m_data + m_size; }
• First, remember that typedef statements create custom, alternate names for existing types. Vec<int>::iterator is an iterator type defined by the Vec<int> class. It is just a T* (an int*). Thus, internal to the declarations and member functions, T* and iterator may be used interchangeably.
• Because the underlying implementation of Vec uses an array, and because pointers are the "iterator"s of arrays, the implementation of vector iterators is quite simple. Note: the implementation of iterators for other STL containers is more involved! We'll see how STL list iterators work in a later lecture.
• Thus, begin() returns a pointer to the first slot in the m_data array. And end() returns a pointer to the "slot" just beyond the last legal element in the m_data array (as prescribed in the STL standard).
• Furthermore, dereferencing a Vec<T>::iterator (dereferencing a pointer to type T) correctly returns one of the objects in the m_data, an object with type T.
• And similarly, the ++, --, <, ==, !=, >=, etc. operators on pointers automatically apply to Vec iterators. We don't need to write any additional functions for iterators, since we get all of the necessary behavior from the underlying pointer implementation.
• The erase function requires a bit more attention. We've implemented a version of this function in the previous lecture. The STL standard further specifies that the return value of erase is an iterator pointing to the new location of the element just after the one that was deleted.
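To see how erase can meet that return-value requirement when iterators are plain pointers, here is a sketch built on a cut-down stand-in class. MiniVec is hypothetical and is not the course's Vec; it keeps only the members erase needs, and copying is deliberately disabled to keep the example short.

```cpp
#include <cstddef>

// A cut-down stand-in for the Vec class (hypothetical; only what erase needs).
template <class T>
class MiniVec {
public:
  typedef T* iterator;
  typedef std::size_t size_type;

  MiniVec(size_type n, const T& val) : m_data(new T[n]), m_size(n), m_alloc(n) {
    for (size_type i = 0; i < m_size; ++i) m_data[i] = val;
  }
  ~MiniVec() { delete [] m_data; }

  iterator begin() { return m_data; }
  iterator end() { return m_data + m_size; }
  size_type size() const { return m_size; }

  // Shift everything after p one slot toward the front, shrink by one,
  // and return p, which now refers to the element that followed the erased one.
  // Assumes p refers to a valid element (begin() <= p < end()).
  iterator erase(iterator p) {
    for (iterator q = p; q + 1 < end(); ++q) {
      *q = *(q + 1);
    }
    --m_size;
    return p;
  }

private:
  MiniVec(const MiniVec&);             // copying left out of this sketch
  MiniVec& operator=(const MiniVec&);
  T* m_data;
  size_type m_size;
  size_type m_alloc;
};
```

Because the array itself is not reallocated, the address p still lies inside it after the shift, so returning p unchanged satisfies the rule. Iterators past p now refer to different values, which is why they count as invalidated.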
10.6 References and Return Values
• A reference is an alias for another variable. For example:
    string a = "Tommy";
    string b = a;    // a new string is created using the string copy constructor
    string& c = a;   // c is an alias/reference to the string object a
    b[1] = 'i';
    cout << a << " " << b << " " << c << endl;   // outputs:  Tommy Timmy Tommy
    c[1] = 'a';
    cout << a << " " << b << " " << c << endl;   // outputs:  Tammy Timmy Tammy
  The reference variable c refers to the same string as variable a. Therefore, when we change c, we change a.
• Exactly the same thing occurs with reference parameters to functions and the return values of functions. Let's look at the Student class from Lecture 4 again:
    class Student {
    public:
      const string& first_name() const { return first_name_; }
      const string& last_name() const { return last_name_; }
    private:
      string first_name_;
      string last_name_;
    };
• In the main function we had a vector of students:
    vector<Student> students;
  Based on our discussion of references above and looking at the class declaration, what if we wrote the following? Would the code then be changing the internal contents of the i-th Student object?
    string & fname = students[i].first_name();
    fname[1] = 'i';
• The answer is NO! The Student class member function first_name returns a const reference. The compiler will complain that the above code is attempting to assign a const reference to a non-const reference variable.
• If we instead wrote the following, then the compiler would complain that you are trying to change a const object.
    const string & fname = students[i].first_name();
    fname[1] = 'i';
• Hence in both cases the Student class would be "safe" from attempts at external modification.
• However, the author of the Student class would get into trouble if the member function return type was only a reference, and not a const reference. Then external users could access and change the internal contents of an object! This is a bad idea in most cases.
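The trouble described in the last bullet can be reproduced with a small sketch. The Name class below is hypothetical (it is not the course's Student class), and its leaky() accessor exists only to show what a non-const reference return allows.

```cpp
#include <string>

// Hypothetical class for illustration only (not the course's Student class).
class Name {
public:
  Name(const std::string& n) : name_(n) {}
  // Safe accessor: callers can read but not modify the stored string.
  const std::string& get() const { return name_; }
  // Risky accessor: hands out a non-const reference to private data.
  std::string& leaky() { return name_; }
private:
  std::string name_;
};

// Shows that writing through the non-const reference changes the object itself.
// Assumes the stored name has at least 2 characters.
std::string change_through_reference(Name& n) {
  std::string& alias = n.leaky();  // alias refers to n's private member
  alias[1] = 'i';                  // modifies private data from outside the class
  return n.get();                  // the object's own data has changed
}
```

Replacing n.leaky() with n.get() in change_through_reference makes the function fail to compile, which is exactly the protection the const reference provides.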
10.7 Working towards our own version of the STL list
• Our discussion of how the STL list is implemented has been intuitive: it is a "chain" of objects.
• Now we will study the underlying mechanism: linked lists. This will allow us to build custom classes that mimic the STL list class, and add extensions and new features (more in the next couple lectures!).
10.8 Objects with Pointers, Linking Objects Together
• The two fundamental mechanisms of linked lists are:
  – creating objects with pointers as one of the member variables, and
  – making these pointers point to other objects of the same type.
• These mechanisms are illustrated in the following program:
    template <class T>
    class Node {
    public:
      T value;
      Node<T>* ptr;
    };

    int main() {
      Node<int>* ll;         // ll is a pointer to a (non-existent) Node
      ll = new Node<int>;    // Create a Node and assign its memory address to ll
      ll->value = 6;         // This is the same as (*ll).value = 6;
      ll->ptr = NULL;        // NULL == 0, which indicates a "null" pointer

      Node<int>* q = new Node<int>;
      q->value = 8;
      q->ptr = NULL;

      // set ll's ptr member variable to
      // point to the same thing as variable q
      ll->ptr = q;

      cout << "1st value: " << ll->value << "\n"
           << "2nd value: " << ll->ptr->value << endl;
    }

[Figure: ll points to a node with value 6; that node's ptr points to a second node, also pointed to by q, with value 8 and ptr NULL.]

10.9 Definition: A Linked List
• The definition is recursive: A linked list is either:
  – Empty, or
  – Contains a node storing a value and a pointer to a linked list.
• The first node in the linked list is called the head node and the pointer to this node is called the head pointer. The pointer's value will be stored in a variable called head.
10.10 Visualizing Linked Lists
[Figure: the head pointer variable points to a chain of four nodes, each with a value and a ptr member; the ptr of the last node is NULL.]
•
The head pointer variable is drawn with its own box. It is an individual variable. It is important to have a separate pointer to the first node, since the “first” node may change.
•
The objects (nodes) that have been dynamically allocated and stored in the linked lists are shown as boxes, with arrows drawn to represent pointers. – Note that this is a conceptual view only. The memory locations could be anywhere, and the actual values of the memory addresses aren’t usually meaningful.
•
The last node MUST have NULL for its pointer value; you will have all sorts of trouble if you don’t ensure this!
•
You should make a habit of drawing pictures of linked lists to figure out how to do the operations.
10.11 Basic Mechanisms: Stepping Through the List
•
We’d like to write a function to determine if a particular value, stored in x, is also in the list.
•
We can access the entire contents of the list, one step at a time, by starting just from the head pointer. – We will need a separate, local pointer variable to point to nodes in the list as we access them. – We will need a loop to step through the linked list (using the pointer variable) and a check on each value.
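The stepping pattern described above can be sketched with a small helper that counts the nodes in a chain. This is only an illustration: list_length is a hypothetical name that does not appear in the course code, and the Node class repeats the definition from section 10.8.

```cpp
#include <cstddef>

// Same Node class as in section 10.8.
template <class T>
class Node {
public:
  T value;
  Node<T>* ptr;
};

// Hypothetical helper (not from the lecture): count the nodes in a chain.
template <class T>
int list_length(Node<T>* head) {
  int count = 0;
  // p is a separate, local pointer; no new nodes are allocated while stepping.
  for (Node<T>* p = head; p != NULL; p = p->ptr) {
    ++count;
  }
  return count;
}
```

The loop stops when p becomes NULL, which is why the last node's pointer must always be NULL.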
10.12 Exercise: Write is_there
template <class T> bool is_there(Node<T>* head, const T& x) {
•
If the input linked list chain contains n elements, what is the order notation of is_there?
10.13 Basic Mechanisms: Pushing on the Back
•
Goal: place a new node at the end of the list.
•
We must step to the end of the linked list, remembering the pointer to the last node. – This is an O(n) operation and is a major drawback to the ordinary linked-list data structure we are discussing now. We will correct this drawback by creating a slightly more complicated linking structure in our next lecture.
•
We must create a new node and attach it to the end.
•
We must remember to update the head pointer variable’s value if the linked list is initially empty. – Hence, in writing the function, we must pass the pointer variable by reference.
10.14 Exercise: Write push_front
template <class T> void push_front(Node<T>*& head, T const& value) {
•
If the input linked list chain contains n elements, what is the order notation of the implementation of push_front?
10.15 Exercise: Write push_back
template <class T> void push_back(Node<T>*& head, T const& value) {
•
If the input linked list chain contains n elements, what is the order notation of this implementation of push_back?
10.16 Next time...
Can we get better performance out of linked lists? Yes!
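For comparison after attempting the exercises in 10.14 and 10.15, here is one possible pair of solutions. It is a sketch that assumes the Node class from section 10.8 and is not necessarily the version worked out in lecture.

```cpp
#include <cstddef>

template <class T>
class Node {
public:
  T value;
  Node<T>* ptr;
};

// One possible push_front: O(1), the new node becomes the head.
template <class T>
void push_front(Node<T>*& head, T const& value) {
  Node<T>* n = new Node<T>;
  n->value = value;
  n->ptr = head;  // also correct for an empty list (head == NULL)
  head = n;
}

// One possible push_back: O(n), must walk to the last node first.
template <class T>
void push_back(Node<T>*& head, T const& value) {
  Node<T>* n = new Node<T>;
  n->value = value;
  n->ptr = NULL;  // the last node must always point to NULL
  if (head == NULL) {  // empty list: the head pointer itself changes
    head = n;
    return;
  }
  Node<T>* last = head;
  while (last->ptr != NULL) last = last->ptr;
  last->ptr = n;
}
```

Both functions take head by reference because either one may need to change which node is first.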
CSCI-1200 Data Structures — Spring 2017 Lectures 11 — Doubly Linked Lists Review from Lecture 10 •
Review of iterators, implementation of iterators in our homemade Vec class
•
const and reference on return values
•
Building our own basic linked lists: Stepping through a list & push_back
template <class T>
class Node {
public:
  T value;
  Node<T>* ptr;
};
[Figure: head points to a chain of four nodes, each with a value and a ptr member; the ptr of the last node is NULL.]
– Stepping through a list
template <class T>
bool is_there(Node<T>* head, const T& x) {
  for (Node<T>* p = head; p != NULL; p = p->ptr) {
    if (p->value == x) return true;
  }
  return false;
}
Today’s Lecture •
STL List w/ iterators vs. “homemade” linked list with Node objects & pointers
•
Basic linked list operations, continued: Insert & Remove
•
Common mistakes
•
Limitations of singly-linked lists
•
Doubly-linked lists: – Structure – Insert – Remove
11.1 Basic Mechanisms: Inserting a Node
•
There are two parts to this: finding the location where the insert must take place, and doing the insert operation.
•
We will ignore the find for now. We will also write only a code segment to understand the mechanism rather than writing a complete function.
•
The insert operation itself requires that we have a pointer to the location before the insert location.
•
If p is a pointer to this node, and x holds the value to be inserted, then the following code will do the insertion. Draw a picture to illustrate what is happening.
Node<T>* q = new Node<T>;  // create a new node
q->value = x;              // store x in this node
q->next = p->next;         // make its successor be the current successor of p
p->next = q;               // make p's successor be this new node
•
Note: This code will not work if you want to insert x in a new node at the front of the linked list. Why not?
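One way to see the answer: there is no node before the first node, so the head pointer itself has to change. The sketch below wraps the code segment in a hypothetical helper named insert_after (not from the lecture) that treats a NULL p as a request to insert at the front. It assumes a singly-linked Node whose link member is named next, matching the code segment above.

```cpp
#include <cstddef>

// Singly-linked node; the link member is named next to match the code
// segment above (earlier sections call it ptr).
template <class T>
class Node {
public:
  T value;
  Node<T>* next;
};

// Hypothetical helper: insert x after node p, or at the front when p is NULL.
template <class T>
void insert_after(Node<T>*& head, Node<T>* p, const T& x) {
  Node<T>* q = new Node<T>;
  q->value = x;
  if (p == NULL) {      // front insert: no predecessor exists,
    q->next = head;     // so the head pointer itself must change
    head = q;
  } else {
    q->next = p->next;  // set the new node's link first
    p->next = q;        // then redirect the predecessor
  }
}
```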
11.2 Basic Mechanisms: Removing a Node
•
There are two parts to this: finding the node to be removed and doing the remove operation.
•
The remove operation itself requires a pointer to the node before the node to be removed.
•
Removing the first node is an important special case.
11.3 Exercise: Remove a Node
Suppose p points to a node that should be removed from a linked list, q points to the node before p, and head points to the first node in the linked list. Write code to remove p, making sure that if p points to the first node, then head points to what was the second node and is now the first after p is removed. Draw a picture of each scenario.
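One possible solution is sketched below as a function, assuming the singly-linked Node class from Lecture 10. The name remove_node and its parameter order are illustrative choices, not part of the course code.

```cpp
#include <cstddef>

template <class T>
class Node {
public:
  T value;
  Node<T>* ptr;
};

// Hypothetical helper: unlink and delete node p. q must be the node
// before p; q is ignored when p is the first node.
template <class T>
void remove_node(Node<T>*& head, Node<T>* q, Node<T>* p) {
  if (p == head) {
    head = p->ptr;    // the second node (or NULL) becomes the first
  } else {
    q->ptr = p->ptr;  // bypass p
  }
  delete p;           // delete only after the pointers are fixed
}
```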
11.4 Exercise: List Copy
Write a recursive function to copy all nodes in a linked list to form a new linked list of nodes with identical structure and values. Here’s the function prototype:
template <class T> void CopyAll(Node<T>* old_head, Node<T>*& new_head) {
11.5 Exercise: Remove All
Write a recursive function to delete all nodes in a linked list. Here’s the function prototype:
template <class T> void RemoveAll(Node<T>*& head) {
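For comparison after attempting exercises 11.4 and 11.5, here is one possible recursive solution to each. It is a sketch that assumes the singly-linked Node class from Lecture 10; setting head to NULL at the end of RemoveAll is a design choice made here, not something the prototype requires.

```cpp
#include <cstddef>

template <class T>
class Node {
public:
  T value;
  Node<T>* ptr;
};

// One possible recursive copy: copy the first node, then the rest.
template <class T>
void CopyAll(Node<T>* old_head, Node<T>*& new_head) {
  if (old_head == NULL) {  // base case: an empty list copies to an empty list
    new_head = NULL;
    return;
  }
  new_head = new Node<T>;
  new_head->value = old_head->value;
  CopyAll(old_head->ptr, new_head->ptr);  // fills in new_head->ptr
}

// One possible recursive delete: free the rest, then the first node.
template <class T>
void RemoveAll(Node<T>*& head) {
  if (head == NULL) return;
  RemoveAll(head->ptr);
  delete head;
  head = NULL;  // leave the caller with an empty list
}
```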
11.6 Basic Linked Lists Mechanisms: Common Mistakes
Here is a summary of common mistakes. Read these carefully, and read them again when you have a problem that you need to solve.
•
Allocating a new node to step through the linked list; only a pointer variable is needed.
•
Confusing the . and the -> operators.
•
Not setting the pointer from the last node to NULL.
•
Not considering special cases of inserting / removing at the beginning or the end of the linked list.
•
Applying the delete operator to a node (calling the operator on a pointer to the node) before it is appropriately disconnected from the list. Delete should be done after all pointer manipulations are completed.
•
Pointer manipulations that are out of order. These can ruin the structure of the linked list.
•
Trying to use STL iterators to visit elements of a “home made” linked list chain of nodes. (And the reverse: trying to use ->next and ->prev with STL list iterators.)
11.7 Limitations of Singly-Linked Lists
•
We can only move through it in one direction
•
We need a pointer to the node before the node that needs to be deleted.
•
Appending a value at the end requires that we step through the entire list to reach the end.
11.8 Generalizations of Singly-Linked Lists
•
Three common generalizations: – Doubly-linked: allows forward and backward movement through the nodes – Circularly linked: simplifies access to the tail, when doubly-linked – Dummy header node: simplifies special-case checks
•
Today we will explore and implement a doubly-linked structure.
11.9 Transition to a doubly-linked list
•
The revised Node class has two pointers, one going “forward” to the successor in the linked list and one going “backward” to the predecessor in the linked list. We will have a head pointer to the beginning and a tail pointer to the end of the list.
template <class T>
class Node {
public:
  Node() : next_(NULL), prev_(NULL) {}
  Node(const T& v) : value_(v), next_(NULL), prev_(NULL) {}
  T value_;
  Node<T>* next_;
  Node<T>* prev_;
};
•
First we’ll reimplement some of the basic mechanisms we’ve already worked through for singly-linked lists. In the next lecture we’ll build the full dslist class and will define the list iterators as a class inside a class.
11.10 The Structure of Doubly-Linked Lists
•
Here is a picture of a doubly-linked list holding four integer values:
[Figure: head points to the first and tail to the last of four nodes holding 13, 1, 3, and 9. Each node has value, next, and prev members; the prev of the first node and the next of the last node are NULL.]
•
Note that we now assume that we have both a head pointer, as before, and a tail pointer variable, which stores the address of the last node in the linked list.
•
The tail pointer is not strictly necessary, but it allows immediate access to the end of the list for efficient push-back operations.
11.11 Inserting in the Middle of a Doubly-Linked List
•
Suppose we want to insert a new node containing the value 15 following the node containing the value 1. We have a temporary pointer variable, p, that stores the address of the node containing the value 1. Here’s a picture of the state of affairs:
[Figure: the same four-node list (13, 1, 3, 9) with its head and tail pointers, and p pointing to the node holding 1.]
•
What must happen? – The new node must be created, using another temporary pointer variable to hold its address. – Its two pointers must be assigned. – Two pointers in the current linked list must be adjusted. Which ones? Assigning the pointers for the new node MUST occur before changing the pointers for the current linked list nodes!
•
At this point, we are ignoring the possibility that the linked list is empty or that p points to the tail node (p pointing to the head node doesn’t cause any problems).
•
Exercise: write the code as just described.
11.12 Removing from the Middle of a Doubly-Linked List
•
Suppose now instead of inserting a value we want to remove the node pointed to by p (the node whose address is stored in the pointer variable p).
•
Two pointers need to change before the node is deleted! All of them can be accessed through the pointer variable p.
•
Exercise: write this code.
11.13 Special Cases of Remove
•
If p==head and p==tail, the single node in the list must be removed and both the head and tail pointer variables must be assigned the value NULL.
•
If p==head or p==tail, then the pointer adjustment code we just wrote needs to be specialized to removing the first or last node.
•
Next lecture we’ll write the erase function as part of our implementation mimicking the STL list class.
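The insert and remove mechanisms from 11.11 through 11.13 can be sketched as two functions, assuming the doubly-linked Node class from 11.9. The names insert_after and remove_node and their parameter lists are hypothetical, and this is one possible arrangement rather than the version developed in lecture.

```cpp
#include <cstddef>

template <class T>
class Node {
public:
  Node() : next_(NULL), prev_(NULL) {}
  Node(const T& v) : value_(v), next_(NULL), prev_(NULL) {}
  T value_;
  Node<T>* next_;
  Node<T>* prev_;
};

// Hypothetical helper: insert v after node p (p must not be NULL).
template <class T>
void insert_after(Node<T>*& tail, Node<T>* p, const T& v) {
  Node<T>* n = new Node<T>(v);
  n->prev_ = p;          // assign the new node's two pointers first
  n->next_ = p->next_;
  if (p->next_ != NULL) p->next_->prev_ = n;
  else tail = n;         // p was the tail, so the tail changes
  p->next_ = n;
}

// Hypothetical helper: unlink and delete node p, covering the special cases.
template <class T>
void remove_node(Node<T>*& head, Node<T>*& tail, Node<T>* p) {
  if (p->prev_ != NULL) p->prev_->next_ = p->next_;
  else head = p->next_;  // p was the head
  if (p->next_ != NULL) p->next_->prev_ = p->prev_;
  else tail = p->prev_;  // p was the tail
  delete p;              // delete only after the pointers are fixed
}
```

When p is the only node, both else branches run, so head and tail both end up NULL, which matches the first special case in 11.13.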
CSCI-1200 Data Structures — Spring 2017 Lecture 12 — List Implementation • •
Exam 2 will be Monday evening March 6th from 6-8pm. Practice problems are available on the calendar. Your exam room & zone assignment will be posted on the homework submission site by the end of the week. Note: We are re-shuffling the room & zone assignments from Exam 1.
Review from Lecture 11 •
Limitations of singly-linked lists
•
Doubly-linked lists: Structure, Insert, & Remove – Note: We didn’t finish all of the special/corner cases for remove from a doubly-linked list. Does it matter? Story time....
Today’s Lecture •
Our own version of the STL list class, named dslist
•
Implementing list iterators
12.1 The dslist Class: Overview
•
We will write a templated class called dslist that implements much of the functionality of the std::list container and uses a doubly-linked list as its internal, low-level data structure.
•
Three classes are involved: the node class, the iterator class, and the dslist class itself.
•
Below is a basic diagram showing how these three classes are related to each other:
[Figure: a dslist object (Node* head_, Node* tail_, int size_: 3) whose head_ and tail_ point to the first and last of three doubly-linked Node objects holding the float values 3.14, 6.02, and 1.61. The prev_ of the first node and the next_ of the last node are NULL. A list_iterator object holds a Node* ptr_ that points to one of the nodes.]
•
For each list object created by a program, we have one instance of the dslist class, and multiple instances of the Node. For each iterator variable (of type dslist<T>::iterator) that is used in the program, we create an instance of the list_iterator class.
12.2 The Node Class
•
It is ok to make all members public because individual nodes are never seen outside the list class. (Node objects are not accessible to a user through the public dslist interface.)
•
Another option to ensure the Node member variables stay private would be to nest the entire Node class inside of the private section of the dslist declaration. We’ll see an example of this later in the term.
•
Note that the constructors initialize the pointers to NULL.
12.3 The Iterator Class: Desired Functionality
•
Increment and decrement operators (operations that follow links through pointers).
•
Dereferencing to access contents of a node in a list.
•
Two comparison operations: operator== and operator!=.
12.4 The Iterator Class: Implementation
•
Separate class.
•
Stores a pointer to a node in a linked list.
•
Constructors initialize the pointer; they will be called from the dslist class member functions. – dslist is a friend class to allow access to the iterator's ptr_ pointer variable (needed by dslist member functions such as erase and insert).
•
operator* dereferences the pointer and gives access to the contents of a node. (The user of a dslist class is never given full access to a Node object!)
•
Stepping through the chain of the linked-list is implemented by the increment and decrement operators.
•
operator== and operator!= are defined, but no other comparison operators are allowed.
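The points above can be sketched as a small class. This is an illustrative outline only, assuming the doubly-linked Node class from Lecture 11 and the class name list_iterator from the diagram in 12.1; the version handed out in class may differ in its details.

```cpp
#include <cstddef>

template <class T>
class Node {
public:
  Node(const T& v) : value_(v), next_(NULL), prev_(NULL) {}
  T value_;
  Node<T>* next_;
  Node<T>* prev_;
};

template <class T> class dslist;  // forward declaration for the friend line

// A sketch of the iterator class, assuming the Node class above.
template <class T>
class list_iterator {
public:
  list_iterator(Node<T>* p = NULL) : ptr_(p) {}
  // Dereference: expose the stored value, never the Node itself.
  T& operator*() { return ptr_->value_; }
  // Pre-increment and pre-decrement follow the links.
  list_iterator<T>& operator++() { ptr_ = ptr_->next_; return *this; }
  list_iterator<T>& operator--() { ptr_ = ptr_->prev_; return *this; }
  // Post-increment returns the old position.
  list_iterator<T> operator++(int) {
    list_iterator<T> old(*this);
    ptr_ = ptr_->next_;
    return old;
  }
  bool operator==(const list_iterator<T>& r) const { return ptr_ == r.ptr_; }
  bool operator!=(const list_iterator<T>& r) const { return ptr_ != r.ptr_; }
private:
  friend class dslist<T>;  // lets dslist reach ptr_ in insert and erase
  Node<T>* ptr_;
};
```

Only equality and inequality are provided because nodes are scattered in memory, so comparing two positions with < would have no cheap, meaningful answer.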
12.5 The dslist Class: Overview
•
Manages the actions of the iterator and node classes.
•
Maintains the head and tail pointers and the size of the list. (member variables: head_, tail_, size_)
•
Manages the overall structure of the class through member functions.
•
Typedef for the iterator name.
•
Prototypes for member functions, which are equivalent to the std::list member functions.
•
Some things are missing, most notably const_iterator and reverse_iterator.
12.6 The dslist Class: Implementation Details
•
Many short functions are in-lined.
•
Clearly, it must contain the “big 3”: copy constructor, operator=, and destructor. The details of these are realized through the private copy_list and destroy_list member functions.
12.7 C++ Template Implementation Detail - Using typename
•
The use of typedefs within a templated class, for example the dslist<T>::iterator, can confuse the compiler because it is a template-parameter dependent name and is thus ambiguous in some contexts. (Is it a value or is it a type?)
•
If you get a strange error during compilation (where the compiler is clearly confused about seemingly clear and logical code), you will need to explicitly let the compiler know that it is a type by putting the typename keyword in front of the type. For example, inside of the operator== function:
typename dslist<T>::iterator left_itr = left.begin();
•
Don’t worry, we’ll never test you on where this keyword is needed. Just be prepared to use it when working on the homework.
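The same situation arises with the STL containers, so it can be tried without the full dslist class. The sketch below uses std::list in place of dslist; count_matches is a hypothetical helper written for this illustration.

```cpp
#include <list>

// Hypothetical helper: count how many times x occurs in a std::list<T>.
template <class T>
int count_matches(const std::list<T>& data, const T& x) {
  int count = 0;
  // std::list<T>::const_iterator depends on T, so the compiler needs
  // the typename keyword to know that it names a type.
  typename std::list<T>::const_iterator itr = data.begin();
  for (; itr != data.end(); ++itr) {
    if (*itr == x) ++count;
  }
  return count;
}
```

Removing the typename keyword from the declaration of itr should produce a compile error of the kind described above.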
12.8 Exercises
1. Write dslist::push_front
2. Write dslist::erase
[Code listing: the handout prints the dslist implementation code rotated on the page, and the extracted text cannot be reconstructed reliably. The legible fragments include function definitions for dslist<T>::insert, dslist<T>::erase, dslist<T>::copy_list, dslist<T>::destroy_list, dslist<T>::push_back, dslist<T>::pop_front, and dslist<T>::pop_back, along with an operator!= that returns !(left == right).]
CSCI-1200 Data Structures — Spring 2017 Lecture 13 — Advanced Recursion
Announcements: Test 2 Information
•
Test 2 will be held Monday, Mar. 6th from 6-8pm. Your test room & zone assignment is posted on the homework submission site. Note: We have re-shuffled the room & zone assignments from Test 1.
•
No make-ups will be given except for emergency situations, and even then a written excuse from the Dean of Students or the Office of Student Experience will be required.
•
Coverage: Lectures 1-13, Labs 1-7, HW 1-5.
•
Closed-book and closed-notes except for 1 sheet of notes on 8.5x11 inch paper (front & back) that may be handwritten or printed. Computers, cell-phones, palm pilots, calculators, PDAs, music players, etc. are not permitted and must be turned off.
•
All students must bring their Rensselaer photo ID card.
•
Practice problems from previous tests are available on the course website. Solutions to the problems will be posted on Friday afternoon.
Test Taking Skills
•
Look at the point values for each problem, allocate time proportional to the problem points. (Don’t spend all of your time on one problem and neglect other big point problems).
•
Look at the size of the answer box & the sample solution code line estimate for each problem. If your solution is going to take a lot more space than the box allows, we are probably looking for the solution to a simpler problem or a simpler solution to the problem.
•
Going in to the test, you should know what big topics will be covered on the test. As you skim through the problems, see if you can match up those big topics to each question. Even if you are stumped about how to solve the whole problem, or some of the details of the problem, make sure you demonstrate your understanding of the big topic that is covered in that question.
•
Re-read the problem statement carefully. Make sure you didn’t miss anything.
•
Ask questions during the test if something is unclear.
Review from Lecture 11 & Lab 7
•
Limitations of singly-linked lists
•
Doubly-linked lists: – Structure – Insert – Remove
•
Our own version of the STL list class, named dslist
•
Implementing list::iterator
•
Importance of destructors & using Dr. Memory / Valgrind to find memory errors
•
Decrementing the end() iterator
Today’s Lecture
•
Review Recursion vs. Iteration – Binary Search
•
“Rules” for writing recursive functions
•
Advanced Recursion: problems that cannot be easily solved using iteration (for or while loops): – Merge sort – Non-linear maze search
13.1 Review: Iteration vs. Recursion
•
Every* recursive function can also be written iteratively. Sometimes the rewrite is quite simple and straight-forward. Sometimes it’s more work.
•
Often writing recursive functions is more natural than writing iterative functions, especially for a first draft of a problem implementation.
•
You should learn how to recognize whether an implementation is recursive or iterative, and practice rewriting one version as the other.
•
Note: The order notation for the number of operations for the recursive and iterative versions of an algorithm is usually the same. However, in C, C++, Java, and some other languages, iterative functions are generally faster than their corresponding recursive functions. This is due to the overhead of the function call mechanism. Compiler optimizations will sometimes (but not always!) reduce the performance hit by automatically eliminating the recursive function calls. This is called tail call optimization.
13.2 Binary Search
•
Suppose you have a std::vector<T> v (for a placeholder type T), sorted so that: v[0] <= v[1] <= v[2] <= ...
•
Now suppose that you want to find if a particular value x is in the vector somewhere. How can you do this without looking at every value in the vector?
•
The solution is a recursive algorithm called binary search, based on the idea of checking the middle item of the search interval within the vector and then looking either in the lower half or the upper half of the vector, depending on the result of the comparison.
template <class T>
bool binsearch(const std::vector<T> &v, int low, int high, const T &x) {
  if (high == low) return x == v[low];
  int mid = (low+high) / 2;
  if (x <= v[mid])
    return binsearch(v, low, mid, x);
  else
    return binsearch(v, mid+1, high, x);
}
template <class T>
bool binsearch(const std::vector<T> &v, const T &x) {
  return binsearch(v, 0, v.size()-1, x);
}