B.E. Project
ONLINE JUDGE WITH SOFTWARE CODE QUALITY ANALYSIS Submitted by: SURAJ GUPTA (348/CO/11) TUSSHAR SINGH (354/CO/11) UJJWAL RELAN (356/CO/11) Under the guidance of DR. SHAMPA CHAKRAVERTY
DISSERTATION SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE DEGREE OF BACHELOR OF ENGINEERING (COMPUTER ENGINEERING)
DEPARTMENT OF COMPUTER ENGINEERING Netaji Subhas Institute of Technology, University of Delhi
2014 - 2015
SELF-DECLARATION The project entitled “Online Judge with Software Code Quality Analysis” is a bonafide work carried out by Suraj Gupta, Tusshar Singh and Ujjwal Relan in the Department of Computer Engineering, Netaji Subhas Institute of Technology, Delhi under the supervision and guidance of Dr. Shampa Chakraverty in partial fulfilment of the requirement for the Degree of Bachelor of Engineering in Computer Engineering, University of Delhi for the year 2014-15. The content of this report is, to the best of our knowledge, original and has not been used for any other academic activity. Date:
Suraj Gupta (348/CO/11)
Tusshar Singh (354/CO/11)
Ujjwal Relan (356/CO/11)
CERTIFICATE This is to certify that the project entitled “ONLINE JUDGE WITH SOFTWARE CODE QUALITY ANALYSIS” is a bonafide work done by Mr. Suraj Gupta, Mr. Tusshar Singh and Ms. Ujjwal Relan, students of eighth semester, B.E. Computer Engineering, Netaji Subhas Institute of Technology, Delhi. This project work has been prepared as a partial fulfilment of the requirements for the award of the degree of Bachelor of Engineering in Computer Engineering, University of Delhi, in the academic year 2014-2015. This work has not been presented for any other academic purpose before. I wish them luck in all their future endeavours.
Date: 1 June 2015
Dr. Shampa Chakraverty Professor & Head, Deptt. of Computer Engineering Netaji Subhas Institute of Technology
ACKNOWLEDGEMENT We would like to take this opportunity to acknowledge the support of all those without whom this project would not have been possible. We sincerely thank our mentor, Dr. Shampa Chakraverty, for her guidance, criticism and encouragement, which led to the completion of the project. The regular brainstorming sessions with her gave us a deep insight into the topic and evolved our ideas. We thank her for giving us this opportunity and supporting us throughout. We would also like to express our gratitude to the lab assistants in CADLAB for cooperating with us and providing us with the required resources. We wish to express heartfelt gratitude to our college, Netaji Subhas Institute of Technology, for giving us this opportunity for research and development. Lastly, we would also like to thank our parents, who supported us through thick and thin. Their trust in our capabilities helped us give our best.
INDEX
SELF DECLARATION .......... 1
CERTIFICATE .......... 2
ACKNOWLEDGEMENT .......... 3
LIST OF TABLES .......... 7
LIST OF FIGURES .......... 8
ABSTRACT .......... 10
CHAPTER 1: INTRODUCTION .......... 12
1.1 Objective .......... 12
1.2 Motivation .......... 13
1.2.1 Importance of an Online Judge .......... 13
1.2.2 Importance of Source Code Comments .......... 13
1.3 Organisation of Chapters .......... 15
CHAPTER 2: LITERATURE SURVEY .......... 16
2.1 Web Application .......... 16
2.2 Online Judge .......... 16
2.3 Ruby on Rails .......... 17
2.3.1 Why Ruby on Rails Over Other Languages .......... 18
2.3.2 Limitations of Ruby on Rails .......... 18
2.4 Advantages of C++ on the Backend .......... 20
2.5 Python as a Scripting Language .......... 20
2.6 Amazon Web Services .......... 21
2.6.1 Amazon Simple Storage Service .......... 21
2.6.2 Amazon Elastic Compute Cloud .......... 22
2.7 Plagiarism Detection .......... 22
2.7.1 Different Forms of Plagiarism .......... 22
2.7.1.1 Collusion .......... 22
2.7.1.2 Unacknowledged Reverse Engineering .......... 23
2.7.1.3 Unacknowledged Translation .......... 23
2.7.1.4 Unacknowledged Code Generation .......... 23
2.7.1.5 No Reuse without Test .......... 24
2.8 Static Code Quality Analysis and Its Tools .......... 24
2.8.1 Some Well Known Open Source Tools for Static Code Analysis .......... 25
2.8.1.1 GCC .......... 26
2.8.1.2 Cppcheck .......... 27
2.8.1.3 CPPLINT .......... 28
2.8.2 Commercial Tools .......... 29
2.9 Source Code Comments as a Metric of Software Quality .......... 30
2.9.1 Comment Analysis .......... 30
2.10 GIT - An Open Source Version Control System .......... 32
2.11 WEKA - An Open Source Data Mining Tool .......... 32
CHAPTER 3: PROPOSED DESIGN .......... 34
3.1 Software Requirement Specification for the Online Judge .......... 34
3.1.1 Introduction .......... 34
3.1.1.1 Purpose .......... 34
3.1.1.2 Scope .......... 34
3.1.1.3 Definitions, Acronyms, and Abbreviations .......... 34
3.1.1.4 Overview .......... 34
3.1.2 General Description .......... 35
3.1.2.1 Product Perspective .......... 35
3.1.2.2 Product Functions .......... 35
3.1.2.3 User Characteristics .......... 35
3.1.2.4 General Constraints and Features .......... 36
3.1.3 Specific Requirements .......... 36
3.1.3.1 External Interface Requirements .......... 36
3.1.3.1.1 User Interfaces .......... 36
3.1.3.1.2 Hardware Interfaces .......... 37
3.1.3.1.3 Software Interfaces .......... 37
3.1.3.2 Functional Requirements .......... 37
3.1.3.2.1 User Management .......... 37
3.1.3.2.2 Code Evaluation .......... 37
3.1.3.2.3 Contest Management .......... 38
3.1.3.2.4 Plagiarism Analysis .......... 38
3.1.3.2.5 Static Code Analysis .......... 38
3.1.3.3 Non-Functional Requirements .......... 39
3.1.4 Other Requirements .......... 39
3.1.4.1 Database .......... 39
3.2 Comment Classification .......... 39
3.2.1 Comment Categories .......... 39
3.2.2 Extracting Features for Classification .......... 40
CHAPTER 4: IMPLEMENTATION .......... 42
4.1 Programming Languages and Tools .......... 42
4.2 Components .......... 42
4.2.1 Implementation of Interface and Codeshell .......... 42
4.2.1.1 Interfaces for the “User” .......... 42
4.2.1.2 Interfaces for the “Admin” .......... 47
4.2.2 Integration of Client and Server .......... 49
4.2.2.1 Using AWS S3 Buckets .......... 49
4.2.2.2 Using Python Scripts at the Backend .......... 50
4.2.3 Implementation of Codechecker .......... 52
4.2.4 Implementation of Plagiarism Detection .......... 56
4.2.5 Integration with Cppcheck .......... 58
4.2.6 Comment Classifier .......... 59
4.2.6.1 Finding Dataset .......... 59
4.2.6.2 Creating .arff .......... 59
4.2.6.3 Good vs Bad Commenting .......... 59
4.2.7 Using GIT .......... 60
CHAPTER 5: OBSERVATIONS AND RESULTS .......... 61
5.1 Results of Codechecker .......... 61
5.2 Plagiarism Detection Results .......... 64
5.3 Cppcheck Results .......... 65
5.4 Comparison with Different Algorithms .......... 66
CHAPTER 6: CONCLUSION AND FUTURE WORK .......... 67
6.1 Conclusion .......... 67
6.2 Limitations .......... 67
6.3 Future Scope .......... 68
REFERENCES .......... 70
LIST OF TABLES
CHAPTER 3
Table 3.1: MACHINE LEARNING FEATURES FOR COMMENTS
CHAPTER 4
Table 4.1: COMMANDS USED IN CODECHECKER
Table 4.2: CODE SNIPPET USED FOR EXECUTING CODE
CHAPTER 5
Table 5.1: COMPARISON OF DIFFERENT MACHINE LEARNING ALGORITHMS
LIST OF FIGURES
CHAPTER 4
Figure 4.1: SCREENSHOT OF LOGIN PAGE
Figure 4.2: SCREENSHOT OF SIGNUP PAGE
Figure 4.3: SCREENSHOT OF HOME PAGE
Figure 4.4: SCREENSHOT OF CONTEST PAGE
Figure 4.5: SCREENSHOT OF SUBMIT PAGE
Figure 4.6: SCREENSHOT OF SUBMISSIONS PAGE
Figure 4.7: SCREENSHOT OF VIEW PROFILE
Figure 4.8: SCREENSHOT OF EDIT PROFILE
Figure 4.9: SCREENSHOT OF REQUEST ADMIN RIGHTS PAGE
Figure 4.10: SCREENSHOT OF STATIC CODE QUALITY CHECK OPTION
Figure 4.11: SCREENSHOT OF CREATE CONTEST PAGE
Figure 4.12: SCREENSHOT OF EDIT CONTEST PAGE
Figure 4.13: SCREENSHOT OF EDIT PROBLEM PAGE
Figure 4.14: SCREENSHOT OF ADDING ADMIN
Figure 4.15: SCREENSHOT OF LIST OF S3 BUCKETS USED
Figure 4.16: INTERACTION OF PYTHON SCRIPTS IN BACKEND
Figure 4.17: ORDER OF STEPS IN CODECHECKER
Figure 4.18: ORDER OF EXECUTION IN CODECHECKER
Figure 4.19: SOURCE CODE FILES USED FOR PLAGIARISM DETECTION
Figure 4.20: SOURCE CODE FILES USED FOR STATIC CODE ANALYSIS USING CPPCHECK
Figure 4.21: GITHUB REPOSITORY USED FOR THE PROJECT
CHAPTER 5
Figure 5.1: WHEN CODE GETS ACCEPTED
Figure 5.2: WHEN CODE GIVES WRONG ANSWER
Figure 5.3: WHEN CODE GIVES COMPILATION ERROR
Figure 5.4: WHEN CODE EXCEEDS TIME LIMIT
Figure 5.5: WHEN CODE GIVES RUNTIME ERROR
Figure 5.6: RESULTS AFTER MOSS BASED PLAGIARISM CHECK
Figure 5.7: RESULTS AFTER CPPCHECK ANALYSES THE CODE
ABSTRACT
The digital world revolves around programming; everything around us involves some kind of programming today. Writing software is about innovation, creativity and expression, but it also needs quality attributes such as understandability, reliability, modifiability, scalability and reusability, so that the software’s life can be extended and previous effort does not go to waste. We have therefore created an online judge which checks algorithmic correctness along with complexity and performs code quality analysis, thus helping to enhance one’s analytical and programming skills. We have checked code quality through Static Code Analysis and comment classification. Since the cost of correction increases as we go down the Software Development Lifecycle, Static Code Analysis helps by detecting errors early. It is very useful for maintaining customised company standards and for enhancing the quality of large codebases. Static Code Analysis done before deployment can prevent huge failures. Source code comments also play a very important role in enhancing the quality of code, as they increase its understandability, modifiability and reusability. They help team members work collaboratively, and if a developer leaves the organisation, the code can still be updated and managed by others. We have implemented the following main functionalities in our project: Contest Management, Code Evaluation, Plagiarism Detection, Comment Classification and Static Code Analysis.
Contest Management includes contest creation, problem addition and participation by users. We have integrated the front end and the back end using an AWS S3 bucket, which acts as an intermediate storage platform between the server and the client side.
We have checked the algorithmic correctness of the codes using test cases, and their algorithmic complexity by imposing time and memory limits. The user’s code is executed against predefined test cases; to be accepted, its output must match the expected output within the time and memory limits set. This is how we have done the Code Evaluation.
We have checked for plagiarism amongst the accepted codes using the concept of Stanford’s MOSS (Measure Of Software Similarity). It uses Winnowing, a local algorithm for document fingerprinting.
Static Code Analysis looks for errors which the compiler cannot catch, such as uses of uninitialized memory and null pointer dereferences. We have integrated the online judge with Cppcheck, a well known Static Code Analysis tool for C/C++.
We have classified the comments into six categories on the basis of their context: Header, Task, Code, Section, Interface and Inline. This classification helps in analysing the importance given to comments in the given code. We have performed the classification by using the Weka tool. We created our own dataset and then applied different Supervised Machine Learning algorithms in order to identify the best classifier.
1. Introduction
Computer programming is the composition of instructions, in a human-readable programming language, that solve a particular problem. Programming involves activities such as analysis, developing understanding, generating algorithms, verifying the requirements of algorithms including their correctness and resource consumption, and implementing (commonly referred to as coding) algorithms in a target programming language. Related tasks include testing, debugging, and maintaining the source code, implementing the build system, and managing derived artifacts such as the machine code of computer programs. These might be considered part of the programming process, but often the term "software development" is used for this larger process, with the terms "programming", "implementation", or "coding" reserved for the actual writing of source code. Everything around us involves some kind of programming today ([1] and [2]). Due to the increasing need for good programming, the need for good competitive programming platforms is also increasing. These platforms help in enhancing the analytical and problem solving skills of programmers.
1.1 OBJECTIVE Following are the objectives of this project: i. Create an online judge with the following functionalities: a. It allows the admin to manage contests and set problems. b. It allows users to submit solutions for the contest problems.
c. Every submitted code is checked against predefined test cases and is accepted only if the generated output matches the expected output within the predefined memory and time limits. d. All the accepted codes are checked for plagiarism. e. It provides an interface where the user can assess the static code quality of the code. ii. Create a classifier which categorises comments on the basis of their context.
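The evaluation loop described in objective (c) can be sketched as follows. This is an illustrative Python sketch of the verdict logic only, not the project’s actual codechecker (which is implemented separately, as described in Chapter 4); the function name `judge` and the verdict strings are our own, and enforcement of the memory limit is omitted for brevity.

```python
import subprocess

def judge(cmd, input_text, expected_output, time_limit=2.0):
    """Run one submission command against a single test case.

    Returns a verdict string. A real judge would also enforce a
    memory limit (e.g. via resource.setrlimit in a preexec_fn).
    """
    try:
        result = subprocess.run(
            cmd, input=input_text, capture_output=True,
            text=True, timeout=time_limit,
        )
    except subprocess.TimeoutExpired:
        return "Time Limit Exceeded"
    if result.returncode != 0:
        return "Runtime Error"
    # Compare actual and expected output, ignoring surrounding whitespace
    if result.stdout.strip() == expected_output.strip():
        return "Accepted"
    return "Wrong Answer"
```

For example, `judge(["./a.out"], "2 3\n", "5")` would return "Accepted" for a correctly compiled adder program.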
1.2 MOTIVATION 1.2.1 IMPORTANCE OF AN ONLINE JUDGE Programming skills are becoming ever more important, quickly turning into the core competency for all kinds of 21st Century workers. That inescapable fact is leading individuals to seek out new ways of learning to code, startups and nonprofits to find ways to help them, and businesses to search for innovative approaches to finding the coders they so desperately need [3]. Competitive programming plays a great role in spreading this programming culture amongst beginners. So, the motivation of this project was to create an online judge which can help a programmer enhance the following capabilities: 1. Problem Solving Skills 2. Optimised use of resources 3. Efficient and quick solving 4. Writing code of good quality 5. Debugging and Testing Skills
1.2.2 IMPORTANCE OF SOURCE CODE COMMENTS
Along with being algorithmically correct, code needs to be understandable, maintainable and reusable. Source code comments play an important role in making it so.
Program comments within and between modules and procedures usually convey information about the program, such as the functionality, design decisions, assumptions, declarations, algorithms, nature of input and output data, and reminder notes. Considering that the program source code may be the only way of obtaining information about a program, it is important that programmers accurately record useful information about these facets of the program and update it as the system changes. Common types of comments are prologue comments and inline comments. Prologue comments precede a program or module and describe goals; in-line comments, within the program code, describe how these goals are achieved.

Comments provide information that the reader can use to build a mental representation of the target program. For example, in Brooks' top-down model, comments that act as beacons help the programmer not only form hypotheses, but refine them into closer representations of the program. Thus, theoretically there is a strong case for commenting programs. The importance of comments is further strengthened by evidence that the lack of good comments in programs constitutes one of the main problems that programmers encounter when maintaining programs. It has to be pointed out that comments in programs can be useful only if they provide additional information. In other words, it is the quality of the comment that is important, not its presence or absence [4].
In our project, we have classified the comments into six categories: Header, Task, Section, Inline, Code and Interface. By determining the proportion of these comments in the code, the quality of the commenting can be assessed.
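Before comments can be classified, they first have to be extracted from the source. The sketch below illustrates this preprocessing step for C/C++ sources using a regular expression; the helper names are hypothetical, the regex is deliberately naive (it would also match comment-like text inside string literals), and the actual feature extraction used for the six-way classification is the one described in Chapter 3.

```python
import re

# Matches // line comments and /* ... */ block comments.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)

def extract_comments(source):
    """Return every comment found in a C/C++ source string."""
    return COMMENT_RE.findall(source)

def comment_density(source):
    """Fraction of characters that belong to comments."""
    commented = sum(len(c) for c in extract_comments(source))
    return commented / max(len(source), 1)
```

Each extracted comment would then be turned into a feature vector (position, length, keywords, and so on) for the classifier.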
1.3 ORGANISATION OF CHAPTERS The upcoming chapters provide an insight into the work done. Chapter 2 presents the literature survey conducted before implementing each functionality. In chapter 3, we explain the detailed design of the online judge by giving its software requirement specification, as well as the method used for comment categorisation. In chapter 4, we discuss the implementation details of the project. Chapter 5 covers the analysis, observations and results. Finally, in chapter 6, we present the limitations, future scope and conclusion of our work.
2. Literature Survey This chapter talks in detail about the theoretical and practical concepts explored before starting with the project.
2.1 WEB APPLICATION A web application is any application that uses a web browser as a client. The application can be as simple as a message board or a guest sign-in book on a website, or as complex as a word processor or a spreadsheet. A web application relieves the developer of the responsibility of building a client for a specific type of computer or a specific operating system. Since the client runs in a web browser, the user could be using an IBM-compatible or a Mac. They can be running Windows XP or Windows Vista. They can even be using Internet Explorer or Firefox, though some applications require a specific web browser.
Web applications commonly use a combination of server-side script (ASP, PHP, etc.) and client-side code (HTML, JavaScript, etc.) to develop the application. The client-side code deals with the presentation of the information, while the server-side script deals with all the hard stuff like storing and retrieving the information. [5]
2.2 ONLINE JUDGE
Online judges provide a platform for competitive programming. Codechef, Topcoder, and Codeforces are some of the well known online judges. These sites have high quality problems and also allow you to see others’ code after a contest ends. They also categorize problems by topic. [6] These online judges help one in acing a language (your primary tool for coding), design skills (patterns/OOP et al.), writing readable/maintainable code, debugging techniques, system knowledge, and domain-specific knowledge.
2.3 RUBY ON RAILS Ruby is a language of careful balance. Its creator, Yukihiro “Matz” Matsumoto, blended parts of his favorite languages (Perl, Smalltalk, Eiffel, Ada, and Lisp) to form a new language that balanced functional programming with imperative programming. Ruby is an object oriented language. [8] Rails is a web application development framework written in the Ruby language. It is designed to make programming web applications easier by making assumptions about what every developer needs to get started. It allows you to write less code while accomplishing more than many other languages and frameworks. Having a standard framework, it allows fast development.
Rails is opinionated software. Rails thinks that there is a “best” way to do things, and in order to get the most out of Rails, one should follow that method.
The Rails philosophy includes two major guiding software engineering principles:
● Don't Repeat Yourself: DRY is a principle of software development which states that "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." By not writing the same information over and over again, our code is more maintainable, more extensible, and less buggy.
● Convention Over Configuration: Rails has opinions about the best way to do many things in a web application, and defaults to this set of conventions, rather than requiring that you specify every minutia through endless configuration files. [9]
2.3.1 WHY RUBY ON RAILS OVER OTHER LANGUAGES?
RoR is preferred over other languages because:
● It provides a standard framework, so designing a web application using Ruby on Rails is easier.
● It has a clear code structure which allows faster development.
● It follows design patterns which encourage less code redundancy.
● It allows code reusability through the use of gems.
● RoR allows faster design and development.
● Ruby code is very readable and self-documenting.
● Rails and most of its libraries are open source, unlike other standard frameworks, thus saving on licence costs.
2.3.2 LIMITATIONS OF RUBY ON RAILS
● Not all website hosts can support Rails.
While it is true that not all web hosts support Rails, this is primarily because it can be more resource intensive than PHP, a fact which deters low-end shared-hosting providers. However, Rails-friendly hosts do exist, for example, Heroku and EngineYard. Alternatively, you can host your Rails application on a Virtual Private Server (VPS) with Amazon EC2, Rackspace, or Linode. You will then have full control over the server and can allocate sufficient resources for your application.
● Java and PHP are more widely used, and there are more developers in these languages.
The number of Ruby developers is growing year on year as more people switch to it from other programming languages. One of the main differences between the Ruby and other communities is the amount of open source code (gems) which is publicly available; as of writing there are 63,711 gems which you can use to enhance your application.
● Performance and Scalability
There have been concerns that Rails applications are not as fast as Java or C, which is true, but for the majority of applications it is fast enough. There are plenty of high profile organisations which rely on Rails to power their sites, including AirBnB, Yellow Pages, Groupon, Channel 5, and Gov.uk. There is also the option of running your application under JRuby, which gives you the same performance characteristics as Java. [10]
2.4 ADVANTAGES OF C++ ON THE BACKEND
C++ is a general purpose programming language. It supports the procedural as well as the object oriented paradigm. It is easy to learn as it is quite similar to C.
Using C++ on the backend can be really advantageous as it is faster than the scripting languages, it gives optimised performance in terms of memory and CPU, and it is integrable with other scripting languages. It requires relatively less memory space and is closer to low level languages which makes it extremely fast.
2.5 PYTHON AS A SCRIPTING LANGUAGE
The following points highlight why Python can be considered a beneficial replacement for bash scripting:
● Python is installed by default on all the major Linux distributions. Opening a command line and typing python will immediately drop you into a Python interpreter.
● Python has a very easy to read and understand syntax. Its style emphasizes minimalism and clean code, while allowing the developer to write in a bare-bones style that suits shell scripting.
● Python is an interpreted language, meaning there is no compile stage. This makes Python an ideal language for scripting. Python also comes with a Read Eval Print Loop (REPL), which allows you to try out new code quickly in an interpreted way. This lets the developer tinker with ideas without having to write the full program out into a file.
● Python is a fully featured programming language. Code reuse is simple, because Python modules easily can be imported and used in any Python script. Scripts easily can be extended or built upon. ● Python has access to an excellent standard library and thousands of third-party libraries for all sorts of advanced utilities, such as parsers and request libraries. For instance, Python's standard library includes datetime libraries that allow you to parse dates into any format that you specify and compare it to other dates easily. ● Python can be a simple link in the chain. Python should not replace all the bash commands. It is as powerful to write Python programs that behave in a UNIX fashion (that is, read in standard input and write to standard output) as it is to write Python replacements for existing shell commands, such as cat and sort. [11]
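As a small illustration of the “UNIX fashion” point above, the following sketch (our own example, not one of the project’s scripts) behaves like a minimal `sort -u`: it reads lines from standard input and writes the sorted, de-duplicated lines to standard output, so it can sit anywhere in a shell pipeline.

```python
import sys

def sort_unique(lines):
    """Sorted, de-duplicated lines, as `sort -u` would produce."""
    return sorted(set(line.rstrip("\n") for line in lines))

def main():
    # Filter stdin to stdout in the classic UNIX pipeline style:
    #   some_command | python sort_unique.py
    for line in sort_unique(sys.stdin):
        print(line)

if __name__ == "__main__":
    main()
```

Because the core logic lives in a plain function, the same code can be imported and reused from other Python scripts, which is exactly the code-reuse advantage described above.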
2.6 AMAZON WEB SERVICES
Amazon Web Services is a collection of services which constitute the cloud computing platform provided by amazon.com. The two most used web services provided by AWS are:
2.6.1 AMAZON SIMPLE STORAGE SERVICE
Amazon Simple Storage Service (Amazon S3) provides developers and IT teams with secure, durable, highly-scalable object storage. Amazon S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on the web. Amazon S3 provides cost-effective object storage for a wide variety of use cases including
cloud applications, content distribution, backup and archiving, disaster recovery, and big data analytics [13].
2.6.2 AMAZON ELASTIC COMPUTE CLOUD
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers. Amazon EC2’s simple web service interface allows us to obtain and configure capacity with minimal friction. It provides complete control of computing resources and runs on Amazon’s proven computing environment. Amazon EC2 provides developers the tools to build failure resilient applications and isolate themselves from common failure scenarios [14].
2.7 PLAGIARISM DETECTION
2.7.1 DIFFERENT FORMS OF PLAGIARISM
All forms of plagiarism involve claiming other people's work as your own (or assisting someone to make such false claims). In software engineering, work that is re-used without proper acknowledgement can be hard to identify. The following kinds of software plagiarism are commonly found:
2.7.1.1 Collusion
In all practical projects, it is considered normal practice to be given help. This help must be publicly acknowledged when the work is presented for evaluation or publication. When the help is significant, it is normal for the person who has given it to be credited in a more formal way. Where help has been given and there is collusion between the parties involved, it is a simple matter for no public acknowledgement to be made. In such a case, there is no direct reuse of software in the classical engineering sense. However, this collusion is plagiarism. Software — of any reasonable complexity — is
structured and has different components. Collusion in software development involves a third party writing the code for at least one of these components, and a student submitting this code as their own. 2.7.1.2 Unacknowledged Reverse Engineering
Often software engineers will look at some code and be able to reverse engineer some abstract property of that code in order to re-use that abstraction to help them write their own code, usually as a solution to a different, yet similar, problem. When the original piece of code is not acknowledged then this is also commonly known as “stealing someone else’s idea(s)”. In final year projects, this type of plagiarism often results when a student re-uses the design of a software system (or part of a software system) as a structure, template or pattern for their own code. Students should not be discouraged from engineering software in this way (it is a reasonably advanced technique) but they should be strongly encouraged to correctly acknowledge where the original design originated. 2.7.1.3 Unacknowledged Translation
Suppose a student “finds” Java code that solves a problem and re-uses it to produce C++ code. This can be thought of as a specific form of reuse through abstraction. Again, this may be considered a good approach in some circumstances, provided the original code is properly acknowledged. 2.7.1.4 Unacknowledged Code Generation
Software engineering tools, often found as part of complex development environments, can be used to automatically generate code. Any such generated code must be explicitly identified and correctly acknowledged. Note that these tools usually credit themselves, so removing these credits would be considered as deliberate deception on the part of the student, and disciplinary action would follow. For example, there are tools to generate C++ code from data flow diagrams. This type of automated software engineering is good, provided the role of the tool is properly acknowledged.
2.7.1.5 No Reuse Without Test
From the examples above, it seems that care needs to be taken about acknowledging any reuse of code. There is a simple guideline to ensure that a student never forgets the acknowledgement, avoiding the risk of being accused of deliberate deception when the plagiarism is a result of incompetence: explicitly acknowledge the use of someone else’s code — no matter how small — by testing it against your requirements. In the case that a student does not properly test the software that they are reusing, the student should be advised that the re-use is unacceptable. [15]
2.7.2 MEASURE OF SOFTWARE SIMILARITY [16]
Moss (Measure Of Software Similarity) is an automatic system for determining the similarity of programs. To date, the main application of Moss has been in detecting plagiarism in programming classes. Since its development in 1994, Moss has been very effective in this role. The algorithm behind Moss is a significant improvement over other cheating detection algorithms. It uses the local Winnowing algorithm on k-grams:
a. Divide a document into k-grams, where k is a parameter chosen by the user and a k-gram is a contiguous substring of length k.
b. Hash each k-gram and select some subset of these hashes to be the document’s fingerprints. In all practical approaches, the set of fingerprints is a small subset of the set of all k-gram hashes. A fingerprint also contains positional information.
c. Given a set of documents, find substring matches between them that satisfy two properties: 1. If there is a substring match at least as long as the guarantee threshold, t, then this match is detected, and 2. No matches shorter than the noise threshold, k, are detected. The constants t and k ≤ t are chosen by the user. k should be large enough to find significant matches, but not so large that relocation of strings would change the result.
d. Slide a window of w = t - k + 1 consecutive hashes over the sequence of k-gram hashes. In each window select the minimum hash value; if there is more than one hash with the minimum value, select the rightmost occurrence. Save all selected hashes as the fingerprints of the document.
e. Compare the fingerprints of the documents to determine the percentage of similarity between them.
MOSS is a preferable method for plagiarism detection because:
1. It ensures positional independence, noise suppression and whitespace insensitivity.
2. It has already been implemented for multiple languages like C, C++ and Java.
3. It is efficient.
4. It has been designed with a special focus on computer programming code rather than text. [17]
5. It excludes template code and keywords, thus reducing the number of false positives.
MOSS has the limitation that it can be broken by restructuring the code, since there are several ways of writing the same logic.
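Steps (a) through (e) above can be sketched in Python as follows. This is an illustrative reconstruction of winnowing, not MOSS’s actual implementation: the SHA-1-based hash, the default values of k and t, and the Jaccard-style similarity measure at the end are our own assumptions.

```python
import hashlib

def kgrams(text, k):
    """All contiguous substrings of length k (step a)."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]

def fingerprints(text, k=5, t=8):
    """Winnowing (steps b and d): a window of w = t - k + 1 hashes
    guarantees any match of length >= t shares a fingerprint."""
    w = t - k + 1
    hashes = [int(hashlib.sha1(g.encode()).hexdigest(), 16) % (1 << 32)
              for g in kgrams(text, k)]
    chosen = set()
    for i in range(len(hashes) - w + 1):
        window = hashes[i:i + w]
        m = min(window)
        # rightmost occurrence of the minimum hash in the window
        j = (len(window) - 1) - window[::-1].index(m)
        chosen.add((i + j, m))  # keep positional information (step b)
    return chosen

def similarity(a, b, k=5, t=8):
    """Step e: fraction of shared fingerprint hashes (Jaccard index)."""
    fa = {h for _, h in fingerprints(a, k, t)}
    fb = {h for _, h in fingerprints(b, k, t)}
    return len(fa & fb) / len(fa | fb) if fa | fb else 0.0
```

In practice the input would first be normalised (whitespace and identifier names stripped), which is what gives MOSS its whitespace insensitivity.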
2.8 STATIC CODE QUALITY ANALYSIS AND ITS TOOLS
Static code analysis techniques and tools [19] are widespread and intensively used in the development of software systems. Their main benefit lies in improving the quality of code in early development stages. Static code analysis techniques and tools are thus numerous for established programming languages like C/C++, Java, C# and many others [18].
Static code analysis works by analyzing the static structure and elements of a program without actually executing it. It is therefore usually based on the source code of a program or an intermediate representation thereof. Nowadays static code analysis has evolved into a self-contained program quality assessment and improvement technology, employed independently of compilers. It complements program inspection as well as software testing techniques by providing means to reveal bad code smells [21], violations of programming conventions and guidelines, and potential defects. Its main benefit, in contrast to inspection and testing techniques, is that static code analysis works fully automatically without user involvement and does not require elaborate test settings.
The techniques employed for static code analysis are manifold (see for example [19]). They include elementary rule-based approaches, which search for patterns in programs that represent known problems and possible defects; control flow analysis, which allows checking the possible branches of a program [19]; elaborate data flow analysis techniques to reveal the data dependencies in a program [20]; and abstract interpretation [22], where a program is executed with symbolic values to show certain run-time properties. Static code analysis provides an effective way of detecting critical defects early in the software lifecycle and contributes substantially to overall software quality. 2.8.1 SOME WELL KNOWN OPEN SOURCE TOOLS FOR STATIC CODE ANALYSIS [23] 2.8.1.1 GCC
The GNU Compiler Collection provides a suite of compilers for several programming languages. Its C and C++ compilers are widely used in the open-source community and come installed by default in practically every Linux distribution. This benefit does come with the downside of a lack of full POSIX compatibility. These compilers offer several documented flags for reporting potential issues in source code. For example, the -Wformat flag enables format-string verification for printf, scanf, etc. Another check, which is enabled by default, is the compile-time division-by-zero warning. The compiler also supports source code annotations in the form of custom attributes. One very interesting attribute is nonnull, which can be applied to function parameters; calls to functions with such parameters are checked by the compiler for null arguments. One of the more interesting compiler flags for analysis purposes is -fsyntax-only. With this flag the compiler is instructed to check only for syntax errors, while still reporting warnings. This is very useful for analysis, as the compiler can then be used as a pure static analyzer without code generation. Unfortunately, GCC does not instantiate C++ templates when this flag is enabled, making it less than ideal as a static analyzer. 2.8.1.2 CPPCHECK
Originally created by Daniel Marjamäki, Cppcheck is an open-source static analysis tool that has featured in several benchmarks [14, 24, 40, 59]. The software comes in both a command-line version and a graphical version. Both versions use the same analysis engine and thus perform identical checks, although the command-line version is more suitable for automated analysis because it is easier to run in non-interactive environments. The analyzer builds a simplified abstract syntax tree (AST) using its own custom parser and lexical analysis, and may therefore not always conform to the latest C++ standard due to bugs or missing features. To support more elaborate checks, the latest version implements a generic data-flow analysis framework. This new framework supports general-purpose, context-sensitive, interprocedural data-flow analysis and can be used by individual checkers. Many checkers have been modified to use this new system, although some old checkers still use their own specific data-flow tracking. The framework also performs abstract interpretation when tracking values in loops. There are two very different extension mechanisms in Cppcheck for creating custom checks. One way is to modify the source code, creating new C++ classes that contain the desired checks; these classes work as visitors on the tokenized AST stream. Alternatively, custom checks can be created by specifying regular expressions in an XML-formatted configuration file. The regular expressions are used to find defects by matching them against the token stream. Although convenient, these regular expressions do not provide the same capabilities as the source-modifying method. The analyzer comes with many different checkers, categorized by severity. Program options are available for enabling or disabling checks by severity. The severities are:
• error for more severe issues like syntax errors,
• warning for suggestions about possible problems,
• style for dead code and other stylistic issues,
• performance for some common performance-related suggestions,
• portability for 64-bit and general platform-portability issues, and
• information for informational messages about problems with the analysis.
Cppcheck can report even more warnings when run with the --inconclusive flag. As the name suggests, this flag enables inconclusive checks, where the analyzer might not be completely sure about the existence of a problem. Enabling it may reveal previously hidden problems while significantly increasing the number of false positives. 2.8.1.3 CPPLINT
Originating from Google, Cpplint is a code analysis and style-checking program that enforces the Google C++ Style Guide. The program, written in Python, uses a mixture of regular-expression matching and other line-based heuristics to detect various problems in the analyzed code. It mostly detects style issues, whitespace irregularities, and various other potential problems. Due to its simplicity, it analyzes even large projects relatively quickly. As it is a style-convention checking tool created for Google projects, such as Chromium, its use is somewhat limited for general-purpose analysis. Fortunately, irrelevant checks can be disabled through command-line flags. In addition to individual checks, whole categories of checks can be enabled or disabled. The currently existing categories are:
• build for build and preprocessing issues,
• legal for missing copyright messages,
• readability for correct but unreadable code,
• runtime for runtime-related issues, and
• whitespace for whitespace usage conventions.
2.8.2 COMMERCIAL TOOLS Several commercial static analysis tools have been created since the first lint-type tools. These commercial tools are typically either smaller standalone tools or parts of a larger suite of tools. With the significant cost of some of these tools come useful benefits such as extensive on-site support, tool customization, and project-specific tuning for the customer. Some well-known commercial tools are:
• Klocwork Insight [23]
• Coverity SAVE [23]
• LLBMC [23]
2.9 SOURCE CODE COMMENTS AS A METRIC OF SOFTWARE QUALITY
Here, we emphasize the quality of source code comments as a metric of software quality. 2.9.1 COMMENT ANALYSIS Jiang and Hassan [26] study the evolution of comments over time in the PostgreSQL project. They claim that developers commonly change code without updating its associated comments, and that uncommented interfaces, or interfaces with outdated comments, are likely to cause bugs. For their study, the authors provide a coarse categorization of comments into header and non-header comments. Header comments are written prior to a function definition, whereas all other comments inside a function body or trailing the function are non-header comments. The authors monitor the percentage of functions with header comments, assuming that a drop over time indicates that developers are not updating the interface documentation. However, the study reveals a constant percentage of commented functions, except for early fluctuation due to the commenting style of one particularly active developer.
Some authors also evaluate the ratio between comments and source code over time to analyse whether developers increase or decrease their effort on code commenting. However, the results differed greatly among the different test cases, so no unique answer can be given. In particular, the authors did not differentiate between different types of comments; for example, lines of commented-out code were also counted in the ratio between comments and source code. We highly doubt that this yields a useful metric for assessing the effort of code commenting: commented-out code should be excluded from this metric because it does not provide any information gain for system understanding. In previous work, maintenance productivity has been studied extensively, yet the quality of code comments plays only a minor role in assessing a software system's maintainability.
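To illustrate why counting commented-out code distorts the comment-to-code ratio, here is a minimal Python sketch. The rule for spotting commented-out code (a comment line ending in ';', '{' or '}') is our own heuristic, not taken from the cited studies:

```python
import re

def comment_ratio(lines, exclude_code=True):
    """Ratio of comment lines to non-blank lines; optionally skip comment
    lines that look like commented-out code (heuristic: trailing ; { })."""
    comment, total = 0, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        total += 1
        if line.startswith("//"):
            body = line[2:].strip()
            if exclude_code and re.search(r"[;{}]\s*$", body):
                continue  # likely commented-out code, not documentation
            comment += 1
    return comment / total if total else 0.0

src = [
    "// Computes the checksum of a buffer.",  # documentation comment
    "// int old = checksum(buf);",            # commented-out code
    "int checksum(char *buf) {",
    "return 0;",
    "}",
]
```

On this example, including the commented-out line inflates the ratio from 0.2 to 0.4, which is exactly the distortion discussed above.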
It is a commonly accepted fact that poor documentation constitutes a major problem affecting software maintainability [4]. As software is often maintained by people who did not develop it, poor documentation can cause a variety of effects, ranging from an increase in the time needed to understand and maintain the software to a complete redesign and rebuild of the system. In the worst case, it is easier to rebuild a system completely than to understand and modify the existing one.
In [28], Garcia et al. analyze the costs and benefits of maintainability, relating it to understandability, modifiability, and testability. In order to measure these concepts, the authors apply several metrics, among them lines of source code including comments, lines of comments, lines of easy modification, and lines of error detection. The number of lines of comments is thereby considered an understandability measure. On the one hand, this indicates the importance of code comments for software maintenance. On the other hand, the number of lines of comments (LC) cannot be a sufficient metric to assess understandability: commented-out code or copyright notices, for example, increase the number of comment lines without contributing to understandability. Hence, analyzing maintainability with metrics requires a more detailed analysis of code comment quality.
In a similar way, the authors of [29] use the number of comment lines to calculate metrics assessing a software system's maintainability. For system commenting characteristics, they define the overall program commenting ratio as a 2-tuple containing the percentage of comment lines in the whole program and the percentage of modules with header comments. For component commenting, they measure intra-module commenting as the number of comment lines divided by the total number of lines in the module, averaged over all modules. Again, however, merely measuring the number of comment lines cannot be sufficient, as those lines can contain arbitrary content that does not contribute to understandability.
Both the work of [28] and [29] lack a differentiation between comments with different purposes. To overcome this problem, detailed comment categorisation was proposed in order to analyse how comments affect understandability of the code [25].
2.10 GIT - AN OPEN SOURCE VERSION CONTROL SYSTEM Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. Git is easy to learn and has a tiny footprint with lightning fast performance. It outclasses SCM tools like Subversion, CVS, Perforce, and ClearCase with features like cheap local branching, convenient staging areas, and multiple workflows. Thus, Git makes collaborative work very convenient and simplified.
2.11 WEKA - AN OPEN SOURCE DATA MINING TOOL [24] Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine learning schemes. Weka is open-source software issued under the GNU General Public License. It has the following advantages:
1. As Weka is fully implemented in the Java programming language, it is platform independent and portable.
2. It is freely available under the GNU General Public License.
3. Weka has a user-friendly graphical interface, so the system is very easy to use.
4. It has a very large collection of different data mining algorithms.
3. Proposed Design
The code evaluation, plagiarism detection and static code analysis have been implemented for C++ only.
3.1 SOFTWARE REQUIREMENT SPECIFICATION FOR THE ONLINE JUDGE 3.1.1 Introduction 3.1.1.1 Purpose
The purpose of this document is to present a detailed description of the online judge. It explains the purpose and features of the system, the interfaces of the system, the algorithms used in the implementation, the functionality of the system, and the constraints under which it must operate. The application serves as a platform to manage contests. 3.1.1.2 Scope
This specification gives the details of the online judge. It explains how users can submit their codes and get them evaluated. It also explains how the problem setters can add problems and manage contests. It highlights the plagiarism detection and static code analysis done by the judge. Online judge is a part of the project which also contains the comment classifier. 3.1.1.3 Definitions, Acronyms, and Abbreviations
1. Plagiarism: Act of copying someone else's work illegitimately 2. Program: A set of computer instructions capable of performing some function 3. SCA: Static Code Analysis 4. MOSS: Measure of Software Similarity 3.1.1.4 Overview
The next section, General Description, of this document gives an overview of the functionality of the application. It describes the informal requirements and is used to establish a context for the technical requirements specification in the next section. The third section, Specific Requirements, of this document is written primarily for the developers and describes in technical terms the details of the functionality of the application. Both sections of the document describe the same software product in its entirety, but are intended for different audiences and thus use different language.
The SRS is in IEEE 830 format.
3.1.2 General Description 3.1.2.1 Product Perspective
The main purpose of our project is to create a user-friendly web application that can enhance users' problem-solving and programming skills. The application was developed keeping in mind the increasing importance of programming all around. 3.1.2.2 Product Functions
1. User Management: The platform supports two user roles: User, who can participate in contests, and Admin, who can manage contests.
2. Code Evaluation: A platform where the user can test code for correctness and adequate algorithmic complexity.
3. Contest Management: Admins can create, edit and delete contests and problems as and when required.
4. Plagiarism Detection: All accepted codes are checked for plagiarism.
5. Static Code Analysis: Users can perform a static code quality check on their code. 3.1.2.3 User Characteristics
1. The user should be familiar with web interfaces.
2. The user must have a valid email-id and password in order to log into the system.
3. The user must have an internet connection in order to use the web application.
4. Only authorized users should be given admin rights. 3.1.2.4 General Constraints and Features
1. Any operating system.
2. The email-id used for login should be valid.
3. Evaluation, SCA and plagiarism testing are supported.
3.1.2.5 Assumptions and Dependencies
1. The user has a web browser on the system and an internet connection. 2. The user is aware about competitive programming. 3. The judge is only available for C++.
3.1.3 Specific Requirements 3.1.3.1 External Interface Requirements 3.1.3.1.1 User Interfaces
Interfaces should allow a “User” to:
a. Login or signup into the system
b. View upcoming and ongoing contests
c. View problems of an ongoing contest
d. Submit solution
e. View submissions
f. View profile
g. Edit profile
h. Request admin rights
i. Check static code quality of a code
Interfaces should allow an “Admin” to:
a. Login or signup into the system
b. View upcoming and ongoing contests
c. Add, edit and delete contests
d. Add, edit and delete problems and test data
e. View submissions
f. View profile
g. Edit profile
h. Grant admin rights
3.1.3.1.2 Hardware Interfaces
Apart from the recommended configuration no other specific hardware is required to run the software. 3.1.3.1.3 Software Interfaces
Operating system: Any operating system Tools: Web browser 3.1.3.2 Functional Requirements 3.1.3.2.1 User management
1. Introduction: This explains the login/signup functionality. 2. Inputs: For signup, user gives his personal details and for login, user just gives the email-id and password. 3. Processing: The system adds the user in case of signup and in case of login it validates the user. 4. Outputs: The user can view the site. 5. Error Handling: In case of invalid details, appropriate message is displayed. 3.1.3.2.2 Code Evaluation
1. Introduction: Here, the submitted code is checked for algorithmic correctness and the required algorithmic complexity.
2. Inputs: The user submits the code in C++.
3. Processing: The system checks the generated output against the expected output for predefined test cases and checks whether the time and memory used are within predefined limits.
4. Outputs: The system displays the status: Compilation error, Wrong, Accepted, Time limit exceeded or Runtime error.
5. Error Handling: In case the user doesn't select the problem properly, or some other error occurs, a proper message is displayed.
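The verdict logic described in the processing and output steps above can be sketched as a small function. The precedence order among the checks is our assumption about a typical judge, not a specification of this system:

```python
def verdict(compiled, within_limits, runtime_error, output_matches):
    """Map evaluation outcomes to the judge statuses listed above."""
    if not compiled:
        return "Compilation error"
    if not within_limits:
        return "Time limit exceeded"
    if runtime_error:
        return "Runtime error"
    # Output comparison decides between the two terminal states.
    return "Accepted" if output_matches else "Wrong"
```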
3.1.3.2.3 Contest Management
1. Introduction: Here, the admin can view, add, edit or delete contests and problems.
2. Inputs: In case of add or edit, the admin enters the details of the contest. In case of delete, the admin selects the delete option.
3. Processing: Action is taken as per the admin's input.
4. Outputs: The database is updated and the corresponding updates become visible.
5. Error Handling: An appropriate message is shown in case of invalid inputs. 3.1.3.2.4 Plagiarism Analysis
1. Introduction: All the accepted codes are checked for plagiarism. 2. Inputs: Accepted codes are the input. 3. Processing: The codes are analysed using Measure of Software Similarity. 4. Outputs: Plagiarised files are shown. 5. Error Handling: Error handling is done in case of any fault. 3.1.3.2.5 Static Code Analysis
1. Introduction: User can check the static code quality of the code. 2. Inputs: User submits the code in C++. 3. Processing: cppcheck checks the code. 4. Outputs: Static code errors are displayed. 5. Error Handling: Error handling is done in case of any fault. 3.1.3.3 Non-Functional Requirements
1) Performance
Correct results should be displayed to the user on the terminal as fast as possible.
2) Reliability
The application should not crash under any circumstance, such as a user entering invalid details during signup.
3) Portability
The application is portable and runs on any machine with a good internet connection.
4) Interoperability
It can work on any web browser, on any operating system.
3.1.4 Other Requirements 3.1.4.1 Database
We will use sqlite3 as the database in the development phase.
3.2 COMMENT CLASSIFICATION 3.2.1 COMMENT CATEGORIES After analysing and referring to [25], we have classified comments into the following categories:
• Header: It can include information about the copyright or license of the source code file, or give an overview of the functionality of the class. In addition, it can provide information about the author of the class, the revision number, the peer-review status, etc. Header comments are mostly found at the beginning of the code.
• Interface: An interface comment describes the functionality of a method or a field. Interface comments are therefore located either before a method/field definition or on the same line as a field definition. They can provide information for the developer and be used for an API of the project. They describe the parameters required for interfacing.
• Inline: Developers use inline comments to comment on code within a method/structure/class definition. Inline comments describe implementation decisions or other details.
• Section: A section comment summarizes a larger part of a class that covers one functional aspect. It usually addresses several methods (or fields) together which belong to the same functional aspect.
• Code: Commented-out code is source code that developers want the compiler to ignore. Code is often temporarily commented out for debugging purposes or for potential later reuse.
• Task: A task comment is a note for the developer about an unfulfilled task. It contains either a remaining todo, a note about a bug that needs to be fixed, or a remark about an implementation hack.
3.2.2 EXTRACTING FEATURES FOR CLASSIFICATION We are using the Weka tool for classification. We use 10 features for machine learning as indicated in Table 3.1.
Table 3.1: MACHINE LEARNING FEATURES FOR COMMENTS

FEATURE        TYPE      CRITERIA
copyright      boolean   true if it contains the word “license” or “copyright”
header         boolean   true if it contains the word “author”
section        boolean   true if it contains separator strings (----, /////, *****) multiple times
length         real      length of the comment (in words)
task           boolean   true if it contains “todo”, “hack” or “fixme”
specialchars   real      percentage of special characters in the comment
code           boolean   true if it contains code
insidemethod   boolean   true if it is inside a method/class/struct
parenthesis    real      number of unmatched open parentheses
first          boolean   true if it is the first comment of the file
We classify comments in Weka using multiple machine learning algorithms and analyse the results.
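As a hedged sketch of the feature extraction, the features of Table 3.1 can be computed for a single comment string as follows. The regular expressions and heuristics below are our own simplifications of the table's criteria, not the exact rules used in the project:

```python
import re

def comment_features(text, inside_method=False, is_first=False):
    """Compute a feature vector for one comment, mirroring Table 3.1."""
    words = text.split()
    lower = text.lower()
    special = sum(1 for c in text if not c.isalnum() and not c.isspace())
    return {
        "copyright": "license" in lower or "copyright" in lower,
        "header": "author" in lower,
        # "multiple times" interpreted as more than one separator run
        "section": len(re.findall(r"-{4,}|/{5,}|\*{5,}", text)) > 1,
        "length": len(words),
        "task": any(w in lower for w in ("todo", "hack", "fixme")),
        "specialchars": special / len(text) if text else 0.0,
        "code": bool(re.search(r";\s*$|[{}]", text)),  # crude code heuristic
        "insidemethod": inside_method,
        "parenthesis": text.count("(") - text.count(")"),
        "first": is_first,
    }
```

A vector like this can be written out as one row of a Weka ARFF file per comment, with the manually assigned category as the class attribute.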
4. Implementation This chapter deals with the implementation details of the project.
4.1 PROGRAMMING LANGUAGES AND TOOLS Programming languages used:
a. Frontend: Ruby, HTML, CSS, Javascript
b. Backend: C++, Python, Ruby
Tools used:
a. cppcheck: For static code analysis
b. Weka: For machine learning
c. Codemirror: An open-source editor
Framework used: The frontend uses the Ruby on Rails web application framework.
4.2 COMPONENTS 4.2.1 IMPLEMENTATION OF INTERFACE AND CODESHELL The following interfaces were implemented using Ruby on Rails :4.2.1.1 INTERFACES FOR THE “USER”
a. The “Devise” ruby gem was used for implementing login/signup. Interface of the login page is as shown in Figure 4.1.
Figure 4.1: SCREENSHOT OF LOGIN PAGE b. Interface of the signup page is as shown in Figure 4.2.
Figure 4.2: SCREENSHOT OF SIGNUP PAGE c. View upcoming and ongoing contests Interface of the home page is as shown in Figure 4.3.
Figure 4.3: SCREENSHOT OF HOME PAGE d. View problems of an ongoing contest Interface of the contest page is as shown in Figure 4.4.
Figure 4.4: SCREENSHOT OF CONTEST PAGE e. Submit solution The editor portion of the interface uses Codemirror [32], a versatile text editor implemented in Javascript for the browser. Interface of the submit page is as shown in Figure 4.5.
Figure 4.5: SCREENSHOT OF SUBMIT PAGE
f. View Submissions
Interface of the submissions page is as shown in Figure 4.6.
Figure 4.6: SCREENSHOT OF SUBMISSIONS PAGE g. View profile Interface of the profile page is as shown in Figure 4.7.
Figure 4.7: SCREENSHOT OF VIEW PROFILE h. Edit profile Interface of the edit profile page is as shown in Figure 4.8.
Figure 4.8: SCREENSHOT OF EDIT PROFILE
i. Request admin rights
Interface of the admin request page is as shown in Figure 4.9.
Figure 4.9: SCREENSHOT OF REQUEST ADMIN RIGHTS PAGE
j. Check static code quality of a code
Interface for checking code quality is as shown in Figure 4.10.
Figure 4.10: SCREENSHOT OF STATIC CODE QUALITY CHECK OPTION 4.2.1.2 INTERFACES FOR THE “ADMIN”
a. Login page Same as for “user”. b. Signup page Same as for “user”. c. Add new contest Interface of the new contest page is as shown in Figure 4.11.
Figure 4.11: SCREENSHOT OF CREATE CONTEST PAGE d. Edit contest details Interface of the edit contest page is as shown in Figure 4.12.
Figure 4.12: SCREENSHOT OF EDIT CONTEST PAGE
e. View upcoming and ongoing contests
Same as for “user” but also has a delete option.
f. View problems of an ongoing contest
Same as for “user” but also has a delete option.
g. Add/Edit problem
The interface for adding a problem is the same as for editing, as shown in Figure 4.13.
Figure 4.13: SCREENSHOT OF EDIT PROBLEM PAGE
h. View profile
Same as for “user”.
i. Edit profile
Same as for “user”.
j. Grant admin rights
Interface of the create admin page is as shown in Figure 4.14.
4.2.2 INTEGRATION OF CLIENT AND SERVER 4.2.2.1 USING AWS S3 BUCKETS
Amazon Simple Storage Service (S3) helps in integration between the client and the server.
Figure 4.15: SCREENSHOT OF LIST OF S3 BUCKETS USED
The S3 buckets created (as shown in Figure 4.15) and their purposes are as follows:
a. coderunneraccepted: The codes which get accepted after evaluation are uploaded to this bucket for plagiarism detection.
b. coderunnersubmissions: As soon as a user submits a solution, it is uploaded to this bucket with the following filename format: username#contestcode#problemcode#timestamp#time_limit#memory_limit#.cpp
c. coderunnerusers: This bucket contains the response file for each user with the name in the following format: username_response.html
d. coderunnerdetails: It contains the contest and problem data files in the following formats:
TEST DATA: problemcode#contestcode#data#.zip, which contains an Input and an Output folder.
CONTEST DETAILS: contestcode#metadata#.txt
e. cppchecksubmissions: This bucket contains the codes submitted for static code analysis in the following format: username#timestamp#.cpp
f. cppcheckresponses: This bucket contains the responses generated after static code analysis in the following format: username_response.html 4.2.2.2 USING PYTHON SCRIPTS AT THE BACKEND
Figure 4.16: INTERACTION OF PYTHON SCRIPTS IN BACKEND
The Python scripts used in the backend are described below. The interactions between the scripts are shown in Figure 4.16.
InitiateJudge.py
Starts the other auxiliary scripts, which include extractor.py, download.py and monitor.py.
extractor.py
Initialises the contest. Downloads the test data of problems from the coderunnerdetails bucket and creates the folders needed for the codechecker to function properly. It also resets the response pages of users in the coderunnerusers bucket.
download.py
Downloads the user submissions from the coderunnersubmissions bucket and saves them to a local submissions folder, from where monitor.py can access them.
monitor.py
This script is the heart of the codechecker. It fetches submissions from the local submissions folder and sends them to the codechecker, where they get evaluated. It receives a verdict from the codechecker, converts it to HTML format, appends it to the user's response file, and uploads it to coderunnerresponses. It processes submissions in parallel and uses a mutex to lock the codechecker, which is the common resource for all the threads.
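The submission filename format described in section 4.2.2.1 can be unpacked with a small helper before evaluation; parse_submission_name is a hypothetical function for illustration, not one of the project's actual scripts:

```python
# Hypothetical helper (not from the project source) that unpacks
# username#contestcode#problemcode#timestamp#time_limit#memory_limit#.cpp
def parse_submission_name(filename):
    stem = filename[:-len(".cpp")]
    # trailing '#' before .cpp yields an empty final field
    user, contest, problem, ts, tlimit, mlimit, _ = stem.split("#")
    return {
        "username": user,
        "contest": contest,
        "problem": problem,
        "timestamp": ts,
        "time_limit": int(tlimit),
        "memory_limit": int(mlimit),
    }
```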
4.2.3 IMPLEMENTATION OF CODECHECKER The phases involved in the functioning of the codechecker (as shown in Figure 4.17) are:
Phase 1: Checking for compilation errors during compilation
Phase 2: Running the source code with hidden test cases
Phase 3: Ensuring the execution completes within time and memory limits
Phase 4: Checking for runtime errors
Phase 5: Comparing generated output with expected output
Phase 6: Deleting user-generated files
Figure 4.17: ORDER OF STEPS IN CODECHECKER

Table 4.1: COMMANDS USED IN CODECHECKER
PURPOSE           COMMAND
To compile        system("g++ filename.cpp -o filename")
To ensure limits  system("ulimit -t time_limit -m memory_limit; file_to_be_executed")
To compare files  system("diff generated_output expected_output")

PHASE 1: Compilation of user code and checking for compilation errors
The program uploaded by the user is first checked for compiler errors, then for correctness with the help of predefined test cases. Standard GCC compiler is used in our application. If the source code is compiled successfully, then it is executed. Otherwise the user is provided information about the compilation error and the execution is aborted. A separate compilation error file is generated for every source code. PHASE 2: Running source code with hidden test cases
After successful compilation, the source code is executed using predefined test cases. These test cases are hidden from the user. They are used to test the correctness of the code. Designing test cases is a tedious task. It is impractical to manually design them. As a consequence, test scripts are written to automate this process. PHASE 3: Ensuring Time and Memory limits
Time and Memory limits are specific to every problem and the user-code must finish execution within those limits for successful termination. These limits are ensured by using ulimit command with '-t' and '-m' flags for time and memory limits respectively. Phase 4: Checking for Runtime Errors
Runtime errors arise during the execution of the program. Different conditions can cause them; the most common causes are accessing illegal memory, illegal instructions, exceeding the file size limit, etc. When a runtime error occurs, the program raises a signal. This signal is handled by a signal handler, which terminates execution and displays the runtime error to the user. PHASE 5: Comparison of output
After successful execution of the user code, the generated outputs for the different test cases are compared with the expected outputs for the respective test cases. If the generated outputs exactly match the expected outputs, the user code is considered correct; otherwise it is considered wrong. PHASE 6: Deleting user-generated files
During compilation and execution, multiple user-specific files are generated. These include compilation error files, generated outputs, log files, etc. These files need to be deleted after execution completes. The system calls invoke the commands corresponding to each phase as listed in Table 4.1. Working of the codechecker:
The codechecker has been divided into multiple files in order to make the application scalable. Currently there is support for C++, but the codechecker can easily be extended to support other languages such as Python and Java. Invoker.cpp:
This file is invoked by the server end to start checking the code. It then executes Codechecker.cpp. Codechecker.cpp:
This file is generic. It contains the code to identify the extension of the user-code file, as well as the code common to all languages. The operations provided by this file include comparison of output, deletion of user-generated files and display of the result. This file then executes the language-specific file, Cplusplus.cpp. Cplusplus.cpp:
This is the language-specific file. To add support for other languages, files specific to those languages (like Cplusplus.cpp) have to be added. It executes the user code with the predefined test cases and ensures the time and memory limits.
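The time and memory limits that Cplusplus.cpp enforces via the ulimit command can equivalently be set from Python with the resource module. This is a hedged sketch under our own naming, not the project's actual mechanism, and it is Unix-only:

```python
import resource
import subprocess
import sys

def run_with_limits(cmd, time_limit_s, memory_limit_mb):
    """Run cmd in a child process with CPU-time and address-space limits,
    mirroring the `ulimit -t`/`-m` step of Phase 3 (Unix only)."""
    def set_limits():
        # applied in the child just before exec, so the parent is unaffected
        resource.setrlimit(resource.RLIMIT_CPU, (time_limit_s, time_limit_s))
        mem_bytes = memory_limit_mb * 1024 * 1024
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    proc = subprocess.run(cmd, preexec_fn=set_limits,
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return proc.returncode

# Example: a trivial program that finishes well within the limits.
rc = run_with_limits([sys.executable, "-c", "print('ok')"], 2, 512)
```

A child that exceeds the CPU limit is killed by SIGXCPU/SIGKILL, which the judge can then report as a time-limit verdict.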
Figure 4.18: ORDER OF EXECUTION IN CODECHECKER The order of execution of these files has been shown in Figure 4.18. Execution of User-Code
The user code is executed with various test cases. Each test case is handled by a separate process. Code Snippet:
Table 4.2 CODE SNIPPET USED FOR EXECUTING CODE
pid_t pids[MAX];
int status;
for (int i = 0; i < MAX; i++) {
    pids[i] = fork();
    if (pids[i] < 0) {
        abort();    /* fork failed */
    } else if (pids[i] == 0) {
        /* child process: run the user code on test case i */
        if (execute_file(i, problem_name, argv[4], executable_name, contest_name, executable_name) == -1)
            abort();
        exit(20);
    } else {
        /* parent process: wait for a child to finish */
        waitpid(-1, &status, 0);
    }
}
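After the child processes finish, Phases 5 and 6 above (comparison of output and deletion of user-generated files) can be sketched in Python as follows (the judge() helper and the path pairs are hypothetical, not the actual implementation):

```python
import os

def compare_output(generated_path, expected_path):
    # Phase 5: the user-code is correct only if the generated output
    # exactly matches the expected output for the test case.
    with open(generated_path) as gen, open(expected_path) as exp:
        return gen.read() == exp.read()

def cleanup(paths):
    # Phase 6: delete the user-specific files created during
    # compilation and execution.
    for path in paths:
        if os.path.exists(path):
            os.remove(path)

def judge(test_cases):
    # test_cases: list of (generated_path, expected_path) pairs,
    # one per test case; all must match for an accepted verdict.
    return "ACC" if all(compare_output(g, e)
                        for g, e in test_cases) else "WA"
```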
4.2.4 IMPLEMENTATION OF PLAGIARISM DETECTION
Figure 4.19: SOURCE CODE FILES USED FOR PLAGIARISM DETECTION

We modified the open source project [33] to implement plagiarism detection. The following source code files are used (also depicted in Figure 4.19):

filter_confset.rb

Configuration set used by the filter. A configuration set is a set of keywords plus the comment syntax specific to a language. It uses cpp.conf, which contains the keywords specific to C++ and a description of how comments are specified in C++.
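The idea of a language-specific configuration set can be sketched as follows (the keyword list is abbreviated and the structure is hypothetical; the real cpp.conf format may differ):

```python
# A configuration set = language keywords + comment syntax,
# as stored in cpp.conf for C++ (keyword list abbreviated).
CPP_CONFSET = {
    "keywords": {"int", "for", "while", "if", "else", "return", "class"},
    "line_comment": "//",
    "block_comment": ("/*", "*/"),
}

def is_keyword(token, confset=CPP_CONFSET):
    # Keywords are shared by every program in the language,
    # so they can be treated differently from identifiers.
    return token in confset["keywords"]
```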
noise_filter.rb
This file filters the source file by removing whitespace, capitalization, punctuation and identifiers. Keywords are always the same, so they need not be filtered out.

rollhasher.rb
Implements a rolling hash function similar to the one used in the Rabin-Karp algorithm.

winnower_confset.rb
Reads the configuration file to get the parameters for the winnower. It uses winnower.conf, which contains the values of k, t and q: k is the size of a fingerprint, t is the size of the window from which at least one fingerprint is chosen, and q is the prime number used for hashing.

robust_winnower.rb
Implements robust winnowing, generating fingerprints from the filtered text.

detect_plagiarism.rb
Filters the text from the two files and generates their fingerprints, then finds the common fingerprints and prints the percentage similarity on the basis of the common fingerprints.

plagiarism_detector.py
It runs the detect_plagiarism.rb file on all pairs of accepted solutions and prints those that have a percentage similarity above a threshold value (currently set to 50%).
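Taken together, the pipeline above can be sketched in Python as follows (the parameter values k = 5, window size 4 and prime q = 1000003 are illustrative, not the ones stored in winnower.conf):

```python
def filter_noise(text):
    # noise_filter.rb: drop whitespace, punctuation and capitalization
    return "".join(c for c in text.lower() if c.isalnum())

def rolling_hashes(text, k, q=1000003):
    # rollhasher.rb: Rabin-Karp style rolling hash over all k-grams
    base = 256
    high = pow(base, k - 1, q)   # weight of the char leaving the window
    h = 0
    hashes = []
    for i, c in enumerate(text):
        h = (h * base + ord(c)) % q
        if i >= k - 1:
            hashes.append(h)
            h = (h - ord(text[i - k + 1]) * high) % q
    return hashes

def winnow(hashes, window):
    # robust_winnower.rb: keep the minimum hash of each window,
    # guaranteeing at least one fingerprint per window
    fingerprints = set()
    for i in range(len(hashes) - window + 1):
        fingerprints.add(min(hashes[i:i + window]))
    return fingerprints

def similarity(a, b, k=5, window=4):
    # detect_plagiarism.rb: percentage of common fingerprints
    fa = winnow(rolling_hashes(filter_noise(a), k), window)
    fb = winnow(rolling_hashes(filter_noise(b), k), window)
    if not fa or not fb:
        return 0.0
    return 100.0 * len(fa & fb) / len(fa | fb)
```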
4.2.5 INTEGRATION WITH CPPCHECK
Figure 4.20: SOURCE CODE FILES USED FOR STATIC CODE ANALYSIS USING CPPCHECK

The following source code files (shown in Figure 4.20) were implemented and used for static code analysis:

InitiateCppcheck.py
Starts the other auxiliary scripts, which include cppdownload.py and cppmonitor.py.

cppdownload.py
It downloads the user submissions from the cppchecksubmissions bucket and saves them to a local cppsubmissions folder from where cppmonitor.py can access them.

cppmonitor.py
This script fetches submissions from the local cppsubmissions folder and sends them to the cppcheck tool, where the static analysis is done. It receives a verdict from cppcheck, converts it to HTML format, adds it to the user response file and uploads the file to the cppcheckresponses bucket. It processes submissions in parallel and uses a mutex to lock cppcheck, which is the resource common to all the threads.
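The parallel processing with a mutex around the shared cppcheck tool can be sketched like this (run_cppcheck shows the intended shell-out; the runner argument and file names are hypothetical, added here so the sketch can be exercised without cppcheck installed):

```python
import subprocess
import threading

cppcheck_lock = threading.Lock()   # cppcheck is the shared resource

def run_cppcheck(path):
    # Shell out to the cppcheck tool; its findings go to stderr.
    result = subprocess.run(["cppcheck", path],
                            capture_output=True, text=True)
    return result.stderr

def process_submissions(paths, runner=run_cppcheck):
    """Analyse each submission in its own thread, serialising the
    calls to the shared cppcheck tool with a mutex."""
    verdicts = {}

    def worker(path):
        with cppcheck_lock:          # lock the common resource
            verdicts[path] = runner(path)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return verdicts
```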
4.2.6 COMMENT CLASSIFIER

4.2.6.1 FINDING DATASET
We picked up the C++ source code datasets from the LLVM [30] and BOOST [31] libraries.

4.2.6.2 CREATING .ARFF
We picked up the comments from the above datasets, extracted features from those comments and created our own training data (.arff file) to be used by Weka. A portion of the dataset is shown in Table 4.3.

Table 4.3 PORTION OF THE TRAINING DATASET
14,'exceptiondemo_370',false,false,false,34,false,13,false,false,false,1,method
15,'exceptiondemo_478',false,false,false,5,false,7,false,true,false,2,inline
16,'exceptiondemo_496',false,false,false,4,false,8,false,true,false,2,inline
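Serialising extracted features into Weka's .arff format can be sketched as follows (the attribute names in the test below are hypothetical, since the real feature names are not reproduced in the table above):

```python
def to_arff(relation, attributes, rows):
    """Serialise training rows into Weka's .arff text format.
    `attributes` is a list of (name, type) pairs; nominal types are
    written as '{value1,value2,...}' as Weka expects."""
    lines = ["@RELATION " + relation, ""]
    for name, typ in attributes:
        lines.append("@ATTRIBUTE {} {}".format(name, typ))
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines)
```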
4.2.6.3 GOOD VS BAD COMMENTING
After classifying, we can tell about the quality of commenting from the results of the classification. A code is said to have good quality source code comments if:
1. A header comment is present.
2. There are at least as many method comments as there are methods.
3. There are 2 to 10 section comments, showing modularity in the code.
4. The number of task and code comments is as small as possible.
5. 0.5 * method comments < number of inline comments < 4 * method comments. A large number of inline comments would indicate more redundancy, while too few inline comments would indicate poor documentation.
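The five conditions above can be written directly as a predicate (the comment counts are assumed to come from the classifier's output; the function name and the threshold of 2 used for condition 4 are illustrative):

```python
def has_good_commenting(header, method_comments, methods,
                        section, task, code, inline):
    """Apply the five conditions for good quality source code comments."""
    return (header >= 1                          # 1. header comment present
            and method_comments >= methods       # 2. a comment per method, at least
            and 2 <= section <= 10               # 3. section comments show modularity
            and task + code <= 2                 # 4. few task/code comments (threshold illustrative)
            and 0.5 * method_comments < inline < 4 * method_comments)  # 5. proportionate inline comments
```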
4.2.7 USING GIT

We maintained a Git repository for working collaboratively:
Figure 4.21: GITHUB REPOSITORY USED FOR THE PROJECT

The codebase is available at https://github.com/tussharsingh13/coderunner (shown in Figure 4.21).
5. Observations and Results
5.1 RESULTS OF CODECHECKER

The following kinds of responses are given by the codechecker:

1. When the code passes all test cases, it gives the accepted (ACC) status, as shown in Figure 5.1:
Figure 5.1: WHEN CODE GETS ACCEPTED

2. When the code fails some test cases, it gives the wrong answer (WA) status, as shown in Figure 5.2:
Figure 5.2: WHEN CODE GIVES WRONG ANSWER

3. When the code fails during compilation, it gives the compilation error (CE) status, as shown in Figure 5.3:
Figure 5.3: WHEN CODE GIVES COMPILATION ERROR
4. When the code exceeds the time limit on some test case, it gives the time limit exceeded (TLE) status, as shown in Figure 5.4:
Figure 5.4: WHEN CODE EXCEEDS TIME LIMIT

5. When the code gives a runtime error, it gives the runtime error (RE) status, as shown in Figure 5.5:
Figure 5.5: WHEN CODE GIVES RUNTIME ERROR
5.2 PLAGIARISM DETECTION RESULTS

The MOSS-based plagiarism detector gave the results shown in Figure 5.6:
Figure 5.6: RESULTS AFTER PLAGIARISM CHECK
5.3 CPPCHECK RESULTS

The online judge gives an output as shown in Figure 5.7 after cppcheck has analysed the code.
Figure 5.7: RESULTS AFTER CPPCHECK ANALYSES THE CODE
5.4 COMPARISON WITH DIFFERENT ALGORITHMS

Table 5.1 COMPARISON OF DIFFERENT MACHINE LEARNING ALGORITHMS

Accuracy (%)       J48 (Trees)   NaiveBayes (Bayes)   Logistic (Functions)   ZeroR (Rules)   IBk (Lazy)
Percentage Split   93.94         84.85                87.88                  27.27           90.91
Cross validation   88.66         85.57                84.54                  24.74           91.75
Training set       96.91         90.72                100                    24.74           100
J48 gives the best results, as summarised in Table 5.1. The other algorithms either overfit the data set (Logistic and IBk reach 100% accuracy on the training set but score lower under cross validation) or have lower accuracy overall.
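ZeroR's low score in Table 5.1 follows directly from its definition: it ignores every feature and always predicts the majority class, so its accuracy equals the majority class frequency. A minimal sketch (the toy labels below are illustrative, not the project's dataset):

```python
from collections import Counter

def zero_r(train_labels):
    """ZeroR baseline: always predict the most frequent class seen
    during training, ignoring all features."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: majority

def accuracy(predict, data):
    # data: list of (features, label) pairs
    return 100.0 * sum(predict(x) == y for x, y in data) / len(data)
```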
6. Conclusion and Future Work

This chapter discusses the limitations and future scope of the project and finally summarises the conclusion.
6.1 CONCLUSION

Thus, we were able to implement an online judge that allowed contest management, code evaluation, static code quality analysis and plagiarism detection. We were also able to classify comments with decent accuracy.
6.2 LIMITATIONS

The following are the limitations of the project:
● False positives for section comments increase if the training dataset includes comments present inside a struct/method/class.
● When the number of download/upload requests is high, i.e. around 1000 per second, AWS S3 buckets are unable to handle the load.
● The online judge lacks security.
● For algorithms having a conventional implementation, such as Floyd-Warshall, the number of false positives during plagiarism detection increases.
● The online judge is currently implemented for C++ only.
● MOSS has the limitation that it can be broken by changing the structure of the code when there are several ways of writing the same logic.
● J48 is not an online algorithm; it needs to be retrained every time the data changes.
6.3 FUTURE SCOPE

● Another category of comments, reference comments, can be added; these are used to refer to the existing codes and libraries used.
● Automatic test case generation can be added using libraries like NetworkX for graph-based problems.
● Support for other languages like C, Python, Java etc. can be provided.
● The interface can be made more user friendly and advanced.
● The judge currently provides a platform for contests only. It can act as a platform for problems as well, i.e. it can maintain a set of problems for practice.
● The Elastic Compute Cloud (EC2) service can be used in place of S3 so that computations happen directly on the cloud and the time wasted in download/upload on the server end is saved. This will make the judge efficient, since in EC2 the number of computing instances scales up as per the load.
● Security features like chroot and iptables can be added to the online judge to make it secure.
● Advanced sandboxing techniques like seccomp can be used to provide a high level of security.
● The usefulness and coherence of the comments can be analysed after classification, for example by parsing the function name and the comment and then evaluating the percentage of common words between the two to assess the usefulness of the comment and its coherence with the code.
REFERENCES
[1] Shaun Bebbington (2014). "What is coding". Retrieved 2014-03-03.
[2] Shaun Bebbington (2014). "What is programming". Retrieved 2014-03-03.
[3] Lauren Orsini (2013). "Why Programming Is The Core Skill Of The 21st Century". Available at: http://readwrite.com/2013/05/31/programming-core-skill-21st-century
[4] Penny Grubb and Armstrong Takang. Software Maintenance: Concepts and Practice, 2003, pp. 7, 120-121.
[5] Nations, Daniel. "Web Applications". About.com. Retrieved 20 January 2014.
[6] Kaushik MV (2013). "Learn to code by Competitive Programming". Available at: http://blog.hackerearth.com/2013/09/competitive-programming-getting-started_11.html
[7] "Does this competitive programming really helps in industry". Available at: http://discuss.codechef.com/questions/46837/does-this-competitive-programming-really-helps-in-industry
[8] "About Ruby". Available at: https://www.ruby-lang.org/en/about/
[9] "What is Rails". Available at: https://www.ruby-lang.org/en/about/
[10] "Ruby on Rails: What It is and Why We use it for Web Applications". Available at: http://www.bitzesty.com/blog/2014/7/ruby-on-rails-what-it-is-and-why-we-use-it-for-web-applications
[11] Richard Delaney (2013). "Python Scripts as a Replacement for Bash Utility Scripts". Available at: http://www.linuxjournal.com/content/python-scripts-replacement-bash-utility-scripts
[12] "Amazon EC2". Amazon Elastic Compute Cloud (Amazon EC2), Cloud Computing Servers. Aws.amazon.com (2014-07-01). Retrieved on 2014-07-01.
[13] "Amazon S3". Available at: http://aws.amazon.com/s3/
[14] "Amazon EC2". Available at: http://aws.amazon.com/ec2/
[15] J. Paul Gibson, "Software Reuse and Plagiarism: A Code of Practice", in ITiCSE'09, July 6-9, 2009, Paris, France.
[16] Saul Schleimer, et al., "Winnowing: Local Algorithms for Document Fingerprinting", SIGMOD 2003, June 9-12, 2003, San Diego, CA.
[17] "Overview of Plagiarism Detection Software". Available at: http://www.cshe.unimelb.edu.au/assessinglearning/03/plagsoftsumm1.html
[18] P. Emanuelsson, U. Nilsson, "A Comparative Study of Industrial Static Analysis Tools", Electronic Notes in Theoretical Computer Science, Vol. 217, pp. 5-21, 2008.