Roommate Pairing Algorithm Project Report
SOFT COMPUTING TOOLS IN ENGINEERING 2015 Authored by: Group 9
Abstract
This project addresses a typical problem in the development of a smart campus. Roommates in college hostels are usually paired by an essentially random procedure. This leads to conflicts between students and affects the lives of everyone concerned. A better approach is to record the students' preferences and match them using this data. Doing this manually is almost impossible, and conventional hard-computing tools are inefficient and hard to program for this task. An evolutionary algorithm is therefore a good alternative. This project uses a genetic algorithm to pair the residents of a hall for room allotment.
Introduction
Roommate Pairing Algorithm | 12-Apr-15
The theme of the project is ‘Smart Campus’. A smart college or university campus is one that applies modern technology wherever it is useful. Some salient features of a smart campus are:
Energy Use Optimization
Water Use Optimization
Local Weather Forecasting
Analysis and Control of Mutual Harmony among Students
Smart Medical Condition Monitoring
Health and Gym Routine Allotment
Smart Career and Curriculum Counseling
Smart Parking Bay Locator
Smart Attendance System
For this project report, we consider only one of these aspects: analysis and control of mutual harmony among students. Harmony among students is very important. Two roommates who are well suited to each other will not only live in greater peace but may also become friends for life. Good friendships and fewer conflicts ensure a better life on campus for everyone. For this purpose, we have developed a genetic algorithm that pairs roommates according to their preferences. A genetic algorithm mimics the process of natural selection, using mutation, crossover, inheritance and selection to search for the best solution. Our algorithm reports the fitness of each pair, which indicates their compatibility. The Hostel Management Council can use this fitness to prevent common conflicts, maintain harmony among students and provide counselling when required.
Project Methodology
Data Collection
First, the variables to be considered were decided, and several observations were recorded. The data was collected using a web form that was shared on Facebook.
Fitness Calculation
The following rules are used to calculate the fitness of each pair:
If both roommates smoke or both do not smoke, fitness is increased by 0.5; otherwise it is decreased by 0.2.
If both roommates describe themselves as morning people or both as night people, fitness is increased by 0.3.
If both roommates are hard/medium workers or both work only when necessary, fitness is increased by 0.5; otherwise it is decreased by 0.2.
If both roommates prefer the same level of tidiness in the room, fitness is increased by 0.3; otherwise it is decreased by 0.2.
If both roommates listen to the same genre of music, fitness is increased by 0.5; otherwise it is decreased by 0.2.
If the roommates fulfil each other's alcohol preference, fitness is increased by 0.4; otherwise it is decreased by 0.1.
If the roommates fulfil each other's outgoing/flexible/reserved preference, fitness is increased by 0.4 for each of them; otherwise it is decreased by 0.1.
If both roommates have the same level of athleticism, fitness is increased by 0.5.
Here, we have considered only some of the parameters. The remaining parameters were left out because too little data was available for them and because they have a smaller effect on mutual harmony.
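The rules above can be sketched as a single scoring function. This is a minimal illustration rather than the project's own code: the field names (smoker, sleep_cycle, alcohol_pref and so on) are hypothetical stand-ins for the survey columns, which the listing at the end of this report indexes by position. Following that listing, the two preference rules are applied once in each direction.

```r
# Sketch of the pair-fitness rules; all field names are hypothetical.
pair_fitness <- function(a, b) {
  score <- 0
  # Rules with a reward for a match and a penalty for a mismatch: c(reward, penalty).
  two_sided <- list(smoker = c(0.5, 0.2), work_ethic = c(0.5, 0.2),
                    tidiness = c(0.3, 0.2), music = c(0.5, 0.2))
  for (field in names(two_sided)) {
    w <- two_sided[[field]]
    score <- score + (if (a[[field]] == b[[field]]) w[1] else -w[2])
  }
  # Rules with a reward only.
  one_sided <- c(sleep_cycle = 0.3, athleticism = 0.5)
  for (field in names(one_sided)) {
    if (a[[field]] == b[[field]]) score <- score + one_sided[[field]]
  }
  # Preference rules: one person's trait is compared with the other's stated preference.
  cross <- list(c("alcohol", "alcohol_pref"), c("sociability", "sociability_pref"))
  for (p in cross) {
    score <- score + (if (a[[p[1]]] == b[[p[2]]]) 0.4 else -0.1)
    score <- score + (if (b[[p[1]]] == a[[p[2]]]) 0.4 else -0.1)
  }
  score
}

# Two illustrative residents who differ only in music taste.
p1 <- list(smoker = "no", work_ethic = "hard", tidiness = "high", music = "rock",
           sleep_cycle = "night", athleticism = "low", alcohol = "no",
           alcohol_pref = "no", sociability = "outgoing", sociability_pref = "outgoing")
p2 <- p1
p2$music <- "jazz"
pair_fitness(p1, p2)
```

Keeping the weights in small tables makes it easy to add a parameter later without touching the scoring logic.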
Genetic Algorithm
The genetic algorithm is a machine-learning model that derives its behaviour from a metaphor of the processes of evolution in nature. A population of individuals is created within a machine, each represented by chromosomes: in essence, a set of character strings analogous to the base-4 chromosomes that we see in our own DNA. The individuals in the population then go through a process of evolution.
Residents are divided into two groups (called buildings in the code). The residents in each group are paired randomly, and the fitness of each room and of each building as a whole is calculated. All the generated data is stored in a structure called a ‘Solution’. Multiple Solutions are generated, and solutions are selected according to their total fitness. Crossover is applied with a probability of 0.8: two random solutions are selected, either the first or the second group (chosen at random) is exchanged between them, and the corresponding fitness is updated. Mutation is applied with a probability of 0.05: two random rooms in a random building of a random solution are chosen, one randomly picked occupant of the first room is swapped with one randomly picked occupant of the second, and the corresponding fitness is updated. This cycle is run for a fixed number of iterations (40 in this case).
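The selection step can be illustrated with a short fitness-proportionate (roulette-wheel) sketch. In the listing at the end of this report, ecount divides each solution's fitness by the average and probselection divides the result by the population size, which together amount to dividing each fitness by the population's total fitness. The function below does the same in vectorised form. It is a sketch under one assumption that the report does not state: every fitness value is strictly positive. The name roulette_select is hypothetical.

```r
# Fitness-proportionate (roulette-wheel) selection; assumes strictly positive fitness.
roulette_select <- function(fitness, n = length(fitness)) {
  stopifnot(length(fitness) > 0, all(fitness > 0))
  p <- fitness / sum(fitness)          # probability of selection
  cum <- cumsum(p)                     # cumulative probability
  u <- runif(n)                        # one random draw per slot in the mating pool
  # findInterval counts how many cumulative values are <= each draw.
  idx <- findInterval(u, cum) + 1L
  pmin(idx, length(fitness))           # guard against rounding in the last cumulative value
}

set.seed(1)
roulette_select(c(12.4, 9.8, 15.1, 11.0))  # indices of the solutions copied into the mating pool
```

Because total fitness in this project can in principle be zero or negative (mismatches subtract from it), a real implementation would need to shift or rank the values before applying this scheme.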
The rooms with the maximum fitness value are stored in a separate, permanent roommates' list. The people who have already been paired are removed from the original residents' list, and the remaining list goes through the whole process again until it is exhausted and everyone is paired.
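This outer loop can be sketched as follows. The function find_best_pairs is a hypothetical stand-in for one complete genetic-algorithm run; the sketch only assumes that it returns a two-column matrix of resident ids with at least one row.

```r
# Outer loop sketch: keep the best pairs from each run, drop those residents, repeat.
pair_all <- function(ids, find_best_pairs) {
  paired <- matrix(nrow = 0, ncol = 2)
  while (length(ids) >= 2) {
    best <- find_best_pairs(ids)
    stopifnot(is.matrix(best), ncol(best) == 2, nrow(best) > 0)
    paired <- rbind(paired, best)
    ids <- setdiff(ids, as.vector(best))  # remove everyone who now has a roommate
  }
  paired
}

# Trivial stand-in that simply pairs the first two remaining residents.
first_two <- function(ids) matrix(ids[1:2], ncol = 2)
pair_all(1:6, first_two)
```

With an odd number of residents the loop stops when one person is left, so that case would need separate handling.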
The paired list is exported to a CSV file.
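A minimal sketch of such an export is shown below. The function name export_pairs, the column names and the file name are illustrative assumptions; the three input vectors correspond to pmates1, pmates2 and pmates3 in the listing.

```r
# Export sketch; function name, column names and file name are illustrative.
export_pairs <- function(mate1, mate2, fitness, path) {
  pairs <- data.frame(roommate1 = mate1, roommate2 = mate2, fitness = fitness,
                      stringsAsFactors = FALSE)
  write.csv(pairs, file = path, row.names = FALSE)
  invisible(pairs)
}

out <- file.path(tempdir(), "roommate_pairs.csv")
export_pairs(c("A", "C"), c("B", "D"), c(2.6, 1.9), out)
```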
Steps in Genetic Algorithm
Limitations
Our program has the following limitations:
The solution in the considered case has three levels, as opposed to the optimization of mathematical functions, where there are only two levels.
The considered case has multiple solutions, namely the different pairs.
Both of the above factors add to the computation time of the algorithm and to the difficulty of coding it.
Because groups are crossed over between two solutions, people can be repeated within a solution. These repetitions have to be removed, which wastes many solutions and increases the running time as well.
The algorithm has not yet been designed to handle males and females separately, owing to the lack of female entries in the sample data. It can, however, be run on males and females separately to get the desired result.
Because the problem is modelled unconventionally, it is difficult to code and debug.
The program does not have a Graphical User Interface, owing to internal limitations of the R language.
The program does not yet take into account special preferences of students that are not included in the form.
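The repetition problem caused by crossover can be detected with a simple check. The sketch below assumes, as in the listing, that a solution holds two room matrices whose first two columns are resident ids; the function name has_repeats is hypothetical.

```r
# Returns TRUE when any resident id appears more than once across both groups.
has_repeats <- function(group1, group2) {
  ids <- c(as.vector(group1[, 1:2]), as.vector(group2[, 1:2]))
  anyDuplicated(ids) > 0
}

g1 <- cbind(r1 = c(1, 2), r2 = c(3, 4), fitness = 0)
g2 <- cbind(r1 = c(5, 6), r2 = c(7, 8), fitness = 0)
g3 <- cbind(r1 = c(1, 6), r2 = c(7, 8), fitness = 0)  # resident 1 also lives in g1
has_repeats(g1, g2)
has_repeats(g1, g3)
```

Running such a check right after crossover would allow an invalid solution to be discarded or repaired before its fitness is evaluated.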
Future Projections
Our program can be improved and expanded in the following ways:
The code can be improved and consolidated to reduce the running time, take gender into account and provide a Graphical User Interface.
The input variables can be extended after conducting a psychological study of the factors that affect harmony between roommates.
The algorithm can be adapted into a match-making algorithm that provides dating and marriage-related advice.
The algorithm can be extended to include in the result the factors that reduce a pair's fitness. These factors will help the Hostel Council understand conflict points better.
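The last projection could start from something as small as the sketch below, which lists the fields on which two residents differ. It is only an illustration of the idea: the function name conflict_points and the field names are hypothetical, and it treats every mismatch alike rather than using the fitness weights.

```r
# Returns the names of the fields on which two residents differ.
conflict_points <- function(a, b, fields = intersect(names(a), names(b))) {
  differs <- vapply(fields, function(f) !identical(a[[f]], b[[f]]), logical(1))
  fields[differs]
}

r1 <- list(smoker = "no", music = "rock", tidiness = "high")
r2 <- list(smoker = "yes", music = "rock", tidiness = "low")
conflict_points(r1, r2)
```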
Summary
A genetic algorithm is an efficient and ‘much easier to code’ method for arriving at a near-optimal solution. Here, a genetic algorithm has been applied to an unconventionally modelled problem to arrive at the best arrangement of roommates. The program can be improved considerably to remove its limitations and to implement the future projections.
Codes
‘initialallot.R’
##divides residents into two groups for the sake of crossover
initialbuildingallot <- function(res) {
  br <- sample(1:nrow(residents))
  b1 <- sample(br, buildingcapacity , replace=F)
  b2 <- setdiff(br, b1)
  blist <- list("b1" = b1, "b2" = b2)
}
##randomly pair the residents of one building
initialroomallot <- function(buil){
  builcap <- length(buil)
  fitness <- vector(mode="numeric",length = builcap/2)
  randombuilres <- sample(buil)
  r1 <- sample(randombuilres, builcap/2, replace=F)
  r2 <- setdiff(randombuilres,r1)
  rooms <- cbind(r1,r2,fitness)
  rooms
}
##update the fitness function of each pair
updatefitness <- function(res,rooms){
  for (i in 1:nrow(rooms)){
    x <- rooms[[i,1]]
    y <- rooms[[i,2]]
    rooms[i,3] <- 0
    if(res[x,6] == res[y,6]) {
      rooms[i,3] <- rooms[i,3] + 0.5
    } else {
      rooms[i,3] <- rooms[i,3] - 0.2
    }
    if(res[x,7] == res[y,7]) {
      rooms[i,3] <- rooms[i,3] + 0.3
    }
    if(res[x,8] == res[y,8]) {
      rooms[i,3] <- rooms[i,3] + 0.5
    }
    if(res[x,9] == res[y,9]) {
      rooms[i,3] <- rooms[i,3] + 0.5
    } else {
      rooms[i,3] <- rooms[i,3] - 0.2
    }
    if(res[x,10] == res[y,10]) {
      rooms[i,3] <- rooms[i,3] + 0.3
    } else {
      rooms[i,3] <- rooms[i,3] - 0.2
    }
    if(res[x,11] == res[y,12]) {
      rooms[i,3] <- rooms[i,3] + 0.4
    } else {
      rooms[i,3] <- rooms[i,3] - 0.1
    }
    if(res[x,12] == res[y,11]) {
      rooms[i,3] <- rooms[i,3] + 0.4
    } else {
      rooms[i,3] <- rooms[i,3] - 0.1
    }
    if(res[x,13] == res[y,13]) {
      rooms[i,3] <- rooms[i,3] + 0.5
    } else {
      rooms[i,3] <- rooms[i,3] - 0.2
    }
    if(res[x,15] == res[y,16]) {
      rooms[i,3] <- rooms[i,3] + 0.4
    } else {
      rooms[i,3] <- rooms[i,3] - 0.1
    }
    if(res[x,16] == res[y,15]) {
      rooms[i,3] <- rooms[i,3] + 0.4
    } else {
      rooms[i,3] <- rooms[i,3] - 0.1
    }
  }
  rooms
}
##calculates the sum of the fitness function of one set of groups/one solution
solutionfitness <- function(r1,r2){
  sum(r1[ ,3])+sum(r2[ ,3])
}
‘solutionmaker.R’
##makes a solution where a solution consists of two groups of residents paired randomly
solutionmaker <- function(){
  source("initialallot.R")
  names(residents)
‘mainfunctions.R’
##calculates the average fitness of each solution
averagefitness <- function(pop){
  sum <- 0
  for(i in 1:length(pop)){
    sum = sum + pop[[i]][[3]]
  }
  sum/length(pop)
}
##calculates the expected count of each solution
ecount <- function(pop){
  af <- averagefitness(pop)
  ec <- vector(length=length(pop))
  for(i in 1:length(pop)){
    ec[i] = pop[[i]][[3]]/af
  }
  ec
}
##calculates probability of selection for each solution
probselection <- function(ec){ ec/length(ec) }
##calculates the cumulative probability of selection
cprobselection <- function(ps){
  cs <- vector(length=length(ps))
  for(i in 1:length(ps)){
    cs[i] = sum(ps[1:i])
  }
  cs
}
##generates the vector which will be used to populate the mating pool
matingvector <- function(cs){
  a <- runif(length(cs))
  g <- vector("numeric",length=length(cs))
  for(j in 1:length(a)){
    for(i in 1:length(cs)){
      if(i==1){
        if(a[j]<=cs[i]){
          g[j] <- i
        }
      } else if(a[j]>=cs[i-1] & a[j]<=cs[i]){
        g[j] <- i
      }
    }
  }
  g
}
##uses the mating vector to generate the mating pool
matingpoolmaker <- function(matvec,pop){
  l <- list()
  for (i in 1:length(pop)){
    l[[i]]<-pop[[matvec[i]]]
  }
  l
}
##crossover: selects two random solutions and exchanges one of the buildings between them
crossover <- function(pop){
  if(runif(1)<=0.8){
    change1 <- sample(1:length(pop),1)
    change2 <- sample(1:length(pop),1)
    site <- sample(1:2,1)
    temp <- pop[[change1]][[site]]
    pop[[change1]][[site]] <- pop[[change2]][[site]]
    pop[[change2]][[site]] <- temp
  }
  pop
}
##mutate: selects two rooms randomly from any random building of a random solution and exchanges one of the roommates
mutate <- function(pop,residents){
  if (runif(1) <= 0.05){
    msite <- sample(1:length(pop),1)
    bsite <- sample(1:2,1)
    mr1 <- sample(1:nrow(pop[[1]][[bsite]]),1)
    mr2 <- sample(1:nrow(pop[[1]][[bsite]]),1)
    rr1 <- sample(1:2,1)
    rr2 <- sample(1:2,1)
    temp <- pop[[msite]][[bsite]][[mr1, rr1]]
    pop[[msite]][[bsite]][[mr1, rr1]] <- pop[[msite]][[bsite]][[mr2, rr2]]
    pop[[msite]][[bsite]][[mr2, rr2]] <- temp
    pop[[msite]][[bsite]]
##search the maximum fitness function in the whole population
bestfitness <- function(population){
  ##maximum room fitness over both buildings of every solution
  max(sapply(population, function(s) max(s[[1]][ ,3], s[[2]][ ,3])))
}
‘body.R’
##the MAIN function
algo <- function(residents){
  source("solutionmaker.R")
  source("mainfunctions.R")
  ##if the total number of residents is less than the group/building capacity, it decreases the capacity
  if(nrow(residents) <= buildingcapacity && nrow(residents) > 2){
    buildingcapacity <<- buildingcapacity - 2
  }
  ##if the total number of residents is two, directly pair them
  if(nrow(residents) <= 2){
    pmates1 <<- append(pmates1,as.vector(residents$name[[1]]))
    pmates2 <<- append(pmates2,as.vector(residents$name[[2]]))
    pmates3 <<- append(pmates3,c(0))
    residents <<- NULL
    print("All Alloted!")
    stop()
  }
  ##generates a population of 10 solutions
  population <- list(solutionmaker(),solutionmaker(),solutionmaker(),
                     solutionmaker(),solutionmaker(),solutionmaker(),
                     solutionmaker(),solutionmaker(),solutionmaker(),
                     solutionmaker())
  ##the population goes through 40 iterations of natural selection, crossover and mutation
  for(i in 1:40){
    ec <- ecount(population)
    ps <- probselection(ec)
    cs <- cprobselection(ps)
    g <- matingvector(cs)
    l <- matingpoolmaker(g,population)
    ##crossover works on the mating pool so that the selection step takes effect
    population <- crossover(l)
    population <- updatecrossfitness(population)
    population <- mutate(population,residents)
  }
  ##calculates best fitness value
  bf <- bestfitness(population)
  ##generation of empty vectors to store the roommates with maximum fitness
  fr1 <- vector("numeric")
  fr2 <- vector("numeric")
  ff <- vector("numeric")
  ##generates and stores the roommate pairs with the maximum fitness
  d <- 1
  for(i in 1:length(population)){
    for(k in 1:2){
      for(h in 1:nrow(population[[i]][[k]])){
        if(population[[i]][[k]][[h,3]] == bf){
          fr1[d] <- population[[i]][[k]][[h,1]]
          fr2[d] <- population[[i]][[k]][[h,2]]
          ff <- bf
          d = d+1
        }
      }
    }
  }
  ##calculates the best pairs
  bestpairs <- cbind(fr1,fr2,ff)
  ##elimination of roommate pairs where both the roommates are the same person
  f <- vector()
  x <- 1 ##index into f; it must be initialised before the loop
  for(i in 1:nrow(bestpairs)){
    if(bestpairs[i,1]==bestpairs[i,2]){
      f[x] <- i
      x <- x+1
    }
  }
  if(length(f)!=0){
    bestpairs <- bestpairs[-f, ]
  }
  ##elimination of roommate pairs where the same pair is repeated multiple times
  if(sum(duplicated(as.vector(bestpairs[ ,1:2])))>0){
    xx <- 1
  } else if(sum(duplicated(as.vector(bestpairs[ ,1:2])))==0){
    exclude <- unique(as.vector(bestpairs[ ,1:2]))
    residents <<- residents[-exclude, ]
    for(i in 1:nrow(bestpairs)){
      pmates1 <
‘Main Script.R’
source("body.R")
##generation of empty vectors to store roommates' name and fitness
pmates1 <- vector("character")
pmates2 <- vector("character")
pmates3 <- vector("numeric")
##importing the database
residents <<-read.csv("Housing Survey.csv")
##stores the roommates name in global vector
names(residents)=2){ algo(residents) }
##displays the final paired data in the console
data.frame(pmates1,pmates2,pmates3)