Joymallya Chakraborty

I am currently a Ph.D. student in the RAISE Lab (Real-world Artificial Intelligence for Software Engineering) at North Carolina State University, under the supervision of Dr. Tim Menzies. My interests include machine learning, data mining, and optimization.

Before coming to NC State, I was a full-stack software developer at TCG Digital. I obtained my bachelor's degree in Computer Science from Jadavpur University.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  GitHub


My research interests include using data mining and artificial intelligence methods to solve real-world problems in the software engineering field. Previously, I worked on finding "discrimination" on social coding platforms. Currently, my research focuses on finding and mitigating algorithmic "bias" in machine learning models.


Software Engineering for Fairness: A Case Study with Hyperparameter Optimization

Joymallya Chakraborty, Tianpei Xia, Fahmid M. Fahid, Tim Menzies
ASE 2019 (Under review)

Machine learning software is increasingly being used to make decisions that affect people's lives. Potentially, the application of that software will result in fairer decisions because (unlike humans) machine learning software is not biased. However, recent results show that the software within many data mining packages exhibits "group discrimination"; i.e., its decisions are inappropriately affected by "protected attributes" (e.g., race, gender, age, etc.). This paper shows that making fairness a goal during hyperparameter optimization can preserve the predictive power of a model learned from a data miner while also generating fairer results. To the best of our knowledge, this is the first application of hyperparameter optimization as a tool for software engineers to generate fairer software.


Predicting Breakdowns in Cloud Services (with SPIKE)

Jianfeng Chen, Joymallya Chakraborty, Tim Menzies, Philip Clark, Kevin Haverlock, Snehit Cherian
FSE 2019

Maintaining web services is a mission-critical task. Any downtime of web-based services means loss of revenue. Worse, such downtime can damage the reputation of an organization as a reliable service provider (and in the current competitive web services market, such a loss of reputation causes extensive loss of future revenue). To address this issue, we developed SPIKE, a data mining tool which can predict upcoming service breakdowns half an hour into the future.


Why Software Projects need Heroes

Suvodeep Majumder, Joymallya Chakraborty, Amritanshu Agrawal, Tim Menzies
TSE 2019 (Under review)

A "hero" project is one where 80% or more of the contributions are made by 20% of the developers. In the literature, such projects are deprecated since they might cause bottlenecks in development and communication. This paper explores the effect of having heroes in a project, from a code quality perspective. After experimenting on 1100+ GitHub projects, we conclude that heroes are a very useful part of modern open source projects.


Measuring the Effects of Gender Bias on GitHub

Nasif Imtiaz, Justin Middleton, Joymallya Chakraborty, Neill Robson, Gina Bai, Emerson Murphy-Hill
ICSE 2019

Diversity, including gender diversity, is valued by many software development organizations, yet the field remains dominated by men. One reason for this lack of diversity is gender bias. In this paper, we study the effects of that bias by using an existing framework derived from the gender studies literature. We adapt the four main effects proposed in the framework by posing hypotheses about how they might manifest on GitHub, then evaluate those hypotheses quantitatively. While our results show that effects are largely invisible on the GitHub platform itself, there are still signals of women concentrating their work in fewer places and being more restrained in communication than men.


Algorithms for generating all possible spanning trees of a simple undirected connected graph: an extensive review

Maumita Chakraborty, Sumon Chowdhury, Joymallya Chakraborty, Ranjan Mehera, Rajat Kumar Pal
Complex & Intelligent Systems (Springer), 2018

Generation of all possible spanning trees of a graph is a major area of research in graph theory, as the number of spanning trees of a graph increases exponentially with graph size. Several algorithms of varying efficiency have been developed since the early 1960s by researchers around the globe. This article is an exhaustive literature survey of these algorithms, assuming the input to be a simple undirected connected graph of finite order, and contains detailed analysis and comparisons of both the theoretical and experimental behavior of these algorithms.

Industrial Experience

Software Engineer Research Intern

May 2019 - August 2019 (Bellevue, Seattle)

I worked on post-training quantization of ONNX and TensorFlow deep learning models and on computational graph optimization in ONNX Runtime.

May 2018 - August 2018 (Bellevue, Seattle)

I explored optimization opportunities in .NET Core garbage collection and implemented proof-of-concept (PoC) prototypes. The prototypes were then verified against different workloads.


Software Developer

July 2015 - June 2017 (Salt Lake, Kolkata)

I was a core developer on two different projects. For the first, I designed and developed a B2B travel search engine, where I was responsible for implementing middleware services and integrating them with the front end. For the second, I designed and implemented business intelligence software that retrieves, analyzes, transforms, and reports data. It allows users to create different dashboards using its own customizable visualizations. It also features advanced analytics concepts such as data modelling, forecasting, and determining product affinity.



I attended the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019) in Tallinn, Estonia, and presented two papers there. The first paper, TERMINATOR: Better Automated UI Test Case Prioritization, is related to test case prioritization, and the second paper, Predicting Breakdowns in Cloud Services (with SPIKE), is related to cloud computing.

TA Experience

CSC230 - Fall 2017 (C and Software Tools)

CSC326 - Spring 2018 (Software Engineering)

CSC520 - Fall 2018 (Artificial Intelligence)

Last updated: 05/14/2019