From c6b2cb5e4957e675a84f06e643b7fa64e63f2da6 Mon Sep 17 00:00:00 2001
From: "Sean P. Goggins"
Date: Wed, 12 Feb 2025 15:20:37 -0600
Subject: [PATCH] Update gsoc-ideas.md

Signed-off-by: Sean P. Goggins
---
 gsoc-ideas.md | 148 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/gsoc-ideas.md b/gsoc-ideas.md
index 1cf74e11e..d237d9e37 100644
--- a/gsoc-ideas.md
+++ b/gsoc-ideas.md
@@ -50,6 +50,154 @@ The aims of the project are as follows:
  - (Stretch Goal) Improve the operation of the overall machine learning insights pipeline in CHAOSS/augur, and generalize these capabilities.
+
+## IDEA: Implement Conversion Rate Metric in CHAOSS Software
+
+**Hours: 350**
+
+[Micro-tasks and place for questions](https://github.com/chaoss/community/issues/305)
+
+### Conversion Rate
+
+Question: What are the rates at which new contributors become more sustained contributors?
+
+### Description
+
+The conversion rate metric is primarily aimed at identifying how new community members become more sustained contributors over time. However, the conversion rate metric can also help understand the changing roles of contributors, how a community is growing or declining, and paths to maintainership within an open source community.
+
+### Objectives (why)
+ - Observe if new members are becoming more involved with an open source project
+ - Observe if new members are taking on leadership roles within an open source project
+ - Observe if outreach efforts are generating new contributors to an open source project
+ - Observe if outreach efforts are impacting roles of existing community members
+ - Observe if community conflict results in changing roles within an open source community
+ - Identify casual, regular, and core contributors
+
+### Implementation
+
+This project could be implemented using either the CHAOSS/Augur or CHAOSS/GrimoireLab (including stack components noted in references) technology stacks.
+
+The aims of the project are as follows:
+ - Implement the Conversion Rate Metric in CHAOSS Software
+ - After discussion, consider which CHAOSS Software Stack you wish to work with
+ - In collaboration with mentors, define the technology framework and an initial path to a "hello world" version of the metric
+ - Iterative development of the metric
+ - Assist in the deployment of this metric for a pre-determined collection of repositories in a publicly viewable website linked to the CHAOSS project.
+ - Advance the work of the [chaoss metrics models working group](https://github.com/chaoss/wg-metrics-models).
+
+* _Difficulty:_ Medium
+* _Requirements:_ Knowledge of Python is desired. Some knowledge of JavaScript or twitter/bootstrap is also desired. Key requirement is a keenness to dig into this challenge!
+* _Recommended:_ Python experience.
+* _Mentors:_ Sean Goggins, Daniel Izquierdo
+
+#### Filters (optional)
+ - Commits
+ - Issue creation
+ - Issue comments
+ - Change request creation
+ - Change request comments
+ - Merged change requests
+ - Code Reviews
+ - Code Review Comments
+ - Reactions (emoji)
+ - Chat platform messages
+ - Mailing list messages
+ - Meetup attendance
+
+#### Visualizations
+
+![](./images/gsoc-1.png)
+
+Source: https://chaoss.github.io/grimoirelab-sigils/assets/images/screenshots/sigils/overall-community-structure.png
+
+![](./images/gsoc-2.png)
+
+Source: https://opensource.com/sites/default/files/uploads/2021-09-15-developer-level-02.png
+
+#### Tools Providing the Metric
+ - GrimoireLab
+ - Augur
+ - openEuler Infra
+
+#### Data Collection Strategies
+
+The following is an example from the [openEuler](https://www.openeuler.org/en/) community:
+ - A group of people who attended an offline event A held by the community can be identified as Group A. Demographic information of Group A could be fetched from an online survey when people register for the event. To identify the conversion rate of these participants:
+ - Some people from Group A started watching and forking the repos, indicating they have shown some interest in this community. We marked them as subgroup D0 (Developer Level 0) as a subset of Group A.
+ - Conversion rate from the total number of people in Group A to the number of people in subgroup D0 is: D0/Group A
+ - Some people from subgroup D0 make more contributions beyond just watching or forking, including creating issues, making comments on an issue, or performing a code review. We marked them as subgroup D1 (Developer Level 1) as a subset of D0.
+ - Conversion rate from the total number of people in subgroup D0 to the number of people in subgroup D1 is: D1/D0.
+ - Some people from subgroup D1 continue to make more contributions, like code contributions, to the project. This could include creating merge requests and merging new project code. We marked them as subgroup D2 (Developer Level 2) as a subset of D1.
+ - Conversion rate from the total number of people in subgroup D1 to the number of people in subgroup D2 is: D2/D1.
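The funnel described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of Augur or GrimoireLab: the function names `conversion_rate` and `funnel_rates` and the contributor identifiers are hypothetical, and the code assumes each group is already available as a set of contributor IDs. Each level is intersected with the previous one so that D0 is a subset of Group A, D1 of D0, and D2 of D1, as the example requires.

```python
from typing import Dict, Set


def conversion_rate(source: Set[str], target: Set[str]) -> float:
    """Return the share of `source` members who also appear in `target`."""
    if not source:
        # An empty source group has no one to convert; report 0.0 instead of dividing by zero.
        return 0.0
    return len(source & target) / len(source)


def funnel_rates(
    group_a: Set[str], d0: Set[str], d1: Set[str], d2: Set[str]
) -> Dict[str, float]:
    """Compute the three stage-to-stage conversion rates of the funnel."""
    # Enforce the nesting used in the example: D0 within Group A, D1 within D0, D2 within D1.
    d0 = d0 & group_a
    d1 = d1 & d0
    d2 = d2 & d1
    return {
        "Group A -> D0": conversion_rate(group_a, d0),
        "D0 -> D1": conversion_rate(d0, d1),
        "D1 -> D2": conversion_rate(d1, d2),
    }


if __name__ == "__main__":
    # Hypothetical contributor IDs, for illustration only.
    attendees = {"ana", "ben", "chen", "dev", "eli", "fay", "gus", "hui"}
    watchers = {"ana", "ben", "chen", "dev"}
    reviewers = {"ana", "ben", "zed"}  # "zed" never attended, so is dropped
    mergers = {"ana"}
    for stage, rate in funnel_rates(attendees, watchers, reviewers, mergers).items():
        print(f"{stage}: {rate:.2f}")
```

Working with sets rather than raw counts keeps the subset relationship explicit: a contributor who reaches a later level without passing through the earlier one does not inflate the rate.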
+
+![](./images/gsoc-3.png)
+
+ Definition:
+ - Developer Level 0 (D0) example: Contributors who have given the project a star, or are watching or have forked the repository
+ - Developer Level 1 (D1): Contributors who have created issues, made comments on an issue, or performed a code review
+ - Developer Level 2 (D2): Contributors who have created a merge request and successfully merged code
+ - Conversion Rate (Group A -> D0): CR (Group A -> D0) = D0/Group A
+ - Conversion Rate (D0 -> D1): CR (D0 -> D1) = D1/D0
+ - Conversion Rate (D1 -> D2): CR (D1 -> D2) = D2/D1
+
+### References
+ - https://opensource.com/article/21/11/data-open-source-contributors
+ - https://github.com/chaoss/augur
+ - https://gitee.com/openeuler/website-v2/blob/master/web-ui/docs/en/blog/zhongjun/2021-09-15-developer-level.md
+ - https://chaoss.github.io/grimoirelab-sigils/common/onion_analysis/
+ - https://mikemcquaid.com/2018/08/14/the-open-source-contributor-funnel-why-people-dont-contribute-to-your-open-source-project/
+### Contributors
+ - Yehui Wang
+ - Clement Li
+ - zhongjun
+ - Xiaoya Xia
+ - Matt Germonprez
+ - Sean Goggins
+ - King Gao
+
+
+
+## IDEA: Open Source Software Health Metrics Visualization Exploration
+
+**Hours: 300**
+
+[Micro-tasks and place for questions](https://github.com/chaoss/augur-community-reports/issues/34)
+
+The CHAOSS Community currently delivers pre-packaged visualizations of open source software health data through Augur APIs (https://github.com/chaoss/augur/blob/main/augur/routes/pull_request_reports.py and https://github.com/chaoss/augur/blob/main/augur/routes/contributor_reports.py), and the https://github.com/chaoss/augur-community-reports repository. This project seeks to expand, refine, and standardize the visualization of different classes of community health metrics data. Specifically, some analyses are temporal, others are anomaly driven, and in some cases contrasts across repositories and communities are required. In each case, the visualization of data is an essential component for metrics, and for what we are now referring to as metrics models (https://github.com/chaoss/wg-metrics-models).
+
+Additional resources include http://new.augurlabs.io/ and https://github.com/augurlabs/augur_view, which demonstrate the updated twitter/bootstrap Augur frontend.
+
+The aims of the project are as follows:
+ - Experiment with standard metrics visualizations using direct Augur database connections, or through the Augur API.
+ - Refine metrics and metrics model visualizations using Jupyter Notebooks or similar technology.
+ - Transform visualizations, as they are completed, into Augur API endpoints, following the pull request and contributor reports examples.
+
+* _Difficulty:_ Medium
+* _Requirements:_ Strong interest in data visualization.
+* _Recommended:_ Experience with Python is desirable, and experience designing or developing visualizations is desirable.
+* _Mentors:_ Isaac Milarsky, Andrew Brain
+
+## IDEA: Build Knowledgebase Application on CHAOSS Website
+
+**Hours: 350**
+
+Micro-tasks and place for questions: https://github.com/chaoss/website/issues/708
+
+While much of the work of CHAOSS is done in GitHub, the CHAOSS website ([https://chaoss.community/](https://chaoss.community/)) is often the first place people visit to get information. The goal of the website is to create clear paths for new members who want to contribute, metrics users who want information about metrics, and existing members who need information about project operations. As the project grows, there is a need for alternative display and categorization options for knowledgebase topics to reduce the burden on website visitors in finding the information that they need.
+
+Work on this project would require the student to work closely with the mentors and the community to come up with different display and categorization options for CHAOSS knowledgebase topics. CHAOSS knowledgebase topics that are currently under consideration for this application are released metrics, metrics models, and contributor handbook information. Information about these topics is captured and stored in GitHub repositories by the relevant working groups. The application will need to pull information from GitHub Markdown documents to display on the website knowledgebase application (we have existing code that does this).
+
+The aims of the project are as follows:
+* Use WordPress to implement a knowledgebase application (example knowledgebase plugin up for consideration - https://wordpress.org/plugins/basepress/)
+* Research and ideate different display options and categorizations for knowledgebase topics.
+* Build web pages to display different knowledgebase topics.
+
+The aims will require working with front-end web development technologies and WordPress to build a knowledgebase application that can display information about knowledgebase topics.
+
+* _Difficulty:_ Low
+* _Requirements:_ Interest in front-end web development
+* _Recommended:_ Experience with WordPress, HTML, CSS, JavaScript, and GitHub Markdown
+* _Mentors:_ John McGinness, Sean Goggins
 * _Difficulty:_ Medium
 * _Requirements:_ Interest in software analytics. Python programming. Conceptual understanding of machine learning, and an eagerness to learn maching learning, and SQL knowledge.
 * _Recommended:_ Experience with Python