Statistical Analysis of Duplicated Questions for the google-apps-script Tag on Stack Overflow


Introduction

On Stack Overflow, many people post questions and answers every day, so the site holds a great deal of important information. I have already reported “Trend of google-apps-script Tag on Stackoverflow” using data retrieved from Stack Overflow. Ref. 1 That report showed that important statistical results can be obtained by analyzing Stack Overflow data. In this report, I would like to introduce a statistical analysis of duplicated questions for the google-apps-script tag on Stack Overflow. Analyzing duplicated questions is expected to reveal the issues that matter most to users. As a result, a trend was found: the duplicated questions tend to relate to JavaScript, Google Spreadsheet, the process cost, and the cooperation between HTML and JavaScript.

Experimental procedure

The history of Stack Overflow and Google Apps Script can be summarized as follows.

1. 2008-09-15: Stack Overflow was launched. Ref. 2
2. 2009-08-19: Google Apps Script was released. Ref. 3
3. 2011-08-29: The “google-apps-script” tag was created on Stack Overflow. Ref. 4

This list shows that Stack Overflow predates Google Apps Script. In addition, the “google-apps-script” tag on Stack Overflow is referenced in Google’s official documentation. For these reasons, a lot of information about the history of Google Apps Script can be obtained from the questions tagged “google-apps-script” on Stack Overflow.

All questions and their related data can be retrieved by tag using the Stack Exchange API. Ref. 5 In this report, “google-apps-script” was used as the base tag, and all duplicated questions that include this tag were retrieved and statistically analyzed. On Stack Overflow, when the issue raised in a newly posted question has already been posted before, the new question is flagged as a duplicate and linked to the older question. When one question is the origin of several newer duplicated questions, that original question is considered to hold important information about an issue that many users have. The data covers the period from 2008-01-01 to 2020-10-06 and was retrieved on 2020-10-06. Because users can edit old questions and answers on Stack Overflow, please note that the data in this report reflects the state at the time of retrieval.
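The retrieval step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the script used for this report. It assumes the `/questions` endpoint of the Stack Exchange API (version 2.3 here) with its `tagged`, `fromdate`, `todate`, `pagesize`, and `page` parameters, and it assumes that a question closed as a duplicate carries a `closed_reason` value of "Duplicate". The helper names `to_epoch`, `build_questions_url`, and `filter_duplicates` are hypothetical, and the sample items are made up for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"  # assumed API version


def to_epoch(date_text):
    """Convert 'YYYY-MM-DD' (taken as UTC) to Unix epoch seconds."""
    dt = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def build_questions_url(tag, from_date, to_date, page=1):
    """Build the request URL for one page of questions with the given tag."""
    params = {
        "site": "stackoverflow",
        "tagged": tag,
        "fromdate": to_epoch(from_date),
        "todate": to_epoch(to_date),
        "order": "asc",
        "sort": "creation",
        "pagesize": 100,
        "page": page,
    }
    return f"{API_ROOT}/questions?{urlencode(params)}"


def filter_duplicates(items):
    """Keep only questions whose closed_reason marks them as duplicates.

    Assumption: the value is "Duplicate"; the comparison ignores case.
    """
    return [
        q for q in items
        if str(q.get("closed_reason", "")).lower() == "duplicate"
    ]


# Made-up items shaped like the "items" array of an API response.
sample_items = [
    {"question_id": 1, "tags": ["google-apps-script"], "closed_reason": "Duplicate"},
    {"question_id": 2, "tags": ["google-apps-script"]},
]

url = build_questions_url("google-apps-script", "2008-01-01", "2020-10-06")
duplicates = filter_duplicates(sample_items)
print(url)
print([q["question_id"] for q in duplicates])
```

The URL is only built here, not requested. An actual run would fetch each page in turn and continue while the response reports more pages, within the API's request quota.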

Results and discussions

This section presents the results for the duplicated questions that include the google-apps-script tag.

Fig. 1. Year vs. number of duplicated questions. The data is as of October 6, 2020.

Fig. 2. Year vs. total, answered, solved, and closed questions. All of these questions include the “google-apps-script” tag. The data is as of October 6, 2020.

Figure 1 shows the number of duplicated questions for each year. The number of duplicated questions increases year by year. Compared with Fig. 2, the number of duplicated questions is about 5% of the total number of questions. When the rates of increase of posted questions in 2019 and 2020 (as of 2020-10-06) are compared, the rate for 2020 is lower than that for 2019. From the change between 2019 and 2020 in Figs. 1 and 2, the following two points are considered. First, the existing questions are useful to users. Second, the result may also include the effect of the active users on Stack Overflow.
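The per-year counts behind Figs. 1 and 2, and the ratio of duplicated questions to all questions, could be computed along the following lines. This is a minimal sketch: it assumes each question record has a `creation_date` field in Unix epoch seconds, as in Stack Exchange API responses. The function names are hypothetical, and the sample records are made up and do not reproduce the values reported here.

```python
from collections import Counter
from datetime import datetime, timezone


def count_by_year(questions):
    """Count questions per calendar year (UTC) of their creation_date."""
    years = (
        datetime.fromtimestamp(q["creation_date"], tz=timezone.utc).year
        for q in questions
    )
    return Counter(years)


def duplicate_ratio_by_year(all_questions, duplicated_questions):
    """Return {year: duplicated / total} for every year that has questions."""
    total = count_by_year(all_questions)
    dup = count_by_year(duplicated_questions)
    return {year: dup.get(year, 0) / total[year] for year in sorted(total)}


# Made-up records: four questions in 2019, two in 2020.
DAY = 86400
START_2019 = 1546300800  # 2019-01-01T00:00:00Z
START_2020 = 1577836800  # 2020-01-01T00:00:00Z
all_questions = (
    [{"creation_date": START_2019 + i * DAY} for i in range(4)]
    + [{"creation_date": START_2020 + i * DAY} for i in range(2)]
)
duplicated_questions = [{"creation_date": START_2019}]

ratios = duplicate_ratio_by_year(all_questions, duplicated_questions)
print(ratios)
```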

Table 1. Summary of the number of duplicated questions. “Number of duplicated questions” means the number of times a question was used as the original of duplicated questions. All data from 2008-01-01 to 2020-10-06 is summarized.

Table 1 shows a trend: the duplicated questions tend to relate to JavaScript, Google Spreadsheet, the process cost, and the cooperation between HTML and JavaScript.
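A ranking like the one in Table 1 can be built by counting how often each original question is the target of a duplicated question. The sketch below is illustrative only: it assumes each duplicated question record carries a list of original question IDs in a hypothetical `original_ids` field, and the sample records are made up.

```python
from collections import Counter


def rank_originals(duplicated_questions, top=10):
    """Rank original questions by how many duplicated questions point to them.

    Returns a list of (original_id, count) pairs, highest count first.
    """
    counts = Counter()
    for q in duplicated_questions:
        # A duplicated question may link to more than one original;
        # set() avoids counting the same original twice for one question.
        counts.update(set(q.get("original_ids", [])))
    return counts.most_common(top)


# Made-up records; "original_ids" is a hypothetical field name.
sample_duplicates = [
    {"question_id": 101, "original_ids": [11]},
    {"question_id": 102, "original_ids": [11, 12]},
    {"question_id": 103, "original_ids": [11]},
    {"question_id": 104, "original_ids": [12]},
    {"question_id": 105, "original_ids": [13]},
]

ranking = rank_originals(sample_duplicates, top=3)
print(ranking)
```

Grouping the top-ranked originals by topic is a manual step in this sketch. The counts only show which originals are linked most often.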

Summary

In this report, the following results were obtained by analyzing the duplicated questions with the google-apps-script tag on Stack Overflow.

• The number of duplicated questions increases year by year.

• The duplicated questions tend to relate to JavaScript, Google Spreadsheet, the process cost, and the cooperation between HTML and JavaScript.

I think that publishing educational sample scripts that take these results into account will be useful for many users. I would like to continue investigating these statistics.

Appendix

I think that the method introduced in this report can be used for various tags. As an appendix, two sample data sets are shown here.

For the google-sheets tag without the google-apps-script tag

This is the ranking of the duplicated questions that have the google-sheets tag but not the google-apps-script tag. In this case, the data from “2008-01-01” to “2020-10-06” was analyzed.
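The tag condition for this case, one tag present and another absent, can be sketched as a simple filter. It assumes each question record has a `tags` list, as in Stack Exchange API responses. The function name `has_tag_without` is hypothetical, and the sample records are made up.

```python
def has_tag_without(question, required, excluded):
    """True if the question has the required tag and lacks the excluded tag."""
    tags = set(question.get("tags", []))
    return required in tags and excluded not in tags


# Made-up records for illustration.
sample_questions = [
    {"question_id": 1, "tags": ["google-sheets", "google-sheets-formula"]},
    {"question_id": 2, "tags": ["google-sheets", "google-apps-script"]},
    {"question_id": 3, "tags": ["javascript"]},
]

selected = [
    q["question_id"]
    for q in sample_questions
    if has_tag_without(q, "google-sheets", "google-apps-script")
]
print(selected)
```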

For the javascript tag

This is the ranking of the duplicated questions with the javascript tag. In this case, the data from “2008-01-01” to “2020-10-06” was analyzed.