\chapter{Discussion\label{discussion}}
\textcolor{red}{indications}
\textcolor{red}{follow-up observation}
\textcolor{red}{observation 1}
\textcolor{red}{observation 2}
\textcolor{red}{sum-up from those two}
\section{Implications for research}
\textcolor{red}{how to improve scientific scene 1}
\textcolor{red}{how to improve scientific scene 2}
\textcolor{red}{how to improve scientific scene 3}
\section{Implications for software engineering professionals}
\textcolor{red}{how to improve professional scene 1}
\textcolor{red}{how to improve professional scene 2}
\textcolor{red}{how to improve professional scene 3}
\textcolor{red}{overall}
\section{Limitations and threats to validity}
The major limitation of this study is that its subjective results could not be validated by multiple researchers. In a systematic review, it is standard practice and highly recommended to have at least two individuals independently conduct the review process and then cross-validate the findings. This would make it possible to compare individual exclusion decisions and other decisions, thereby increasing the credibility of the study. However, the methodology of this study was thoroughly documented, which allows us to assert with confidence that the study has an appropriate level of validity.
As the work of a single researcher, the study also carries a risk of inaccuracy and bias in the literature selection and filtering process. Much of the literature had to be reviewed manually and then included or excluded on a qualitative basis, so this is a known limitation and a threat to validity. Multiple rounds of documented filtering and a clear paper trail of all decisions made keep this threat at an acceptable level.
\subsection{Limitations of literature selection for review}
Efforts were made to ensure that the search process included a comprehensive set of literature. This was achieved by taking the Wikipedia article on the MIT License as the starting point for the PCL lists.
However, as with all systematic literature reviews, a comprehensive manual review of all literature would have been a formidable task. Therefore, additional filtering was conducted. This filtering was carried out in two phases: the first applied the inclusion/exclusion criteria, and the second focused on evaluating the nature of the PCLs and conducting a manual review. As a result of this second phase, a set of literature was excluded following a critical appraisal, with documentation and reasoning provided for each section.
The first phase of filtering has some notable limitations, starting with the two PCL listing websites SPDX and DFSG. Since the material was gathered into a spreadsheet program, duplicates were removed using the short identifier that the listing page used. Consider this validity threat through an example. Suppose our spreadsheet already contains the PCL with the identifier ``MIT''. The results of phase 1 will then not include any other PCL marked with the identifier ``MIT''. In the worst case, a PCL listed under the identifier ``MIT'' could actually have been a different text, say ``MIT-DFSG-edition'', published under the identifier ``MIT''. Since there were so many PCLs in phase 1, it would not have been possible to check the uniqueness of all removed duplicates. One reason why this would not have been feasible is that the listing sites fetch the PCL contents from another webpage or, in the second-worst case, from another website. In the worst case, the URL is dead and we get an HTTP 404. The number of PCLs, the number of duplicates, and the lack of existing tools make this problem multilayered. Nevertheless, this is the level of integrity we decided to accept.
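
The identifier-based duplicate removal and the threat it creates can be illustrated with a minimal Python sketch. It is not the procedure used in this thesis, which relied on a spreadsheet program. The records, the placeholder license texts, and the function names are all hypothetical. The sketch keeps the first record per identifier and separately flags identifiers whose texts differ, which is the check that was not feasible in phase 1.

\begin{verbatim}
```python
import hashlib

# Hypothetical records: (listing site, short identifier, license text).
# The texts are placeholders, not real license contents.
records = [
    ("SPDX", "MIT", "Permission is hereby granted ..."),
    ("DFSG", "MIT", "Permission is hereby granted ... (variant wording)"),
    ("SPDX", "Zlib", "This software is provided 'as-is' ..."),
]


def dedupe_by_identifier(rows):
    """Keep the first record seen for each short identifier."""
    kept = {}
    for site, identifier, text in rows:
        kept.setdefault(identifier, (site, identifier, text))
    return list(kept.values())


def conflicting_identifiers(rows):
    """Return identifiers that map to more than one distinct text."""
    digests = {}
    for _site, identifier, text in rows:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        digests.setdefault(identifier, set()).add(digest)
    return sorted(i for i, d in digests.items() if len(d) > 1)


kept = dedupe_by_identifier(records)
conflicts = conflicting_identifiers(records)
print(len(kept), conflicts)
```
\end{verbatim}

In this made-up data, the second ``MIT'' record is silently dropped by the identifier-based removal even though its text differs. A content comparison of this kind would require every license text to be retrievable, which the dead links described above rule out.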
The FSF's PCL listing led us to adopt another limitation on the scope of this thesis. The entry with the short code ``other'' was not a PCL but a hyperlink to another listing page, which listed programs whose licenses the FSF has not yet managed to document. Although one of these programs, ``babl'', was licensed under ``gplv3'', the number of undocumented programs was over 5200 at the time of observation. For this reason, we exclude the PCLs found indirectly through the category ``other''.
\textcolor{red}{tell about the validity threats of osi literature selection for review}
Lastly, the GNU project's listing site allowed us to use a shortcut of sorts, which we document here in order to acknowledge its limitations. The table of contents of the listing site marked certain consecutive PCLs as software PCLs. In addition, the PCLs were not organized into easily processable tables but were stacked on one another in rich text format. We used regular expressions on the HTML file, and the included PCLs were only those listed under the header ``Software licenses''. In the worst case, the GNU project could have misclassified some PCLs as non-software licenses, causing this thesis to exclude them for the wrong reason. While a quick glance and the existence of the other four PCL listing sites suggest that this risk is small, we think it is still worth documenting for the sake of the validity and integrity of this thesis.
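
The header-based selection can be sketched as follows. This is an illustration only: the HTML fragment is invented and does not reproduce the markup of the actual listing page, and the function name and license names are hypothetical. The sketch shows how a regular expression limited to one header leaves out everything listed under other headers.

\begin{verbatim}
```python
import re

# Invented HTML fragment; the real listing page is structured differently.
html = """
<h3>Software licenses</h3>
<dt><a id="LicA">License A</a></dt>
<dt><a id="LicB">License B</a></dt>
<h3>Licenses for documentation</h3>
<dt><a id="DocC">License C</a></dt>
"""


def ids_under_header(page, header):
    """Return anchor ids found between `header` and the next <h3>."""
    pattern = r"<h3>" + re.escape(header) + r"</h3>(.*?)(?=<h3>|\Z)"
    section = re.search(pattern, page, flags=re.DOTALL)
    if section is None:
        return []
    return re.findall(r'<a id="([^"]+)"', section.group(1))


software = ids_under_header(html, "Software licenses")
print(software)
```
\end{verbatim}

Any license placed under a different header, correctly or not, never reaches the result list. This is the exclusion risk described above.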
In addition to filters that were too strict, we also want to document filters that were too lenient in the literature selection for review. We can see from \hyperref[appendix:a]{Appendix A} that, for example, the PCLs with the literature identifiers L777 and L780 have almost the same short identifiers: ``ZPL - 2.1'' and ``ZPL-2.1''. Removing such duplicates would have been seemingly simple in phase 1. However, with over 700 pieces of literature present, we decided not to give special treatment to any potential set of duplicates. While it is most likely that OSI's ``ZPL - 2.1'' is exactly equivalent to SPDX's ``ZPL-2.1'', we could not be sure without looking at their contents. This could have resulted in duplicate PCLs in the literature selection for review, but this type of duplicate is removed in phases 2 and 3 because the PCLs are read in full.
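
As a minimal sketch of how such near-duplicates could be flagged for manual inspection, the following Python fragment normalizes short identifiers before comparing them. This step was not part of the thesis procedure. The function names and the entry L001 are hypothetical, and the pairing of L777 with ``ZPL - 2.1'' and of L780 with ``ZPL-2.1'' is assumed from the order in which they are mentioned.

\begin{verbatim}
```python
import re
from collections import defaultdict


def normalize(identifier):
    """Lowercase the identifier and drop all whitespace."""
    return re.sub(r"\s+", "", identifier).lower()


def near_duplicates(entries):
    """Group literature IDs whose identifiers collide after normalization."""
    groups = defaultdict(list)
    for lit_id, identifier in entries:
        groups[normalize(identifier)].append(lit_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# L777 and L780 come from Appendix A (pairing assumed); L001 is made up.
entries = [("L777", "ZPL - 2.1"), ("L780", "ZPL-2.1"), ("L001", "MIT")]
candidates = near_duplicates(entries)
print(candidates)
```
\end{verbatim}

A collision after normalization only marks a candidate pair. Whether the two texts are really the same license still has to be settled by reading them, as is done in phases 2 and 3.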
We therefore consider the literature selection to have been carried out adequately.
\subsection{Limitations in data extraction}
\textcolor{red}{importance of data extraction}
\textcolor{red}{lack of measurements and tooling}