-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-15963: add r tweedie package to our test images #15964
GH-15963: add r tweedie package to our test images #15964
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @tomasfryda
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, we really overlooked that
2891293
to
8d71e58
Compare
DEBIAN_FRONTEND=noninteractive apt-get install -y \ | ||
texlive \ | ||
texlive-fonts-extra \ | ||
texlive-htmlxml \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
texlive-htmlxml
doesn't exist anymore. In 18.04 it used to be a transitional package that installed
- texlive-formats-extra (>= 2017.20170628)
- texlive-latex-extra (>= 2017.20170628)
- texlive-latex-recommended (>= 2017.20170628)
- texlive-plain-generic (>= 2017.20170628)
So I added those.
See https://answers.launchpad.net/ubuntu/bionic/amd64/texlive-htmlxml/2017.20180305-1
@@ -1,17 +1,20 @@ | |||
ARG FROM_VERSION | |||
FROM harbor.h2o.ai/opsh2oai/h2o-3/dev-r-3.4.1:${FROM_VERSION} | |||
|
|||
RUN \ | |||
curl -sL https://deb.nodesource.com/setup_16.x | bash - && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way of installing Nodejs is deprecated and the script will get removed. So I updated the way to install nodejs==16
.
texinfo \ | ||
texlive-bibtex-extra \ | ||
texlive-formats-extra \ | ||
texlive-generic-extra && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
texlive-generic-extra
doesn't exist anymore. It used to be a transitional package that installed texlive-plain-generic
which I added in the previous step when removing texlive-htmlxml
.
See https://answers.launchpad.net/ubuntu/bionic/amd64/texlive-generic-extra/2017.20180305-1
969b6a5
to
3b36142
Compare
apt-get clean && \ | ||
rm -rf /var/cache/apt/* | ||
rm -rf /var/cache/apt/* && \ | ||
wget "https://www.googleapis.com/download/storage/v1/b/chromium-browser-snapshots/o/Linux_x64%2F1244635%2Fchrome-linux.zip?alt=media" && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Ubuntu 22.04, the chromium-browser
deb package installs just a script that tells you to install chromium using snap
, however, snap requires running systemd which is not ideal in docker images.
So we need to install chromium using some other way.
3107348
to
1a809bd
Compare
RUN \ | ||
add-apt-repository -y ppa:deadsnakes && \ | ||
curl -sL https://deb.nodesource.com/setup_16.x | bash - && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This script is deprecated and doesn't work anymore
@@ -8,7 +8,7 @@ if [[ ! ${H2O_BRANCH} ]]; then | |||
fi | |||
if [[ ! -n ${PYTHON_VERSION} ]]; then | |||
echo "Using default Python version" | |||
PYTHON_VERSION='3.6' | |||
PYTHON_VERSION='3.7' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python 3.6 is not supported anymore by the ubuntu 22.04 and is installed (built from source) separately for the py-3.6 tests.
# How to get dependencies? | ||
#========================= | ||
# Since we can't rely on `versions` package to get older packages (Microsoft's CRAN (MRAN) is discontinued), we have to | ||
# install the packages in some other way. What I did was opened the dev-r-3.3.3-jdk-8 and got the installed packages by | ||
# using the following: | ||
# | ||
######################################################################################################################## | ||
# # Get packages and freeze them | ||
# write.csv(installed.packages(), "pkgs.csv") | ||
######################################################################################################################## | ||
# | ||
# Now that I got the packages, their versions and their dependencies, I am able to topologically sort them to ensure | ||
# I won't try to install a package that has is missing some dependencies. Then I install it from CRAN using the above | ||
# defined shell function `getAndInstallFromCRAN`. | ||
# | ||
# To make it repeatable I created the following script in R that uses `igraph` package for creating the graph representing | ||
# packages and its dependencies that are then topologically sorted and printed out so it can be easily copy-pasted here. | ||
# Some packages may never reach CRAN and those should be added to the exclude_pkgs vector (like the `isofor` package) | ||
# and then installed separately. | ||
# | ||
######################################################################################################################## | ||
# exclude_pkgs <- c("isofor") | ||
# | ||
# # Load them, sort them topologically and generate the code to install them | ||
# pkgs <- read.csv("/Users/tomasfryda/tmp/jenkins_logs/pkgs.csv") | ||
# pkgs <- pkgs[pkgs$Priority != "base" | is.na(pkgs$Priority),] | ||
# rownames(pkgs) <- pkgs$Package | ||
# | ||
# library(igraph) | ||
# | ||
# i2n <- setNames(pkgs$Package, seq_len(nrow(pkgs))) | ||
# n2i <- setNames(names(i2n), i2n) | ||
# | ||
# g <- make_empty_graph(directed = TRUE) | ||
# g <- add_vertices(g, nrow(pkgs), label = i2n) | ||
# | ||
# for (n in i2n) { | ||
# deps <- paste0(pkgs[n, "Depends"], ", ", pkgs[n, "Imports"]) | ||
# cat(n, "->", deps, "\n") | ||
# if (is.na(deps)) | ||
# next | ||
# deps <- strsplit(deps, ",\\s*", perl = TRUE)[[1]] | ||
# for (d in deps) { | ||
# print(d) | ||
# d <- strsplit(d, "\\s+", perl = TRUE)[[1]][[1]] | ||
# print(d) | ||
# if (d %in% i2n) { | ||
# g <- add_edges(g, c(n2i[[n]], n2i[[d]])) | ||
# } | ||
# } | ||
# } | ||
# | ||
# ts <- topo_sort(g, mode = "in") | ||
# ts <- ts[!ts %in% n2i[exclude_pkgs]] | ||
# cat(paste("getAndInstallFromCRAN", i2n[ts], pkgs[i2n[ts], "Version"]), sep = "\n") | ||
######################################################################################################################## |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MRAN doesn't work anymore so I needed some other way of installing old R packages. Even installing devtools
is very hard for old R versions as some packages are not available for the old versions so I decided to do equivalent of pip freeze
and manually resolve the dependencies and order the packages in such a way that they are installed in correct order. The little R script I have in the comment is responsible for sorting the packages in the correct order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tomasfryda for the change! I prefer to merge this change after the major release. The release pipeline has been moved from Jenkins master to regular workers. I would like to reduce a number of things that could break the release.
h2o-py/test-requirements.txt
Outdated
pygments==2.7.4; python_version <= '3.7' | ||
pygments==2.15.1; python_version > '3.7' | ||
sphinx==3.0.4; python_version <= '3.8' | ||
pygments==2.7.4; python_version < '3.5' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line can be deleted, Python 3.5 is not currently supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I wasn't sure if we don't use python 2.7 for building the documentation (pygments does syntax highlighting) so I kept it here. I think we don't but could you confirm that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The release pipeline uses python 3.6: https://github.com/h2oai/h2o-3/blob/master/scripts/jenkins/jenkinsfiles/Jenkinsfile-Release#L20
h2o-py/test-requirements.txt
Outdated
sphinx==3.0.4; python_version <= '3.8' | ||
pygments==2.7.4; python_version < '3.5' | ||
pygments==2.15.1; python_version >= '3.7' | ||
sphinx==3.0.4; python_version < '3.5' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line can be deleted, Python 3.5 is not currently supported
@mn-mikke I have mixed feelings about this. I completely agree that these changes increase the chance of breaking something during the release but this PR also contains fixes that were discovered thanks to adding the necessary packages. There is fix for Tweedie dispersion estimation which is non-reproducible (on jenkins, locally I was not able to reproduce it) in current master and rel-3.44. And it also disabled the usage of polars and datatable by default as it can in some cases change types. I can backport the fixes but they'd be untested in the current master due to the missing packages. |
@tomasfryda ok, let's merge it then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Basically, some files are not used in the final package probably due to some non-deterministic issue in roxygen2 which fills our NAMESPACE file and creates our R man pages.
73bb536
…nother process' on windows in python
h2o-py/tests/testdir_algos/xgboost/pyunit_PUBDEV_6117_xgboost_compare.py
Show resolved
Hide resolved
89791f8
to
80a6a23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tons of effort put into it. Good job and thank you!
The R medium stages didn't run tests that contained missing package so it's understandable that the runtime there increased. Java JUnit tests also are aborted very often in master so this PR likely doesn't cause the issue.
#15963