Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geocodificado recursivo #5

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ Authors@R: c(person("Carlos J.", "Gil Bellosta", email="[email protected]", r
Author: Carlos J. Gil Bellosta, Luz Frías
Maintainer: Carlos J. Gil Bellosta <[email protected]>
Description: Access to Cartociudad cartography API, which provides mapping and other related services for Spain.
Imports: httr, jsonlite, xml2, plyr, geosphere
Imports: httr, jsonlite, xml2, plyr, geosphere, purrr
Copy link
Member

@koldLight koldLight Aug 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sé si el uso actual justifica incluir una nueva dependencia (purrr). Actualmente solo se utiliza para hacer 2x rbind de lista de data.frames, que se puede sustituir por funciones de R base: do.call(rbind, res_list)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sí, en principio lo hice así. He de reconocer que soy un adicto al paquete purrr... :) Lo corrijo.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He pensado en usar plyr::rbind.fill(), puesto que los data.frame's pueden tener diferente número de columnas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Genial

Depends: R (>= 3.0.0)
Suggests: ggmap, testthat
URL: https://github.com/cjgb/caRtociudad
Expand Down
59 changes: 45 additions & 14 deletions R/cartociudad_geocode.R
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@
#' @param output_format Character string. Output format of the query: "JSON" or
#' "GeoJSON".
#'
#' @return A data frame consisting of a single row per guess. See the reference
#' @return A data frame consisting of a single row per query. See the reference
#' below for an explanation of the data frame columns.
#'
#' @author Carlos J. Gil Bellosta
Expand All @@ -27,23 +27,54 @@
#' \url{http://www.cartociudad.es/recursos/Documentacion_tecnica/CARTOCIUDAD_ServiciosWeb.pdf}
#'
#' @examples
#' # using full address
#' my.address <- cartociudad_geocode(full_address = "plaza de cascorro 11, 28005 madrid")
#' # Query a single address
#' address <- "plaza de cascorro 11, 28005 madrid"
#' my.address <- cartociudad_geocode(full_address = address)
#' print(my.address)
#'
#' # Query multiple addresses
#' address <- c(address, "plaza del ayunamiento 1, valencia")
#' my.address <- cartociudad_geocode(full_address = address)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misma recomendación que con cartociudad_reverse_geocode: sugiero incluir un test para múltiples direcciones

#' print(my.address)
#'
#' @export
#'
cartociudad_geocode <- function(full_address, output_format = "JSON") {
api.args <- list(q = full_address, outputformat = output_format)
ua <- get_cartociudad_user_agent()
res <- httr::GET("http://www.cartociudad.es/geocoder/api/geocoder/findJsonp",
query = api.args, ua)
httr::stop_for_status(res)

res <- jsonp_to_json(httr::content(res, as = "text", encoding = "UTF8"))
res <- jsonlite::fromJSON(res)
res <- as.data.frame(t(unlist(res)), stringsAsFactors = FALSE)
res[, c(grep("lat", names(res)), grep("lng", names(res)))] <-
apply(res[, c(grep("lat", names(res)), grep("lng", names(res)))], 2, as.numeric)
return(res)
stopifnot(class(full_address) == "character")
stopifnot(length(full_address) >= 1)
no_geocode <- which(nchar(full_address) == 0)
res_list <- list()

for (i in seq_along(full_address)) {
if (!i %in% no_geocode) {
api.args <- list(q = full_address[i], outputformat = output_format)
ua <- get_cartociudad_user_agent()
res <- httr::GET("http://www.cartociudad.es/geocoder/api/geocoder/findJsonp",
query = api.args, ua)
if (httr::http_error(res)) {
warning("Error in query ", i, ": ", httr::http_status(res)$message)
res_list[[i]] <- data.frame(address = full_address[i],
stringsAsFactors = FALSE)
} else {
res <- jsonp_to_json(httr::content(res, as = "text", encoding = "UTF8"))
res <- jsonlite::fromJSON(res)
res <- res[-which(names(res) %in% c("geom", "countryCode", "refCatastral"))]
if (length(res) == 0) {
warning("The query has 0 results.")
res_list[[i]] <- data.frame(address = full_address[i],
stringsAsFactors = FALSE)
} else {
res_list[[i]] <- as.data.frame(t(unlist(res)), stringsAsFactors = FALSE)
}
}
} else {
warning("Empty string as query: NA returned.")
res_list[[i]] <- data.frame(address = NA, stringsAsFactors = FALSE)
}
}
results <- purrr::map_df(res_list, rbind)
results[, c("lat", "lng")] <- apply(results[, c("lat", "lng")], 2, as.numeric)

return(results)
}
63 changes: 31 additions & 32 deletions R/cartociudad_reverse_geocode.R
Original file line number Diff line number Diff line change
Expand Up @@ -16,50 +16,49 @@
#' @param longitude Point longitude in geographical coordinates (e.g.,
#' -3.7227241)
#'
#' @return A list with the following items:
#' \item{tipo}{type of location.}
#' \item{tipo.via}{road type.}
#' \item{nombre.via}{road name.}
#' \item{num.via}{road number.}
#' \item{num.via.id}{internal id of this address in cartociudad database.}
#' \item{municipio}{town.}
#' \item{provincia}{province.}
#' \item{cod.postal}{zip code.}
#' @return A data frame consisting of a single row per query. See the reference
#' below for an explanation of the data frame columns.
#'
#' @author Luz Frias
#'
#' @references
#' \url{http://www.cartociudad.es/recursos/Documentacion_tecnica/CARTOCIUDAD_ServiciosWeb.pdf}
#'
#' @examples
#' # Query one point
#' cartociudad_reverse_geocode(40.473219, -3.7227241)
#'
#' # Query multiple points
#' cartociudad_reverse_geocode(c(40.473219, 39.46979), c(-3.7227241, -0.376963))
#'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Como recomendación, estaría bien añadir un test en tests/testthat.R con el caso de llamar a cartociudad_reverse_geocode para múltiples localizaciones

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yo me ocupo de las pruebas.
He pensado en comprobar el funcionamiento ante casos típicos que estoy viendo, como una dirección sin número (devuelve la geometría de toda la vía y la latitud y longitud del portal 0), portales y pk's inexistentes (devolución del punto más próximo), y viales inexistentes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfecto!

#' @export
#'
cartociudad_reverse_geocode <- function(latitude, longitude) {

query.parms <- list(
lat = latitude,
lon = longitude
)


stopifnot(length(latitude) == length(longitude) | length(latitude) == 0)

names_res <- c("type", "tip_via", "address", "portalNumber", "id",
"muni", "province", "postalCode", "lat", "lng")
res_list <- list()

url <- "http://www.cartociudad.es/services/api/geocoder/reverseGeocode"
ua <- get_cartociudad_user_agent()


res <- httr::GET(url, query = query.parms, ua)
httr::stop_for_status(res)
info <- httr::content(res)
# Parse the response
res <- list(
tipo = info$type,
tipo.via = info$tip_via,
nombre.via = info$address,
num.via = info$portalNumber,
num.via.id = info$id,
municipio = info$muni,
provincia = info$province,
cod.postal = info$postalCode
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sobre la eliminación de esto, creo que conviene mantener en el paquete el mapeo de nombres, para:

  • Mantener coherencia: son nombres con mismo idioma, mismo formato, no como los que devuelve directamente la API (hay inglés, español, underscored, camelCased, ...)
  • Mantener retro-compatibilidad con los usuarios actuales
  • Asegurar futura compatibilidad: si se lanza una nueva versión de la API y los campos tienen otros nombres, cambiamos el mapeo en el paquete, y la actualización será transparente para los usuarios

Dime cómo lo ves. Si estás de acuerdo, sería bueno mantener el trozo de documentación eliminado sobre los nombres

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me parece perfecto, aunque creo que sería mejor devolver un data.frame en lugar de una lista. ¿Conservamos esa parte?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, totalmente de acuerdo, tiene más sentido ahora que está vectorizado

return(res)

for (i in seq_along(latitude)) {
query.parms <- list(lat = latitude[i], lon = longitude[i])
res <- httr::GET(url, query = query.parms, ua)

if (httr::http_error(res)) {
warning("Error in query ", i, ": ", httr::http_status(res)$message)
res_list[[i]] <- data.frame(lat = latitude[i], lng = longitude[i],
stringsAsFactors = FALSE)
} else {
info <- httr::content(res)
res_list[[i]] <- as.data.frame(t(unlist(info)),
stringsAsFactors = FALSE)[, names_res]
}
}

results <- purrr::map_df(res_list, rbind)
return(results)
}
12 changes: 9 additions & 3 deletions man/cartociudad_geocode.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 6 additions & 9 deletions man/cartociudad_reverse_geocode.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.