Visualising and Analysing Geographic and Movement Data
In this take-home exercise, we address bullet points 1 and 2 of Challenge 2 of the VAST Challenge 2022 to reveal the social areas and traffic bottlenecks of the city of Engagement, Ohio, USA.
Before we get started, we need to ensure that the required R packages are installed. The code chunk below checks each package in turn, installs it if it is missing, and then loads it into the R environment.
packages = c('tidyverse', 'knitr', 'sf', 'tmap', 'lubridate', 'clock', 'sftime',
             'rmarkdown', 'dplyr')
for (p in packages){
  # Install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
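Before installing anything, it can help to see which packages are actually missing. The sketch below is one possible way to check this with requireNamespace() from base R. The helper name missing_packages is hypothetical and is not part of the original exercise.

```r
# Hypothetical helper: report which packages are not yet installed,
# without attaching or installing anything.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# 'stats' ships with R, so it is never reported as missing.
missing_packages(c("stats", "sf", "tmap"))
```

The returned character vector could then be passed to install.packages() in a single call instead of installing inside the loop.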
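In the code chunk below, read_sf() of the sf package is used to parse Schools.csv, Pubs.csv, Apartments.csv, Buildings.csv, Employers.csv, and Restaurants.csv into R. The GEOM_POSSIBLE_NAMES option names the column that holds the geometry in well-known text (WKT) form, so that the result is an sf data frame. Apartments.csv is read without this option because only its attribute columns (buildingId and rentalCost) are used later.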
schools <- read_sf("data/wkt/Schools.csv",
                   options = "GEOM_POSSIBLE_NAMES=location")
pubs <- read_sf("data/wkt/Pubs.csv",
                options = "GEOM_POSSIBLE_NAMES=location")
apartments <- read_sf("data/wkt/Apartments.csv")
buildings <- read_sf("data/wkt/Buildings.csv",
                     options = "GEOM_POSSIBLE_NAMES=location")
employers <- read_sf("data/wkt/Employers.csv",
                     options = "GEOM_POSSIBLE_NAMES=location")
restaurants <- read_sf("data/wkt/Restaurants.csv",
                       options = "GEOM_POSSIBLE_NAMES=location")
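To see what the GEOM_POSSIBLE_NAMES option does, the following minimal sketch writes a small CSV with a WKT column to a temporary file and reads it back. The file contents, the column name pubId and the object name toy_pubs are made up for illustration and are not taken from the challenge data.

```r
library(sf)

# Hypothetical toy file standing in for a file such as Pubs.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(c("pubId,location",
             "1,POINT (10 20)",
             "2,POINT (30 40)"),
           csv_path)

# Tell GDAL that the 'location' column holds the WKT geometry
toy_pubs <- read_sf(csv_path, options = "GEOM_POSSIBLE_NAMES=location")

class(toy_pubs)             # includes "sf" when the WKT column is recognised
st_geometry_type(toy_pubs)  # geometry type of each feature
```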
In the code chunk below, we combine the pubs and restaurants data tables using bind_rows() of the dplyr package. We also create two new data tables using filter(), which is also from dplyr: one containing only school and commercial buildings (school_commercial) and the other containing only residential buildings (residental, following the spelling of the buildingType value used in the code). The average rental cost per building is computed from the apartments table with group_by() and summarize(), and left_join() is then used to add it to the residental data table.
pub_res <- bind_rows(select(pubs %>% mutate(type = "pubs"),
                            c(location, type)),
                     select(restaurants %>% mutate(type = "restaurants"),
                            c(location, type)))
school_commercial <- filter(buildings, buildingType %in% c("School","Commercial"))
avg_rental <- mutate(apartments, rentalCost = as.numeric(rentalCost)) %>%
  group_by(buildingId) %>%
  summarize(rental = mean(rentalCost))
residental <- filter(buildings, buildingType == "Residental") %>%
  left_join(avg_rental, by = "buildingId")
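The averaging and joining step can be illustrated on a small scale. The sketch below uses made-up tables (toy_apartments and toy_buildings are hypothetical, as are all values in them) with the same column names as above. Building "4" has no apartments, so it ends up with a missing rental value after the join.

```r
library(dplyr)

# Hypothetical toy tables mirroring the columns used above
toy_apartments <- tibble(buildingId = c("1", "1", "2"),
                         rentalCost = c("500", "700", "900"))
toy_buildings <- tibble(buildingId = c("1", "2", "3", "4"),
                        buildingType = c("Residental", "Residental",
                                         "Commercial", "Residental"))

# Average rental cost per building
toy_avg_rental <- toy_apartments %>%
  mutate(rentalCost = as.numeric(rentalCost)) %>%
  group_by(buildingId) %>%
  summarize(rental = mean(rentalCost))

# Keep residential buildings and attach the average rental cost;
# buildings without apartments get NA
toy_residental <- toy_buildings %>%
  filter(buildingType == "Residental") %>%
  left_join(toy_avg_rental, by = "buildingId")

toy_residental
```

Because left_join() keeps every row of the left table, residential buildings without rental records are retained with NA rather than dropped.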
In this section, we visualise the layout of buildings in the city of Engagement. The code chunk below plots the building polygon features using tm_polygons() of tmap and overlays the pubs and restaurants as point symbols using tm_symbols(). tm_layout() is used to add the plot title, while tm_compass() is used to add a compass.
tmap_mode("plot")
tm_shape(buildings)+
  tm_polygons(col = "buildingType",
              border.col = "black",
              border.lwd = 1,
              palette = "Accent")+
  tm_shape(pub_res) +
  tm_symbols(col = "type",
             size = 0.15,
             palette = "Set1",
             border.col = "black")+
  tm_layout(title= 'City Layout',
            title.position = c('right', 'top'),
            legend.title.color = "white") +
  tm_compass(type = "4star",
             size = 3)
From the plot above, we can see that the city is made up of several commercial hubs surrounded by residential buildings. The 3 commercial hubs are in the top left, center right and bottom right of the map. The pubs and restaurants are found closer to commercial buildings. There are a total of 4 schools spread out across the city.
In this section, we will visualise the cost of residential rental throughout the city to determine which areas are more expensive to live in.
tmap_mode("plot")
tm_shape(residental)+
  tm_polygons(col = "rental",
              border.col = "black",
              border.lwd = 1,
              palette = "Reds",
              textNA = "school/commercial/no rental data")+
  tm_shape(school_commercial)+
  tm_polygons(col = "grey",
              border.col = "black",
              border.lwd = 1)+
  tm_layout(title= 'Rental Prices',
            title.position = c('right', 'top')) +
  tm_compass(type = "4star",
             size = 3)
As can be seen from the plot above, apartments with high rents are mostly found within the 3 commercial hubs.
In this section, we use a square binning map to display the roads that participants travel on most frequently.
In the code chunk below, read_sf() of the sf package is used to parse ParticipantStatusLogs1.csv into R as an sf data frame, with currentLocation as the geometry column.
logs <- read_sf("data/wkt/ParticipantStatusLogs1.csv",
options = "GEOM_POSSIBLE_NAMES=currentLocation")
In the code chunk below, we convert the timestamp field from character data type to date-time data type by using date_time_parse() of the clock package. We then derive a day field by using get_day(), also from clock. Lastly, we use filter() to keep only the records where the currentMode field is equal to "Transport".
logs_selected <- logs %>%
  mutate(Timestamp = date_time_parse(timestamp,
                                     zone = "",
                                     format = "%Y-%m-%dT%H:%M:%S")) %>%
  mutate(day = get_day(Timestamp)) %>%
  filter(currentMode == "Transport")
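As a small illustration of the parsing step, the sketch below applies date_time_parse() and get_day() to two made-up timestamps written in the same layout as the format string above. The timestamps are hypothetical, and zone = "UTC" is an assumption chosen only to make the example reproducible; the chunk above uses zone = "".

```r
library(clock)

# Hypothetical timestamps in the same layout as the format string above
stamps <- c("2022-03-01T08:15:00", "2022-03-02T17:45:30")

# zone = "UTC" is an assumption made so the example is reproducible
parsed <- date_time_parse(stamps,
                          zone = "UTC",
                          format = "%Y-%m-%dT%H:%M:%S")

class(parsed)     # a POSIXct date-time vector
get_day(parsed)   # day of the month for each timestamp
```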
In the code chunk below, st_make_grid() of the sf package is used to create a 200 by 200 grid of squares covering the extent of the buildings. Next, st_join() is used to perform a point-in-square overlay, while count() of dplyr is used to count the number of points falling within each square. Lastly, left_join() is used to perform a left join with sqr as the target table and points_in_sqr as the join table. The join ID is sqr_id, and squares without any points are given a count of 0.
sqr <- st_make_grid(buildings,
                    n = c(200, 200)) %>%
  st_sf() %>%
  rowid_to_column('sqr_id')
points_in_sqr <- st_join(logs_selected,
                         sqr,
                         join=st_within) %>%
  st_set_geometry(NULL) %>%
  count(name='pointCount', sqr_id)
sqr_combined <- sqr %>%
  left_join(points_in_sqr,
            by = 'sqr_id') %>%
  replace(is.na(.), 0)
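The binning logic above can be tried on a tiny example. The following sketch uses a made-up 2 by 2 study area and five made-up points; the names toy_area, toy_pts, toy_sqr and toy_combined are hypothetical. It assigns the square IDs directly with st_sf() and fills empty squares with coalesce(), a slight variation on the chunk above that gives the same kind of result.

```r
library(sf)
library(dplyr)

# Hypothetical toy data: a 2 x 2 study area and five points
toy_area <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 2)))
toy_pts <- st_as_sf(data.frame(x = c(0.5, 0.6, 1.5, 1.4, 1.6),
                               y = c(0.5, 0.4, 1.5, 1.6, 1.4)),
                    coords = c("x", "y"))

# Build a 2 x 2 grid of squares and give each square an ID
toy_grid <- st_make_grid(toy_area, n = c(2, 2))
toy_sqr <- st_sf(sqr_id = seq_along(toy_grid), geometry = toy_grid)

# Point-in-square overlay, then count the points per square
toy_counts <- st_join(toy_pts, toy_sqr, join = st_within) %>%
  st_drop_geometry() %>%
  count(sqr_id, name = "pointCount")

# Attach the counts to the squares; squares without points get 0
toy_combined <- toy_sqr %>%
  left_join(toy_counts, by = "sqr_id") %>%
  mutate(pointCount = coalesce(pointCount, 0L))

toy_combined
```

Two of the four squares receive points (two and three respectively), and the remaining two keep a count of 0.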
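In the code chunk below, the tmap package is used to create the square binning map. Only squares containing at least one point are drawn, and the counts are grouped into 10 quantile classes.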
tm_shape(sqr_combined %>%
           filter(pointCount > 0))+
  tm_fill("pointCount",
          n = 10,
          style = "quantile",
          palette = "Reds") +
  tm_borders(alpha = 0) +
  tm_layout(title= 'Traffic Congestion',
            title.position = c('right', 'top')) +
  tm_compass(type = "4star",
             size = 3)
As can be seen from the plot above, the roads with the most recorded traffic are those connecting the commercial hubs in the top left, center right and bottom right of the map.