2017 RSE-MOST workshop prelude – Perspectives for Data Science

Time: 10/23/2017 3:30PM-4:30PM

Location: Science Building 3 B1 Conference Room

Talk Title 1: Big Questions, Informative Data, Excellent Science
Professor Adrian Bowman FRSE
School of Mathematics and Statistics, University of Glasgow

Talk Abstract:
The quantity of data available in the world is increasing at an astonishing rate.
There is a welcome move towards more ‘open data’ practices while at the same time the technology
for data capture has developed very rapidly. Huge databases are being assembled in genetics,
astronomy, physics, neuroscience, finance, commerce and many other areas. This raises exciting
prospects for new analysis and understanding. However, the size of a dataset is not always its most
important aspect. The driving force should be ‘big questions’ and the extent to which the data carry
information to help in answering them. In the inevitable presence of variation, expressing the
processes at work through an imaginative statistical model provides a hugely valuable tool,
although new ways of implementing, analysing and interpreting the outcomes are likely to be
required. Quantifying uncertainty is also crucial for good decision-making. These issues will be
illustrated in the context of medical imaging, environmental modelling and urban social science.

Talk Title 2: Boolean Networks for Big Data
Professor Henry Horng-Shing Lu
Institute of Statistics, National Chiao Tung University

Talk Abstract:
One great challenge of genomic research is to efficiently and accurately identify
complex gene regulatory networks. High-throughput technologies provide abundant experimental
data, such as DNA sequences, protein sequences, and RNA expression profiles, which make it possible
to study interactions and regulation among genes and other substances in an organism. Inferring
genetic regulatory networks from gene expression profiles and protein interaction data is crucial
for systems biology. We will discuss an approach to reconstructing time-delay Boolean networks as a
tool for exploring biological pathways. Specifically, we will show that O(log n) state transition
pairs are necessary and sufficient to reconstruct a time-delay Boolean network of n nodes with high
accuracy if the number of input genes to each gene is bounded. Other related approaches will also
be discussed.