Sunday, 8 June 2014

Web Science: It's All in the Mind 

University of Southampton, Electronics & Computer Science

OVERVIEW: This year we celebrate the 25th anniversary of the World Wide Web. Twenty-five years ago there were no web sites; by 1994 there were 800, and today it is estimated there are nearly a billion. This growth is not solely down to the technology: it is because we - as individuals, organisations and society - create the content that makes the Web grow. This socio-technical aspect of the Web was the founding principle of Web Science. In this talk we will discuss the theory and practice of Web Science – past, present and future – and conjecture the nature of collective intelligence on the Web. Will the Web ever develop a mind of its own?

    Berners-Lee, T., Hall, W., Hendler, J., Shadbolt, N., & Weitzner, D. (2006). Creating a Science of the Web. Science, 313(5788), 769-771.
    Berners-Lee, T., Hall, W., Hendler, J. A., O'Hara, K., Shadbolt, N., & Weitzner, D. J. (2006). A framework for web science. Foundations and Trends in Web Science, 1(1), 1-130.
    Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T., & Weitzner, D. (2008). Web science: an interdisciplinary approach to understanding the web. Communications of the ACM, 51(7), 60-69.
    O'Hara, K., Contractor, N. S., Hall, W., Hendler, J. A., & Shadbolt, N. (2013). Web Science: understanding the emergence of macro-level features on the World Wide Web. Foundations and Trends in Web Science, 4(2-3), 103-267.

    Tiropanis, T., Hall, W., Shadbolt, N., De Roure, D., Contractor, N., & Hendler, J. (2013). The Web Science Observatory. IEEE Intelligent Systems, 28(2), 100-104.

Towards a Global Brain: The Web as a Self-organizing, Distributed Intelligence

Vrije Universiteit Brussel, ECCO - Evolution, Complexity and Cognition research group


OVERVIEW: Distributed intelligence is an ability to solve problems and process information that is not localized inside a single person or computer, but that emerges from the coordinated interactions between a large number of people and their technological extensions. The Internet and in particular the World-Wide Web form a nearly ideal substrate for the emergence of a distributed intelligence that spans the planet, integrating the knowledge, skills and intuitions of billions of people supported by billions of information-processing devices. This intelligence becomes increasingly powerful through a process of self-organization in which people and devices selectively reinforce useful links, while rejecting useless ones. This process can be modeled mathematically and computationally by representing individuals and devices as agents, connected by a weighted directed network along which "challenges" propagate. Challenges represent problems, opportunities or questions that must be processed by the agents to extract benefits and avoid penalties. A link's weight is increased whenever agents extract benefit from the challenges propagated along it. My research group is developing such a large-scale simulation environment in order to better understand how the web may boost our collective intelligence. The anticipated outcome of that process is a "global brain", i.e. a nervous system for the planet that would be able to tackle both global and personal problems.
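As a rough illustration of the kind of model described above (not the research group's actual simulation environment), the following Python sketch propagates challenges along a weighted directed network and reinforces links that deliver benefit. Everything specific here is an assumption made for illustration: the function name `simulate`, the per-agent `skill` probability, the additive reward, and the multiplicative decay used to stand in for "rejecting useless links".

```python
import random


def simulate(num_agents=5, steps=200, learning_rate=0.1, seed=0):
    """Toy challenge-propagation model (illustrative assumptions only)."""
    rng = random.Random(seed)

    # Assumption: each agent has a fixed probability of extracting
    # benefit from any challenge it receives.
    skill = [rng.random() for _ in range(num_agents)]

    # weights[a][b] is the weight of the directed link a -> b.
    weights = {
        a: {b: 1.0 for b in range(num_agents) if b != a}
        for a in range(num_agents)
    }

    for _ in range(steps):
        # A challenge arises at a random agent, which forwards it along
        # one outgoing link chosen in proportion to the link weights.
        sender = rng.randrange(num_agents)
        targets = list(weights[sender])
        link_weights = [weights[sender][t] for t in targets]
        receiver = rng.choices(targets, weights=link_weights)[0]

        if rng.random() < skill[receiver]:
            # Benefit extracted: reinforce the link that carried the challenge.
            weights[sender][receiver] += learning_rate
        else:
            # Assumption: unhelpful links are weakened rather than deleted.
            weights[sender][receiver] *= 1.0 - learning_rate

    return skill, weights


if __name__ == "__main__":
    skill, weights = simulate()
    for agent, links in weights.items():
        best = max(links, key=links.get)
        print(f"agent {agent} most trusts agent {best} (skill {skill[best]:.2f})")
```

Under these assumptions, links pointing to agents that handle challenges well tend to gain weight over time, which is one simple way to picture the self-organization the abstract refers to.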


    Heylighen, F. (2014). Return to Eden? Promises and Perils on the Road to a Global Superintelligence. In B. Goertzel and T. Goertzel (Eds.), The End of the Beginning: Life, Society and Economy on the Brink of the Singularity.
    Heylighen, F. (2013). Self-organization in Communicating Groups: the emergence of coordination, shared references and collective intelligence. In Complexity Perspectives on Language, Communication and Society (pp. 117-149). Springer Berlin Heidelberg.

Mapping the Brain Connectome

Montreal Neurological Institute 
McGill University, Biomedical Engineering

OVERVIEW: The study of macroscopic neural connectivity using neuroimaging has exploded in recent years, with applications in many areas of clinical and basic neuroscience. These approaches yield metrics of information flow across a network that are not accessible with focal metrics such as functional activation, metabolism or anatomical morphometry. However, there remain fundamental issues, both technical and conceptual, in reducing connectivity information from different imaging techniques into a holistic model of neural connectivity. We will discuss different forms of connectivity, as defined by structural and functional correlation (MRI, fMRI, PET) and DTI tractography, with illustrations in the normal and disordered brain.
    He, Y., & Evans, A. (2010). Graph theoretical modeling of brain connectivity. Current Opinion in Neurology, 23(4), 341-350.
    Bullmore, E. T., & Bassett, D. S. (2011). Brain graphs: graphical models of the human brain connectome. Annual Review of Clinical Psychology, 7, 113-140.
    Sporns, O., Tononi, G., & Kötter, R. (2005). The human connectome: a structural description of the human brain. PLoS Computational Biology, 1(4), e42.

Web Impact on Society 

University of Southampton, Web Science

OVERVIEW: The Web is not just an engineered technical artefact because the Web architecture (HTTP, HTML and URIs) is only the kernel of an enormously complex socio-technical machine. Phenomena like online banking, Web TV, internet shopping, e-government and social networking are the names that we give to human activities and human agendas that have co-opted the capabilities of this web architecture. While we may look to the Web to offer a source of "big data" for "social analytics", one of the goals of Web Science is to try to find a perspective that helps us to understand the bigger "socio-technical" picture of the Web, and hence to better interpret the data that we harvest from the Web. By looking at specific examples of how the Web has grown and developed (such as open access, open government data), we can start to see some of the principles and mechanisms of the socio-technical Web.
    Tinati, R., Carr, L., Halford, S., & Pope, C. (2013). The HTP Model: Understanding the Development of Social Machines. WWW2013 Workshop: The Theory and Practice of Social Machines.
    Tinati, R., Carr, L., Halford, S., & Pope, C. (2014). (Re)Integrating the Web: Beyond ‘Socio-Technical’. WWW2014.

Open Science and the Web

Microsoft Research Connections


OVERVIEW: Turing Award winner Jim Gray envisioned a world where all research literature and all research data were online and interoperable. He believed that such a distributed, global digital library could significantly increase the research "information velocity" and improve the scientific productivity of researchers. The last decade has seen significant progress in the move towards open access to scholarly research publications and the removal of barriers to access and re-use. But barrier-free access to the literature alone only scratches the surface of what the revolution of data-intensive science promises. Recently, in the US, the White House has called for federal agencies to make all research outputs (publications and data) openly available. But in order to make this effort effective, researchers need better tools to capture and curate their data, and Jim Gray called for 'letting 100 flowers bloom' when it came to research data tools. Universities have the opportunity and obligation to cultivate the next generation of professional data scientists who can help define, build, manage, and preserve the necessary data infrastructure. This talk will cover some of the recent progress made in open access and open data, and will discuss some of the opportunities ahead.

    Fox, G., Hey, T., & Trefethen, A. (2013). Where Does All the Data Come From? Data-Intensive Science, 115.
    Hey, T. (2010). The next scientific revolution. Harv Bus Rev, 88(11), 56-63.
    The Fourth Paradigm: Data-Intensive Scientific Discovery (2009). Book.

Scholarly Big Data: Information Extraction and Data Mining

Pennsylvania State University, Information Sciences and Technology

OVERVIEW: Collections of scholarly documents are usually not thought of as big data. However, large collections of scholarly documents often have many millions of publications, authors, citations, equations, figures, etc., and large-scale related data and structures such as social networks, slides, data sets, etc. We discuss scholarly big data challenges, insights, methodologies and applications. We illustrate scholarly big data issues with examples of specialized search engines and recommendation systems based on the SeerSuite software. Using information extraction and data mining, we illustrate applications in such diverse areas as computer science, chemistry, archaeology, acknowledgements, citation recommendation, collaboration recommendation, and others.

    Khabsa, M., & Giles, C. L. (2014). The Number of Scholarly Documents on the Web. PLOS ONE. doi:10.1371/journal.pone.0093949
    Caragea, C., Wu, J., Ciobanu, A., Williams, K., Fernández-Ramírez, J., Chen, H. H., ... & Giles, L. (2014). CiteSeerX: A Scholarly Big Dataset. In Advances in Information Retrieval (pp. 311-322). Springer International Publishing.

    Flake, G. W., Lawrence, S., Giles, C. L., & Coetzee, F. M. (2002). Self-organization and identification of web communities. Computer, 35(3), 66-70.

New Models of Scholarly Communication for Digital Scholarship

University of Pittsburgh, School of Information Science

OVERVIEW: Contemporary research and scholarship increasingly use large-scale datasets and computationally intensive processing. Cultural shifts in the scholarly community challenge long-standing practices of academic institutions and call into question the efficacy and fairness of traditional models of scholarly communication. Scholars are also calling for greater authority in the publication of their works and rights management. Agreement is growing on how best to manage and share massive amounts of diverse and complex information objects. Open standards and technologies allow interoperability across institutional repositories. Content-level interoperability based on semantic web and linked open data standards is becoming more common. Information research objects are increasingly thought of as social as well as data objects - promoting knowledge creation and sharing and possessing qualities that promote new forms of scholarly arrangements and collaboration. This talk will present alternative paths for expanding the scope and reach of digital scholarship and robust models of scholarly communication necessary for full reporting. The overall goals are to increase research productivity and impact, and to give scholars a new type of intellectual freedom of expression.

    Griffin, S. (2013). Scholarly Communication: New Models for Digital Scholarship Workflows. Coalition for Networked Information, Spring 2013 Meeting.
    Griffin, S., et al. (2014). The Denton Declaration: An Open Data Manifesto.
    Borgman, C. L. (2013). Digital Scholarship and Digital Libraries: Past, Present, and Future. Theory and Practice of Digital Libraries Conference, September 2013.

    Calhoun, K. (2014). Exploring Digital Libraries: Foundations, practice, prospects. Facet Publishing, London, UK.