Category Archives: Knowledge Domain Visualization
The great physicist Niels Bohr once said something to the effect that anyone who didn’t come away from reading about quantum mechanics with their head spinning probably hadn’t understood it properly. I have found that, in addition to quantum mechanics, any significant time spent thinking about the nature of disciplines and how best to describe Disaster Studies and Sciences in terms of disciplinarity, also tends to make my head spin.
There is at the moment no shortage of descriptors for disciplinarity: unidisciplinary, multidisciplinary, interdisciplinary, crossdisciplinary, metadisciplinary, hyperdisciplinary, superdisciplinary, supradisciplinary, and transdisciplinary can all be found in the literature. Despite the multitude of terms, no clear, consistent, and agreed typology currently exists. So one has a veritable ocean of ideas within which to swim about. Or drown in.
Of course, there is much that is historical, sociological, political, economic, and just flat out arbitrary in the development of academic and professional “disciplines.” The quest for knowledge using systematic methods predates any attempts to make the pursuit of that knowledge the primary domain of particular individuals. So to some extent, in trying to develop a logical, coherent conception of academic, scientific, and professional disciplines, we are trying to logically explain what was originally created without logic or a “big picture” in mind.
This does not mean that the concept of disciplines is of no value. It means that a degree of fuzziness and inconsistency in our concepts should be expected.
In my thesis, I assert that based upon the existence and structure of its body of knowledge, what is often referred to as “Emergency Management” involves far more disciplines than recognized. I suggest that this field/discipline is more accurately called Disaster Studies and Sciences (DSS). I am beginning to see, however, that simply calling DSS a “discipline” is more problematic. DSS is certainly not unidisciplinary and does not fit within any traditional disciplinary map of knowledge, yet it does have an organized structure. There can be no doubt that DSS is multidisciplinary but it does not appear to transcend its individual disciplines to the level of an interdiscipline or transdiscipline….
And this is the point where my head starts to hurt and I begin to think that some part of the puzzle is just out of reach…..
Have been working on some visualizations using the very large dataset. Here are two of them, a Journal Co-Citation (JCA) Network (above), and a Document Co-citation (DCA) network (below). Enjoy!
Here is a first view of a 5000 node network of most frequently occurring terms in the titles and abstracts of the large, 34,000 record dataset.
The term network provides an aggregate glimpse into how research interests in Disaster Studies and Sciences have been structured across the 110-year period. As is immediately apparent, a substantial volume of the research space is occupied by work on seismic hazards and earthquakes. Additionally, a distinction can be seen between the topics more of interest to pure seismology on the far right, transitioning into topics of more interdisciplinary interest as one moves to the left.
As one continues moving towards the left, other areas of hazard study are encountered, including tsunamis, volcanology, landslides/avalanches, floods, and meteorological events. What is also found is something of a transitional area bridging the more natural science-oriented hazards branch to the social and medical science-orientation of the human dimensions branch: pure and applied sciences and technology, including optics, GIS, remote sensing, and decision-support systems.
It is near this area (near the center left of the network) that one may be quite surprised to find homeland security. However (and this came as a surprise to me as well), there is a substantial body of research in both pure and applied sciences devoted to topics with homeland security and/or emergency management applications, such as biosensors, nuclear weapons material detectors, explosive detectors, biological warfare agent detection, robust ad hoc communication networks, and more.
The center of the human dimensions branch is represented by the “disasters” hub. Around this hub, one can find the research concerns of the geographers, sociologists, city and urban planners, political scientists, physicians, psychiatrists, psychologists, public health researchers, public administration scholars, and others. Some of these clusters are more well-defined than others, but all broadly focus on disasters as human events.
For those who would like to more closely explore the network visualization, you can download VOS Viewer here: http://www.vosviewer.com/download/. You will need to download the map file (https://docs.google.com/file/d/0B7GBghcJr6aJaDc2TDF6NUFMaDg/edit?usp=sharing) and network file (https://docs.google.com/file/d/0B7GBghcJr6aJM002cmtaSU5GcGc/edit?usp=sharing) OR the normalized network file (https://docs.google.com/file/d/0B7GBghcJr6aJRjRtQTRrR0lHTm8/edit?usp=sharing).
These files can then be used in VOS Viewer to generate and explore the network visualization in detail.
Any reports of my demise have been greatly exaggerated….
The past 6-8 weeks have been both hectic and productive….turned 45 years old…..my family finally placed the ashes of my stepfather in their permanent resting place…..upgraded my internet and satellite tv……bought a new car….I am getting all of my teeth replaced in a few days….and I have built and am currently refining a dataset of over 34,000 un-duplicated, raw, Web of Science records related to disasters (mostly), 1903- May 2013.
This is not as easy as it may sound. The dataset combines all of my previous WoS searches with many new results pulled from WoS during August, including the contents of the Bulletin of the Seismological Society of America 1950-2013 (6,000 or so articles). The entire dataset was fed into both EndNote and CiteSpace to remove any duplicates, and to verify the number of records. The precise total number of records, if you wish to know, is 34,553.
This number will likely decrease somewhat in the future, as I have not yet started in earnest weeding out results that are not of at least minimal relevance to the study of disasters and hazards.
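For readers who want to try something similar without EndNote or CiteSpace, the de-duplication step can be roughly sketched in Python. This is not what either program does internally. It is a minimal sketch that assumes each record has been parsed into a dictionary keyed by the WoS field tags UT (accession number), TI (title), and PY (year), and that two records with the same accession number, or failing that the same normalized title and year, are duplicates. The function names and the sample records are made up for illustration.

```python
import re

def record_key(record):
    """Return a key that should be identical for duplicate records.

    Assumption: a record is a dict that may carry a "UT" accession number;
    when it does not, fall back to a normalized title plus publication year.
    """
    ut = record.get("UT", "").strip().upper()
    if ut:
        return ("UT", ut)
    # Lowercase the title and collapse punctuation/whitespace to single spaces.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("TI", "").lower()).strip()
    return ("TI_PY", title, str(record.get("PY", "")).strip())

def deduplicate(records):
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Made-up records for illustration only.
sample = [
    {"UT": "WOS:000000000000001", "TI": "Example title A", "PY": 1999},
    {"UT": "wos:000000000000001", "TI": "EXAMPLE TITLE A", "PY": 1999},
    {"TI": "Example title B", "PY": 2001},
    {"TI": "EXAMPLE TITLE B.", "PY": "2001"},
]
unique_records = deduplicate(sample)
print(len(sample), "records in,", len(unique_records), "records out")
```

Matching on title and year alone will occasionally merge distinct items (errata and reprints, for example), so the fallback key is best treated as a way to flag candidates for manual checking.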
There are two dataset versions: 1) a raw dataset that has not been edited, except for capitalization differences in some records that result from my decision early in my thesis work to make those records all uppercase; and 2) a standardized dataset that will, as far as possible, correct the variations in author and source spellings that are widespread in WoS records.
The standardized version is an ongoing, tedious, literally unending process that can only approach, but never reach, perfection. At the present time I have standardized approximately 800 of the top 1200 Authors of the articles in the dataset. This small task took approximately two weeks. Unfortunately for some names, particularly the Chinese and Taiwanese authors, this appears nearly impossible using only surnames and initials. Many of these authors share names and initials. It may be necessary in some cases to use full names to distinguish between different authors. I think I will tackle this issue at a later date.
My plan is to eventually standardize most of the top 3000 authors. I will then move on to the authors of the cited references, then the sources in those references. This may well take up the remainder of the year.
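To make the name problem concrete, here is a minimal Python sketch of the surname-and-initials approach. It is not the procedure I am actually following by hand. The function names and the author names are hypothetical, and it assumes names arrive in "Surname, Given names" form.

```python
import re
from collections import defaultdict

def surname_initials(name):
    """Reduce 'Surname, Given Names' to an uppercase 'SURNAME INITIALS' key."""
    surname, _, given = name.partition(",")
    # Split given names on spaces, periods, and hyphens; keep first letters.
    initials = "".join(part[0] for part in re.split(r"[\s.\-]+", given) if part)
    return f"{surname.strip().upper()} {initials.upper()}".strip()

def group_variants(names):
    """Map each surname-and-initials key to the set of spellings that produce it."""
    groups = defaultdict(set)
    for name in names:
        groups[surname_initials(name)].add(name)
    return dict(groups)

# Made-up names for illustration only.
names = [
    "Smith, John A.",
    "SMITH, J.A.",
    "Smith, J A",
    "Wang, Li",
    "Wang, Lei",
]
groups = group_variants(names)
for key, variants in sorted(groups.items()):
    print(key, "<-", sorted(variants))
```

The three Smith spellings collapse into one key, which is the desired result. The two Wang entries also collapse into one key even though they are different given names, which is exactly the ambiguity described above and the reason full names may be needed in some cases.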
From the dataset, here is the bibliographic coupling network of the 1200 most frequent authors of the 34,553 articles in the dataset using VOS Viewer…some clear clusters appear to emerge:
Here is the Bibliographic Coupling Network of Journals in the dataset:
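For anyone unfamiliar with the measure: two authors (or journals) are bibliographically coupled when their reference lists share cited works, and the more they share, the stronger the link. The Python sketch below counts raw shared references only. VOS Viewer applies its own normalization and layout on top of counts like these, which is not reproduced here, and the labels are placeholders rather than records from the dataset.

```python
from collections import defaultdict
from itertools import combinations

def coupling_strengths(citing):
    """Count shared cited references for every pair of citing items.

    `citing` maps an item (author, journal, or document) to the set of
    references it cites. Pairs with nothing in common are omitted.
    """
    # Invert the mapping: for each reference, which items cite it?
    cited_by = defaultdict(set)
    for item, refs in citing.items():
        for ref in refs:
            cited_by[ref].add(item)
    # Every pair of items citing the same reference gains one unit of strength.
    strengths = defaultdict(int)
    for items in cited_by.values():
        for a, b in combinations(sorted(items), 2):
            strengths[(a, b)] += 1
    return dict(strengths)

# Placeholder labels, not records from the dataset.
citing = {
    "Author A": {"ref1", "ref2", "ref3"},
    "Author B": {"ref2", "ref3", "ref4"},
    "Author C": {"ref4"},
    "Author D": {"ref5"},
}
strengths = coupling_strengths(citing)
for pair, shared in sorted(strengths.items()):
    print(pair, shared)
```

Co-citation is the mirror image of this: instead of counting references shared by two citing items, one counts how often two cited references appear together in the same reference list.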
I am making both datasets publicly available. The raw, unedited dataset and the most recent Standardized dataset are both contained in a folder that can be accessed using this Google Drive link: https://drive.google.com/folderview?id=0B7GBghcJr6aJM3NUOUJwWXFKZGs&usp=sharing . I also intend to make an EndNote file available in the near future.
I will be posting more network images soon. Also, please note my primary contact email has now changed to firstname.lastname@example.org. You can still also reach me at email@example.com.
As you may recall in my last post, I have just discovered overlay mapping and its use in mapping knowledge domains. I am reading and digesting several papers written by several prominent scholars, including Loet Leydesdorff, Ismael Rafols, Chaomei Chen, and Alan Porter.
To refresh your memory: in overlay mapping, data is superimposed upon an existing base map. In this case, the base maps are of science, as represented in the journal-to-journal citing and cited by patterns among the journals indexed in Web of Science. The overlay data can consist of the WoS-indexed articles published by specific authors, group of authors, institutions, specialties, fields, disciplines, etc., within a particular time period. You can also choose to create an overlay based on the source article data, or the cited references within those articles.
The method of overlay mapping thus allows comparison between the data and the base map, and between different overlay maps. According to Leydesdorff, though, there are limits to the types of between-map quantitative comparisons that can be made when using VOS Viewer to visualize the data. I will not bother you with the exact details, but just know this: if a journal node in one map is twice as big as the same journal node in a different map (with the same viewing settings for both maps), that does not mean the journal is twice as frequent in the first map….it only means it is more frequent.
On this page of his website, Leydesdorff has instructions and programs that will convert WoS data into overlay maps, as well as links to PDF versions of papers on overlay mapping. The programs are fairly easy to use, and one can create a large number of different maps that can be viewed in VOS Viewer in a relatively short amount of time. CiteSpace can also be used, according to some of Chen’s most recent documentation on CiteSpace, but I am only beginning to investigate that process.
So let us see a few more examples of the nifty things you can do with the overlays:
Neat, eh? Now let’s take a look at using the method to see what is revealed about different journals and different authors in the world of disasters:
There are limitations to these overlays, of which the main one is that these maps will only show journals that are indexed in WoS. So for example, articles published in the Australian Journal of EM or the International Journal of EM, or citations to these journals in WoS-indexed articles, do not appear on the base map or overlays. Despite this fact, the technique appears to offer another way to view the field, one that is complementary to the results produced by co-citation analysis.
I was originally planning to write about some interesting work being done by several information science researchers, including Loet Leydesdorff, Ismael Rafols, Chaomei Chen, and Alan Porter.
Their work improves on earlier work by Vargas-Quesada, De-Moya-Anegón, Chinchilla-Rodríguez, and González-Molina (2006), as well as others, to map the entire intellectual domain of science. The recent work involves creating overlays that can be combined with base structural maps of science derived from citation patterns within Web of Science-indexed journals. The overlays allow the output of a particular author, journal, institution, or field, to be visualized within the entire domain of science. These base maps, one of which is shown above, might help provide secondary confirmation of the structures I have found in my co-citation networks.
So, I was preparing images of the base maps and marking where different disaster-related journals referenced in my networks were located within. Then I noticed something. I had seen the structure of the base map before.
With my original background in neuropsychology, I have spent a lot of time looking at images of the human brain (and in one undergraduate course was tested on neuroanatomy using actual human brain sections the instructor kept in a large glass jar of formaldehyde in his office…..). The structure of science looks remarkably similar to a cerebral hemisphere seen from the side (lateral view). Even the way the major disciplines cluster along the outside of the network, with the curve creating an interior space, is similar to the curve of the temporal lobe and the relationship between gray and white matter:
The resemblance was so peculiar that I emailed Dr. Leydesdorff, who is in Amsterdam, to ask if anyone had noticed the similarity before. He replied rather quickly that he had not noticed it before but that yes, the similarity was striking. Whether there is any significance to the similarity, he could not say one way or another. I can only speculate that if the structural similarity cannot be shown to be coincidental or arbitrary, then it suggests there is something necessary about that structure that makes it desirable to have, for both brains and scientific knowledge domains. But the idea that the structure of knowledge somehow echoes the structure of our own brain is a very odd idea indeed!
Here are some additional images of neural pathways and networks that have been produced by the Human Connectome Project, which seeks to completely map all of the brain’s neural pathways and connections–a definitive wiring diagram for the human brain.
Perhaps someone else sees the similarity I do between these images and the network of science…..
In Part II, I will return to my original intention of discussing the base maps of science in relation to my own attempts to map the structure of Disaster Studies and Sciences.
Having thoroughly depressed myself in the last post by pondering one of many possible unpleasant futures awaiting humanity (and that does not even include possibilities such as the “technological singularity” some futurists have hypothesized may be approaching, perhaps as early as mid-century, beyond which future human history becomes impossible to predict), let us return to the year-by-year document co-citation analysis of my modified thesis dataset.
We now arrive at 2004.
Compared to the previous year, this year is somewhat more fragmented, with a larger number of individual structures visible, as well as a greater number of clusters. We do, however, see a broad mix of hazard science and the human dimensions of disaster, including a structure in which a social science cluster and a hazard science cluster are linked together (Cluster 7 and Cluster 8). The two largest clusters this year are the humanitarianism/development cluster (Cluster 9) and the landslide/susceptibility cluster (Cluster 6). Some of the significant works cited this year include those by Guzzetti, the Sphere Project, Peacock, and McCarthy’s contribution to the IPCC 2001 climate change report.
Network Summary, Organized by Cluster Membership:
Citation Burst Information:
The 2003 visualization is notable for a couple of reasons. First, the visualization is dominated by a single, multi-cluster, multi-disciplinary, structure. This structure contains 6 of the 9 total clusters, and 87 of the 100 most cited references for 2003.
The left, less dense, side of the structure contains 5 clusters: Cluster #7 (emergencies; systems; humanitarian) includes four works related to complex emergencies and humanitarianism that have consistently appeared in previous years; Cluster #6 (emergency; preparedness) includes key works by Mileti, Tierney, and Drabek; Cluster #5 (vulnerability; impact; hazard; flood) includes the IPCC 2001 Summary for Policy Makers, Ian Davis’ 1978 work on disaster shelter, and Etkin’s 1999 article on risk transference; Cluster #4 (which includes the IPCC 2001 scientific report, Pielke and Landsea’s 1998 work on normalized hurricane damage, and Changnon et al.’s 2001 paper on losses from extreme weather events) pertains to the possible meteorological impacts of climate change. Cluster #2 is the largest, with 44 members. It is also the most heterogeneous of the clusters, making it difficult to classify with a single label. Works in this cluster include: Cannon’s vulnerability analysis contribution to 1994’s Disasters, Development, and Environment; White’s 1974 Natural Hazards; and Kunkel et al.’s 1999 paper on extreme precipitation trends. The cluster boundaries encompass Granger et al.’s 1999 multi-hazard risk assessment of Cairns, Australia; Fell’s 1997 book on landslide risk assessment; and extend across the right side of the structure to include several other works on landslide hazards.
The remainder of nodes on the right side of the structure belong to Cluster #3 (maps; susceptibility), the second largest cluster with 28 members. Works in the cluster pertain to remote sensing/GIS applications and landslide hazards, including Varnes’ 1984 work on landslide hazard zonation.
Two of the remaining three small clusters (#0 and #1) pertain to seismic hazards. The final cluster (#8) relates to tsunamis.
What also makes the 2003 network notable is the large number of high centrality nodes (nodes with pink rings). Eleven nodes have centrality values equal to or greater than 0.10. This is by far the greatest number of high centrality nodes to appear so far. These nodes, whether they are cited frequently or infrequently, usually serve as “gateways” between clusters or different parts of the network. Nodes with both high citation frequency and high centrality (such as the works by Mileti, Hewitt, Chambers, Granger, Fell, White, Brabb, and Varnes) are, in keeping with Chen’s purpose in developing CiteSpace, candidates as possible turning points in a knowledge domain. This is especially the case if examination of the merged network (which includes all of the individual time slices) shows the node is a link between different time periods. In some studies using CiteSpace, such as one examining research on dinosaur extinction, Chen confirmed these turning points by sending questions about the importance of particular works to key authors identified in the network. This works well for scientific research domains within a single discipline, where theories and schools of thought are more clearly delineated, and more linear, than they may be in an evolving, multi-disciplinary knowledge domain.
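To illustrate why “gateway” nodes score highly, here is a small Python sketch. It assumes the centrality values reported are normalized betweenness centrality, which is the measure Chen describes for CiteSpace. The code is a plain implementation of Brandes' algorithm on a toy graph, not CiteSpace's own code, and the node labels and the `build_graph` helper are made up.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency mapping from a list of node pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def betweenness_centrality(graph):
    """Normalized betweenness centrality for an unweighted, undirected graph.

    Brandes' algorithm: one breadth-first search per source node, followed
    by a backward pass that accumulates each node's share of shortest paths.
    """
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        order = []                                  # nodes in BFS visiting order
        predecessors = {node: [] for node in graph}
        paths = {node: 0 for node in graph}         # shortest-path counts
        distance = {node: -1 for node in graph}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:                 # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    n = len(graph)
    # Each unordered pair is counted twice above, so dividing by
    # (n - 1)(n - 2) rescales the values to the range 0..1.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {node: value * scale for node, value in centrality.items()}

# Toy network: two tight triangles joined only through node "g".
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "g"), ("g", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
scores = betweenness_centrality(build_graph(edges))
high = sorted(node for node, score in scores.items() if score >= 0.10)
print(high)
```

In the toy network, the node that bridges the two triangles and the two nodes it attaches to are the only ones above the 0.10 threshold, while the remaining triangle members score zero, however many links they have within their own cluster.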
Network Summary, Organized by Cluster Membership:
Citation Burst Information:
The 100 most cited documents in the dataset for 2002 form thirteen clusters. This year sees Mileti’s Disasters by Design occupy the spotlight as the primary landmark node in the network. Co-citation links are found between it, At Risk, and Hewitt’s Regions of Risk, among others. Included in this co-citation cluster (#11-drought) is also a 1991 paper by DE Alexander on use of information technology for real-time disaster monitoring. The works in this cluster were cited in a variety of contexts, as evidenced by the cluster summary. Other possible labels for this cluster include: community resilience; vulnerability; evacuation; impact; health; volcanic hazards; and others.
Additional social science clusters can also be found. These include clusters related to seed relief/seed security (#2 diversity); and a small humanitarianism cluster (#5- humanitarian).
Another social science article of note is one not linked by co-citation to any other paper in 2002’s 100 most cited. It is a member of a small cluster (#10-earthquake/community preparedness) that includes DE Alexander’s 2000 paper in Disaster Prevention and Management on use of scenarios for teaching emergency management; and JI Abrams’ 1993 paper on earthquake prehospital mortality patterns. The article is Wise’s December 2002 “Organizing for Homeland Security”. This makes it the first 9-11-related paper to appear among the 100 (although it was published in Dec 2002, it was cited in articles by Waugh and Kirlin in the same issue of Public Administration Review). It also may mark the entry of Public Administration as a significant disciplinary input into the disaster literature. This is not to say that PA was not involved previously, only that its participation and importance will become more pronounced from this point forward.
Within hazards research, there are two structures of note. The first is a well-formed structure of four linked clusters (#6, 7, 8, 9) related to seismic hazards. This contains works by Papazachos, Kanamori, Cornell, Tselentis, Tinti, and others. Cluster 6 was cited in relation to the 1999 Athens earthquake. Cluster 7 was cited in relation to tsunamis and the Cascadia subduction zone. Cluster 8 relates to seismic hazard assessment in general. Cluster 9 pertains to seismic assessment of Lake Nasser and the proposed Kalabsha Dam in Egypt.
The second is a well-formed hazard cluster pertaining to avalanches, snow avalanches, and avalanche forecasting (#3- large).
Citation Burst Information:
The 100 most cited documents in the dataset for 2001 are organized into eleven identifiable clusters. This year, the work of two geographers, Hewitt’s Regions of Risk, and Alexander’s Confronting Catastrophe, take center stage. Both belong to a multi-disciplinary “vulnerability” cluster (#5) that also includes At Risk, and Hoffman and Oliver-Smith’s The Angry Earth. This cluster is linked to what should by now be becoming a familiar cluster of documents: the conflict/humanitarian/chronic emergencies cluster (#4-“context”). There is also a secondary un-linked “emergencies” cluster related to child refugee health and nutrition deficiencies.
Other important cited references standing out in 2001 are Newhall et al.’s 1982 paper introducing the Volcanic Explosivity Index (VEI) and Hanks and Kanamori’s 1979 Moment Magnitude Scale paper. These are part of a three-cluster structure (#2, #7, #8) cited in relation to tsunami hazards, which also includes Bryant et al.’s 1996 paper on tsunamis’ role in coastal evolution. There is a fairly dense cluster (#6- system) connecting papers on expert systems to assessment of landslide hazards. Climate change issues are also beginning to enter the top 100 (#8- climate).
Absent from the network are any references connected to the 9-11 terrorist attacks. This is to be expected. Due to the nature of academic publication, there is a lag between actual events and publication of journal articles related to those events. From previous years’ results, this lag time can be roughly estimated at a minimum of 1-2 years (the estimate comes from looking at the difference between papers’ year of publication and the year of first appearance in the network). We might expect to see some influence of 9-11 in 2002’s most cited references, but it is more likely to be seen in 2003 and beyond.
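The lag estimate described above is simple arithmetic, sketched here in Python. The reference labels and years are placeholders chosen for illustration and are not values taken from the dataset. The function name is hypothetical.

```python
from statistics import median

def first_appearance_lags(references):
    """Return, per reference, the years between publication and first appearance.

    `references` maps a label to a (publication_year, first_year_in_network) pair.
    """
    return {
        label: first_seen - published
        for label, (published, first_seen) in references.items()
    }

# Placeholder values for illustration; not taken from the dataset.
references = {
    "ref A": (1999, 2001),
    "ref B": (2000, 2001),
    "ref C": (1996, 2001),
    "ref D": (2001, 2003),
}
lags = first_appearance_lags(references)
print("minimum lag:", min(lags.values()))
print("median lag:", median(lags.values()))
```

The minimum gives the fastest route from publication into the most-cited list, while the median gives a sense of the typical delay. Either figure depends on how many references make the cut in each time slice.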
Citation Burst Information: