
Category Archives: Citation and Co-citation Analysis


Document Co-Citation Analysis 1994-2013: 1999

DCA 1999-100

DCA 1999 100 Most Cited (click to enlarge)

The 100 most cited documents for 1999 organize into 18 individual clusters. The year is dominated by: 1) a large, well-cited landslide hazard and GIS research cluster (Varnes, Chung, and Carrara); and 2) two joined clusters of social science research. The Quarantelli, Bolin, and WA Anderson cluster concerns community disaster management as well as ethnicity. It is linked to the humanitarian/household needs assessment cluster containing Chambers and the Sphere Project. The two clusters are linked together by At Risk. The majority of the remaining clusters are related to geomorphology (landslide and seismic processes), including Smalley’s 1987 work examining earthquakes as a critical process.

Cluster summary:

7-25-2013 4-53-56 PM

Network Narrative: 

7-25-2013 5-00-40 PM

Citation Burst Information:

7-24-2013 3-00-28 PM

Document Co-Citation Network of 100 Most Cited, 1995-2013: 1998

1998 DCA 100 Most Cited: Wide-angle view

DCA 1998

DCA 1998: Large Version. (Click to enlarge).

1998 heralds the first appearance of At Risk, and the appearance of two multi-cluster structures. 

The At Risk cluster also includes Neal and Phillips’ paper on the emergent human resources model, and Fothergill’s article on gender in disaster. This is the first emergence of vulnerability/social vulnerability in the dataset. This cluster is linked to a complex emergencies/humanitarianism/development cluster (with Anderson’s Do No Harm) through Slim’s article on humanitarianism and De Waal’s paper on contemporary warfare in Africa. Thus, early on, At Risk was closely associated with international development issues. Another vulnerability-related cluster can be seen to the upper right (#0-child survival), with several works concerning women’s and children’s health, particularly in terms of refugees and conflict. Several seismic/earthquake-related clusters have also now linked together.

A reminder: the cluster terms generated by CiteSpace (selected using the results of three different algorithms, using either keywords, title terms, or abstracts) in the visualizations are based on terms used in the citing/source articles, not the cited articles that make up the visualizations.  Thus, CiteSpace generates cluster labels based on how the cluster is cited by the articles in the dataset, not through analysis of the cited references (which would not be possible without having at least an abstract for every cited reference). These terms are not always very illuminating, and examination of the content of the cited articles (thank you Google!!!)  is almost always necessary for understanding a cluster.

Cluster Summary and Narrative:

7-25-2013 10-00-33 AM

7-25-2013 10-03-19 AM

Citation Bursts:

7-22-2013 8-22-20 PM

Document Co-Citation Analysis 1995-2013: 1997

DCA 1997

1997 DCA. Quite a few nice-looking clusters have emerged…and we now have two linked clusters (#13 preparedness and #11 management), still awaiting the development of more between-cluster links.

Cluster summary:

7-22-2013 1-53-33 AM

Network Narrative: 

7-24-2013 11-25-56 PM

Citation Burst Information:

7-19-2013 6-31-06 AM

In 1997 we see 14 clusters encompassing a variety of research topics. Key works by Hewitt, Smith, and Drabek appear for the first time. Two linked clusters have emerged this year, and for the first time, a multi-disciplinary social science/geography collection of articles represents the largest cluster (#13 preparedness, which even includes a seismic work by Ambraseys). We are beginning to see some works appear consistently from year to year. Hewitt’s work shows a relatively strong 5-year citation burst, and Smith has a weaker but longer 12-year burst. These bursts, again, will not start until after 2000. Complexity science and the relationship of criticality to hazard phenomena have now become a research topic, as Bak is present in two separate hazard-related clusters.

DCA 1995-2013: 1996

DCA 1996

Continued evolution and cluster consolidation is evident in 1996’s 100 most cited, particularly in the areas of seismic hazard analysis/engineering, and disasters/development. Some well-known authors and works also make their appearance in the 1996 network (click image to enlarge).

CiteSpace-generated cluster summary and network narrative:

7-19-2013 5-11-53 PM

7-24-2013 12-34-15 PM7-24-2013 12-34-47 PM7-24-2013 12-35-13 PM

Citation Bursts:

7-19-2013 6-40-13 PM

For 1996, we have for the first time some nodes that stand out in the visualization, relative to the others: Ian Burton’s classic work Environment as Hazard (Environmental Science/Human Ecology); CA Cornell (seismic engineering/hazard assessment); MB Anderson, FC Cuny, and A De Waal (disasters and development); and DE Alexander’s classic 1993 work that is now part of a small Geography and Hazard cluster. In addition to the large seismic hazard cluster, a volcanic hazard cluster and a tsunami hazard cluster have emerged. What is interesting about the volcanic co-citation cluster is that this is where Burton’s work makes its first appearance. LK Comfort’s 1988 work on managing disasters also appears in a cluster this year.

Still, no between-cluster links have emerged…according to Chen, works that are considered “turning points” often link different clusters together, particularly when links are established across periods of time. Perhaps in later years we will see some works emerge among the 100 most cited that  link different clusters together.  This would be indicated by a document node between clusters that has high centrality.

Also, notice that there is quite a delay between the year of publication, year of first citation (1996), and the “burst” years for Burton, Cornell, and Anderson. It appears as if their works will not be taken up by many writers for nearly 10 more years.

What will 1997 look like?…

The Cited References of Disaster Studies and Sciences 1995-2013, One Year at a Time: 1995

DCA 1995

The 100 most cited references for 1995 from the revised thesis dataset of 2930 articles from 13 journals. CiteSpace’s clustering function has been used to determine and name the clusters (in gray). Click to enlarge image.

At the workshop, the esteemed Claire Rubin asked if it was possible to visualize individual years rather than aggregating the entire time period in one slice. Yes, it is…though for 19 years, having 19 visualizations running on your computer simultaneously eats quite a bit of processing and video graphics resources. So for practical reasons it is better to run a five-year period in 1-year slices. So here is the first of what will be 20 posts covering a Document Co-Citation Analysis of the 100 most cited for each year, 1995-2013. The 20th post will show the merged network of all 19 individual years as one slice. To understand these one-year slices, keep the following in mind:

1. Node size and labels are generally proportional to the citation frequency of the document.

2. The first year a document appears, its node and label will be proportional to the total citation frequency across the entire time period. The size and color of the rings in the node indicate the year and number of citations received. The color bar across the top of the visualization displays the colors for each year.

3. Each subsequent year the same document appears, its node and label will only indicate the number of citations received for that year.

4.  Red rings indicate a “burst” of citations received during that time, which indicates an increase in interest in that work.

5. A pink ring around a node indicates high betweenness centrality (>= 0.10).
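To make point 5 a little more concrete, here is a minimal Python sketch of how betweenness centrality can be computed for a small co-citation network and compared against the 0.10 threshold. It is only an illustration under stated assumptions: it uses Brandes' algorithm on an unweighted, undirected graph, it assumes the threshold applies to the normalized score, and the function names (`betweenness`, `pink_ring`) and the toy network are hypothetical. It is not CiteSpace's own code.

```python
from collections import deque


def betweenness(adj):
    """Normalized betweenness centrality (Brandes' algorithm).

    adj: dict mapping each node to a list of its neighbours
    (undirected, unweighted graph).
    """
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {v: [] for v in adj}
        n_paths = dict.fromkeys(adj, 0)
        n_paths[source] = 1
        dist = dict.fromkeys(adj, -1)
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(adj)
    # Each undirected pair is visited twice, so dividing by (n-1)(n-2)
    # yields a value between 0 and 1.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: s * scale for v, s in score.items()}


def pink_ring(adj, threshold=0.10):
    """Nodes whose normalized betweenness meets the threshold."""
    return sorted(v for v, s in betweenness(adj).items() if s >= threshold)


# Hypothetical toy network: two small clusters joined by one document.
toy_network = {
    "a1": ["a2", "bridge"],
    "a2": ["a1", "bridge"],
    "bridge": ["a1", "a2", "b1", "b2"],
    "b1": ["b2", "bridge"],
    "b2": ["b1", "bridge"],
}
print(pink_ring(toy_network))
```

In this toy network only the document sitting between the two clusters clears the threshold, which is the kind of between-cluster node the pink ring is meant to flag.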

Here are images of the CiteSpace-generated cluster summary and network narrative:

Cluster Summary

1995 Narrative

Citation Bursts for 1995, indicating strength and duration of burst:

7-24-2013 11-56-33 AM

So what do we see in this slice, derived from 24 source articles?

We have multiple, small, well-defined, but isolated, clusters.  The clusters have no links to each other.  In the knowledge domain visualization literature, I have seen this structure in the visualizations of early years in other fields/disciplines.  The knowledge base is constructed from discrete, unconnected “pockets” of knowledge/research.  The  majority of these clusters are related to natural hazard science (particularly seismology) and the epidemiology of disasters.  However, in the lower right corner, we see a cluster of sociological/geographical disaster references (including WA Anderson’s 1970 work on organizational change following disasters; and RC Bolin’s work on the impacts of the Loma Prieta earthquake).  We also see the appearance of a disasters and development link, with MB Anderson’s 1985 Disasters article co-cited with ASEAN.

None of the 100 works in this slice, however, were heavily cited….no particular work appears (if the dataset is somewhat representative) to have caught  the disaster research world on fire in 1995.

The disciplinary body of knowledge visualized in this slice might best be described as “embryonic.”

Next up..you guessed it….1996.

Time Slicing in CiteSpace

I was asked during my presentation yesterday whether it is possible to break down a time span into increments. CiteSpace can indeed break down any time span into increments of one year or more and visualize the network for each slice in one go…though your processor may start to feel the strain, for example, as it tries to crank out a 20-year time period in 10 two-year slices. You can also choose to simply look at a single year in a one-year slice.
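For readers curious about what time slicing amounts to in practice, here is a rough Python sketch of the idea: source articles are grouped into slices by publication year, and co-citation links are counted separately within each slice. This is an illustration under my own assumptions, not CiteSpace's implementation; the function name `cocitation_by_slice`, the record format, and the toy reference labels are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cocitation_by_slice(records, start, end, slice_len=1):
    """Count co-citation links separately for each time slice.

    records: iterable of (publication_year, list_of_cited_references).
    Returns {(slice_start, slice_end): Counter of reference pairs}.
    """
    # Group the source articles into consecutive slices of slice_len years.
    slices = defaultdict(list)
    for year, refs in records:
        if start <= year <= end:
            slice_start = start + ((year - start) // slice_len) * slice_len
            slices[slice_start].append(set(refs))

    networks = {}
    for slice_start in sorted(slices):
        links = Counter()
        # Two references are co-cited when one source article cites both.
        for refs in slices[slice_start]:
            links.update(combinations(sorted(refs), 2))
        slice_end = min(slice_start + slice_len - 1, end)
        networks[(slice_start, slice_end)] = links
    return networks


# Hypothetical toy records: (year, cited references).
records = [
    (1995, ["ref_A", "ref_B"]),
    (1995, ["ref_A", "ref_B", "ref_C"]),
    (1996, ["ref_C", "ref_D"]),
]
for span, links in cocitation_by_slice(records, 1995, 1996).items():
    print(span, dict(links))
```

Passing `slice_len=2` would instead merge both toy years into a single two-year slice, which is the trade-off described above: fewer, coarser networks in exchange for less work per run.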

A couple of Document-Term network examples from my presentation data that I generated last night using the latest version of CiteSpace, which came out just earlier this month:

1995-2000 75 Most Cited-Occurring

1995-2000 Top 75 Cited Documents/Terms (noun phrase length 2-4 words)

2012-Top 75 Most-Cited-Occurring

2012 Top 75 Documents-Terms (phrase length 2-4)

Power Point Presentation Added

Slide1

Slide7

Slide16

Slide21

The presentation that I had originally planned for the Natural Hazards Workshop was quite a bit longer and more detailed than I actually used based on the time limits given and the strong discouragement repeatedly given concerning use of  PP slides at the Workshop.  Some of these slides were eventually added to a supplemental handout, and others were dropped.

The full version with all slides has now been given a couple of pages of its own. You can see them here.

Natural Hazards Workshop Presentation-New Visualizations

I am in the middle of final preparations to leave for the Natural Hazards Workshop in Colorado tomorrow, where I will be making my first post-Master’s presentation of my research work.  I will be part of a “New Researchers” session on Monday.  I am a little nervous.

For the presentation I have created a new set of visualizations of Disaster Studies and Sciences, as well as prepared a supplemental handout, as 8 minutes is not a lot of time to present a topic that most will never have heard of before.   Power Point presentations are discouraged at the Workshop, but they have made an exception for me because of the inherently visual nature of the work.

I have updated the thesis dataset to include articles from all of 2012 and some of 2013. Here are some of the new images I will be showing:

Author Co-Citation Network, 1994-2013

ACACloseup2

Detail of the ACA Network

DCA-Term Complete

Document-Term/Keyword Network, 1994-2013

JCA750-2

Journal/Source Co-citation Network, 1994-2013

ACA-Term 3000

A 3000 node Author-Term/Keyword network….looks remarkably like a galactic nebula, or an aerial photo of a city at night…and can be interpreted similarly…brighter areas indicate areas of greater activity.

JCA750  copy

The general organization of the discipline remains basically the same as observed in the thesis visualizations….

A Preliminary Look at the Co-citation Network from the 15,000+ Article Dataset


Initial Author Co-citation Analysis (ACA) network visualization of 2000+ nodes and 13000+ links from my dataset of 15000+ articles, reviews, etc. Data cleaning is still a long way from completion, so this is just an initial look.

Here are some images of various visualizations of the entire, partially standardized set of 15,000+ records in my database. I have not yet removed articles that may not be particularly relevant to Disaster Studies and Sciences, but I believe the effects on the visualizations are less direct than you may think. Such articles do not completely change the visualizations (they cannot “create” co-citation relationships that do not exist), but they can result in the references of some fields that are heavily represented in the dataset (e.g. Ecology; Health/Medicine/Psychology) “drowning out” the references of papers from fields that are less numerous in the dataset (e.g. Economics; Business). This means co-citation relationships within the references of the less numerous fields may not appear, depending on how many nodes and links are shown in a visualization.
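The “drowning out” effect can be shown with a small Python sketch. It assumes, purely for illustration, that a visualization keeps only the N most cited references as nodes; the function name `visible_links` and the toy reference labels are hypothetical, and real tools apply more elaborate selection criteria.

```python
from collections import Counter
from itertools import combinations


def visible_links(reference_lists, top_n):
    """Co-citation links among the top_n most cited references only."""
    # Citation frequency of every reference across all source articles.
    freq = Counter(ref for refs in reference_lists for ref in set(refs))
    shown = {ref for ref, _ in freq.most_common(top_n)}
    links = Counter()
    for refs in reference_lists:
        # Only pairs where both references survive the cutoff are drawn.
        links.update(combinations(sorted(set(refs) & shown), 2))
    return links


# Hypothetical toy dataset: five articles from a heavily represented
# field and one article from a less numerous field.
articles = [["eco_1", "eco_2"]] * 5 + [["econ_1", "econ_2"]]

print(dict(visible_links(articles, top_n=2)))
print(dict(visible_links(articles, top_n=4)))
```

With the smaller cutoff the economics pair never appears, even though the co-citation relationship exists in the data; raising the number of nodes shown brings it back.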

So here are the images, with some notations to indicate interesting features:


Same ACA Visualization as above, but I have added groupings by disciplinary/subject areas


This visualization shows the country of affiliation for the authors in the 15,000 records of the dataset.


This is the network visualization of the Subject (SC) and WoS Categories of the source articles in the dataset that provides a rough glimpse (taking into account limitations of the records) into the various disciplines/subject areas involved in the disaster study arena. The nodes are labeled with both the SC and WoS Category, where available. This does result in what may appear to be duplicates.


Close-up of the central cluster within the visualization shown above.

Document Co-citation Network (DCA).

Core area of Document Co-citation  Analysis (DCA) network visualization.


Area of the DCA visualization slightly above the core area.  This is the second largest cluster.  It is interesting to note the link between Disasters by Design and the Psychology/Psychiatry articles.


More of the DCA visualization.


Relationship of core area to the second largest cluster.

Journal Co-citation Analysis (JCA) network visualization of journal and book references.

When Scholars Attack…

Luckily, scholarly and scientific dislikes rarely result in bodily assaults. Intellectual assaults, on the other hand…..

A couple of weeks ago I was doing some searching in Google Scholar and by chance came across a paper with an interesting title and link to the  full-text pdf.  Being the inquisitive sort,  I decided to take a look at the 2008 paper, written by a not insignificant scholar within his discipline- a critique of some highly cited  papers on a  set of concepts that have become increasingly central within the theoretical literature of  Disaster Studies and Sciences.  The paper is unpublished but can be found on a page of the author’s website (he even encourages responses if anyone feels he has made any factual errors).  For a variety of reasons I will not specifically identify the author or paper…those with the curiosity and the time should be able to find it without too much difficulty.

To call the paper a “critique” is like calling the Inquisition an “investigation”. In fact, the paper can very well be described as a scholarly inquisition of four highly-cited academic papers, done with all of the energy and ferocity of a hungry lion chasing down a limping gazelle. Words and phrases such as “conflict of interest”, “unscientific approaches and analyses”, and “sentences that demonstrate verbiage more than useful commentary” are not frequently encountered in scholarly reviews of other scholars’ work. It is also rare for scholarly works to be criticized almost line-by-line. As one of the papers placed on the chopping block is a bibliometric analysis by Janssen, Schoon, and Borner (2006), I take particular interest (Borner is a key figure in bibliometrics and the development of Knowledge Domain Visualization, and has collaborated with Dr. Chen, whose work inspired my own).

Since my undergraduate minor was philosophy, I am quite used to very detailed critiques of arguments and ideas, and to strong debates, even when one actually agrees with the particular position being taken. The purpose is ultimately (in theory) to find the weaknesses in an argument which must be addressed, and which, if addressed satisfactorily, will make the argument stronger. If the weakness cannot be addressed satisfactorily, then the position must be abandoned. In reality, one learns that the nature and manner of criticism often reveals much about the position of the person offering the criticism. Some of the possibilities:

A) The critic comes from the same school of thought and agrees with the argument.

B)  The critic comes from the same school of thought but disagrees with the argument.

C) The critic comes from another school of thought not  fundamentally incompatible with the school of thought making the argument.

D) The critic comes from another school of thought fundamentally incompatible/opposed to the school of thought making the argument.

E) The critic personally has something to gain/lose from either supporting or opposing the argument, or personally likes/dislikes/despises the proponent of the argument.

If you combine motivations D and E, you now have a potential for manners and styles of criticism that are ferocious, unrelenting, and even nasty. Sometimes the disagreements even find their way into the public discourse, such as Noam Chomsky’s repeated criticisms of B.F. Skinner and behaviorism, which included a review of Skinner’s work in The New York Review of Books.

So of course, the paper in question’s disdainful tone made me wonder if there isn’t an interesting story explaining the genesis of the paper. There probably is. There always is.

The simple fact is….and I hope this doesn’t come as a shock to anyone….please sit down if you still believe in Santa Claus, the Easter Bunny, and the wholly noble pursuit of knowledge……every scientist and scholar (and I am not excluded either) has a potentially serious, and possibly overlooked, conflict of interest: their own ego, needs, and desires. Scientists and scholars, particularly leading figures in fields, disciplines, and schools of thought, or those wishing to become leading figures, have deep, vested interests in being mostly right. To admit, or to be shown, to be wrong is to risk the loss of power, prestige, influence, income, etc. As Kuhn noted in his model of scientific progress, The Structure of Scientific Revolutions, the nature of natural science is generally conservative and is loath to enter a revolutionary phase until it has no other choice. Even at that point, many who hold to the status quo may still cling to the dying paradigm, for it is likely that no matter what they do, they will not be a part of the new scientific power structure. Some will go down with the ship while others secure an honored place in history. Darwin is remembered as a scientific and intellectual giant; Lamarck is remembered as a bit of a joke, and as the guy who “got it wrong.” His theory is frequently presented with the mandatory giraffe illustration (see below) to simplify his theory and maximize his “wrongness.”


This is how the history of science remembers the French biologist Lamarck. Erasmus Darwin, Charles Darwin’s grandfather, was a contemporary of Lamarck and independently developed a very similar explanation for the evolution of species.

In the history of philosophy, there is only one philosopher known to have concluded that he “got it wrong”: Ludwig Wittgenstein.  In his lifetime he published only one book, but was working on a second at the time of his death, which was translated and published two years later.  The first presents one view of language and its relation to the world.  The second contends that his view of language in the first book is wrong, and a new view is offered.  This also leads to Wittgenstein’s distinction as the only philosopher known to have fathered two competing schools of thought, and he still holds a place as one of the more significant philosophers of the 20th Century.


Ludwig Josef Johann Wittgenstein (1889-1951). One of my favorite philosophers. His only philosophical treatise published during his lifetime (in 1921) is presented as a series of logical propositions that begins with “1* The world is all that is the case.” Sixty-nine pages later it concludes with the final proposition: “7 What we cannot speak about we must pass over in silence.” These lines are somewhat famous in the world of philosophy.

Off the top of my head, I cannot think of any similar examples from other disciplines.  Many people know of Einstein’s famous quote, “God does not play dice with the universe.” What many may not know is the context of the quote, which reflected his negative view of quantum theory, which he never fully accepted, even when it was largely accepted by the rest of physics.  But Einstein got it wrong- God does play dice with the universe.  More importantly, some have expressed the opinion that had Einstein embraced quantum theory, he might have been the one theoretical physicist capable of finding an answer to the elusive problem of a unified field theory.

But the problem of personal motivation takes on importance for scholars and scientists for another reason: power, manipulation, and the possibility for abuse.  This I learned from my time in psychology. In clinical psychology/psychiatry/psychotherapy, there is a tradition going back to at least Freud, that those who wish to be therapists must themselves undergo therapy.  In part  this is to create awareness of one’s own personal issues that can impair clinical judgements, and even harm clients, whether intentionally or accidentally.  It is one of the only fields I know of that requires demonstrating such self-awareness….how many fields require the practitioner to ask  following a bad outcome, “Did I in some way want this to happen, and in some way contribute to it happening?”  Yet many fields and professions give power that can lead to manipulation and abuse in ways great and small.  For the scientist and scholar, personal motivations and agendas can easily cloak themselves behind the supposedly unbiased, objective, pursuit of knowledge.  Rationalists believed that the mind is the ultimate master of emotion.  Science is usually imagined as a rational endeavor.  The reality is that the mind all too easily becomes the blind servant to emotion (Schopenhauer referred to it as “The Will”, which is  insatiable and unending in its wants, and it uses reason as a tool to achieve its wants).  This is true regardless of whether you are a ditch digger or a Nobel Prize-winning physicist .  If unrecognized, it can lead to questionable scholarship and science.

This brings me back to the critique in question. The author claims that his concerns are purely scientific. Are they? Besides the fact that feeling the need to point out the “scientific” nature of one’s own scathing critique of other scholars raises a red flag (if one is a scientist or a scholar, there really doesn’t seem to be a need to self-proclaim one’s work as being scientific or scholarly…unless one is actually saying something like “I am a real scientist, and I am far superior to these other bozos who only think they are scientists, and who are being cited much more frequently than my own work”), are there any other indications that the paper is less than objective?

Perhaps we should look at the truth-value, or accuracy, of the criticisms offered?  This assumes that perfectly truthful statements cannot be used for self-serving purposes.  That they can indeed be used for such purposes is exactly the problem.  It is perhaps better to ask if the nature and manner of criticisms are legitimate, fair, customary, and/or consistent.  Is there evidence of such failings?  I believe there is.  It is not customary in most fields to specifically single out four works (two by the same author) for microscopic scrutiny, and in a tone that borders on contemptuous.  Even if one of the legitimate purposes is to support a claim that citation totals cannot be used to indicate the quality or importance of a work, the method employed cannot support such a claim, as it is selective not systematic.  On the surface this would appear to contradict the stated scientific nature of the paper’s criticism.  The method and manner of critique effectively implies,  “I have shown that these four works are worthless crap and you shouldn’t pay attention to them.  But the works of the other authors I cite in this paper are good and I have no criticisms of them.”  Thus, the critique exceeds not only what is customary, but to discredit four works while implying they are the only four worth discrediting, also exceeds what would seem to be fair.

Even more importantly, can the paper itself meet the very standard of scholarship it applies to the works critiqued?  If it cannot, then the standard applied is perhaps excessive, or it is simply the case that the author’s very own paper is of equally poor quality as the works criticized.  In the paper’s criticisms, much is made of the authors’ lack of knowledge of the literature of the field, documented with numerous examples.  The author himself, however, fails to cite a single author or paper on the topic of citation analysis when criticizing the 2006 Janssen et al. paper, or presenting his own thoughts regarding the use of citation analysis.  In 2008 there was a large and substantial literature in existence on the subject that had been developing for nearly 40 years.  Some specific criticisms offered also indicate a lack of awareness of not only the literature, but of the very nature of the method used in the Janssen paper.   The author is not only doing  what he accuses the other scholars of doing, he is actually doing it to a greater degree by completely ignoring the body of work within another discipline.

Thus, there seems to be more going on in this paper than what is claimed.  What exactly is going on really doesn’t matter…the author might simply have been having a really bad month….maybe he wanted to call attention to authors and works more closely aligned to his views that he felt were being neglected (though the very history of science is filled with good papers and authors that ultimately get lost within the immense volume of work constantly produced, or they fail to get the credit they perhaps “deserve” for ideas that another author becomes more widely known for).

The fact that there is likely a very loudly expressed unspoken agenda to this paper is what matters. I consider this a form of deception.  It is dishonest.

I hope there is a lesson in this. Socrates’ words from long ago apply just as much, if not more, to scholars and scientists as they do to everyone else: “Know thyself.” We must try to be clear and careful about the masters, internal and external, we serve in the name of science and knowledge.

If nothing else, it should serve as a warning to scholars in this digital age to perhaps think twice before putting unpublished papers on your website.  You never know when some upstart researcher, motivated by an underlying need to confront arrogant authority figures as well as by the need to establish his/her own reputation,  will discover something you might have forgotten ever writing, and turn it into the subject of a blog posting.