Can the Journal Citation Index truly evaluate the value of scientific research?

In this blog post, we examine the problems with the current system of evaluating scientific research, which centers on the Journal Citation Index, and consider whether it accurately reflects the true value of research.

 

In the scientific community, researchers’ qualifications and the importance of their papers are commonly evaluated using the “journal impact factor,” referred to here also as the journal citation index. Criticism of this practice, however, has been growing. Last May, more than a hundred science and technology researchers gathered in San Francisco and announced the “San Francisco Declaration on Research Assessment.” Tens of thousands of researchers have since joined the movement. They point out several critical flaws in evaluation based on the journal impact factor. In this article, I will examine the specific criticisms and argue that the journal impact factor cannot serve as a yardstick for evaluating scientific research.
The journal impact factor is a numerical measure of a journal’s influence. It was originally designed as a tool for librarians, who needed to compare the relative importance of candidate journals when deciding which ones to subscribe to. The calculation is simple. Suppose a journal called “Science and Technology” published a total of 20 papers over the past two years, and those papers were cited a total of 200 times. Its impact factor would then be 200/20, or 10. In other words, an impact factor of 10 means that papers published in “Science and Technology” over the past two years were cited 10 times each on average. (Strictly speaking, the standard calculation counts only the citations received in a single year by papers from the two preceding years, but the idea is the same.) Thus, the journal impact factor is a numerical value that quantitatively expresses the importance of a journal.
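The arithmetic above can be written as a short Python sketch. The function name `impact_factor` and its parameters are my own illustrative choices, and the figures are the hypothetical ones from the “Science and Technology” example, not real data.

```python
def impact_factor(total_citations: int, papers_published: int) -> float:
    """Average number of citations per paper over the two-year window."""
    if papers_published <= 0:
        raise ValueError("papers_published must be a positive number")
    return total_citations / papers_published


# Hypothetical journal from the example: 20 papers cited 200 times in total.
print(impact_factor(total_citations=200, papers_published=20))  # 10.0
```

Note that the result is a single number for the whole journal. Nothing in the calculation records how the 200 citations are spread across the 20 papers.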
However, the problem is that the journal impact factor is often applied directly to evaluate individual papers: a journal’s impact factor becomes the influence score of every paper published in it. For example, suppose there are two journals, ‘Science and Technology’ with an impact factor of 10 and ‘Monthly Engineering’ with an impact factor of 90. All papers published in ‘Science and Technology’ are then scored at 10 points, while all papers in ‘Monthly Engineering’ are scored at 90 points. These scores carry over to the authors as well. Researcher A, who published a paper in ‘Science and Technology’, earns 10 points, while Researcher B, who published in ‘Monthly Engineering’, earns 90 points. Regardless of their qualifications as researchers or the originality of their papers, an 80-point gap emerges between A and B.
The first problem with the journal impact factor is the “statistical trap.” Because the impact factor is an average, citation counts can vary widely among papers in the same journal, and the influence of an individual paper does not simply track the impact factor of the journal that published it. For example, although A’s paper appeared in a journal with an impact factor of 10, the paper itself may have been cited dozens or even hundreds of times; the journal’s average may have fallen to 10 simply because the other papers in ‘Science and Technology’ were rarely cited. Conversely, B’s paper may have been cited only once or twice and still benefit from a spillover effect, with the impact factor of ‘Monthly Engineering’ reaching 90 because other papers in it were cited very frequently. In such a situation, the journal’s impact factor is meaningless for comparing the research of A and B. It is more reasonable to compare the citation counts of the two papers directly and conclude that A’s paper has had much greater impact. According to the San Francisco Declaration, roughly 25% of the papers in a journal account for 90% of the total citations. In other words, if the journal impact factor is applied directly to individual papers, some papers will inevitably be disadvantaged while others benefit unfairly.
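To make the statistical trap concrete, here is a small Python sketch. Every citation count in it is invented for illustration; the counts are chosen only so that the two hypothetical journals average 10 and 90, as in the example. The helper `top_share` is likewise my own name, not an established metric.

```python
from statistics import mean, median

# Hypothetical citation counts, invented purely for illustration.
# 'Science and Technology': 20 papers, 200 citations in total.
# The first entry stands for Researcher A's paper.
science_and_technology = [150] + [3] * 12 + [2] * 7

# 'Monthly Engineering': 10 papers, 900 citations in total.
# The first entry stands for Researcher B's paper.
monthly_engineering = [2, 20, 30, 40, 50, 60, 98, 100, 200, 300]


def top_share(citations, fraction=0.25):
    """Share of all citations earned by the most-cited `fraction` of papers."""
    ranked = sorted(citations, reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


for name, counts in [("Science and Technology", science_and_technology),
                     ("Monthly Engineering", monthly_engineering)]:
    print(name)
    print("  impact factor (mean):", mean(counts))
    print("  median citations:   ", median(counts))
    print("  top 25% share:      ", round(top_share(counts), 2))
```

In this made-up data the journals average 10 and 90 citations per paper, yet A’s paper has 150 citations and B’s has 2. Comparing the mean with the median, or looking at how much of the total the top quarter of papers accounts for, shows how little the average says about any one paper.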
Second, research evaluation based on the journal impact factor fails to reflect the characteristics of each academic discipline. In fields such as medicine or biology, when a new theory is proposed, numerous experiments are conducted to verify it. Because a single paper is often followed by clinical trials and other follow-up studies that cite it, the impact factors of biology and medical journals tend to be higher than those in other natural sciences or engineering fields. In pure mathematics, by contrast, a single paper is usually self-contained, and follow-up research or experiments are seldom needed. Mathematics papers therefore tend to receive relatively few citations, and the impact factors of pure mathematics journals tend to be low. Field size matters too: in highly specialized fields with few researchers, citation counts are relatively low, while journals in large, active fields with many researchers accumulate relatively high counts. Thus, evaluation based on the journal impact factor has the limitation of failing to account for fundamental differences between disciplines.
The third issue is an adverse side effect of the journal impact factor: papers become concentrated in a small number of popular journals. Researchers preparing to submit a paper naturally want their work to appear in world-renowned journals such as ‘Cell’, ‘Nature’, and ‘Science’, because these journals have some of the highest impact factors. If this concentration becomes excessive, getting a paper published may come to matter more to researchers than the research itself. A culture that recognizes only papers published in top-tier journals has quietly become a global phenomenon. If this “luxury brand mentality,” which undervalues the diligent research process and emphasizes only visible results, continues, it could distort the very essence of science. Furthermore, some journals encourage “self-citation,” that is, citation of papers published in their own pages, in an effort to boost their impact factors. Thus, evaluation based on journal impact factors is giving rise to unreasonable and unethical competition.
Of course, journal impact factors are not without their merits. The reason they are widely used is their ability to facilitate quick and convenient evaluation of researchers. The editors of each journal serve as “expert evaluators.” They swiftly identify important and noteworthy research amid the deluge of research outputs. In today’s rapidly changing and expanding scientific community, this is an undeniable benefit.
However, as we have seen, there are critical flaws lurking behind this convenience. The first is the statistical trap: the journal impact factor may differ from the actual number of citations a single paper receives. The second is that the impact factor fails to account for the unique characteristics of individual disciplines. It has been pointed out that in some fields, the number of citations is virtually unrelated to a paper’s influence. The final issue is that this evaluation method fosters wasteful competition within the scientific community, such as by creating a monopoly structure dominated by a few prestigious journals.
To overcome these issues and achieve proper scientific development, the scientific community is currently seeking new evaluation criteria. The simplest approach is to incorporate the citation counts of a researcher’s individual papers into the evaluation, which addresses the statistical trap. Another method is to use an adjusted index that reflects the characteristics of each discipline. Dividing a journal’s impact factor by the average citation count of the top 20% of journals in its field normalizes the figure for the specificities of each discipline. A more fundamental alternative would be to revitalize peer review and develop qualitative evaluation criteria rather than relying on quantitative metrics such as journal impact factors or citation counts. What the scientific community needs now is self-reflection and communication to devise rational evaluation methods.
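The adjusted index mentioned above could be sketched in Python as follows. This is only one reading of the idea: I assume that the “average citation count of the top 20% of journals” means the mean impact factor of those journals, and the function name `adjusted_index`, the rounding rule, and all the figures are hypothetical.

```python
def adjusted_index(journal_if, field_ifs, top_fraction=0.2):
    """Journal impact factor divided by the mean impact factor of the
    top `top_fraction` of journals in the same field (an assumed reading)."""
    if not field_ifs:
        raise ValueError("field_ifs must not be empty")
    ranked = sorted(field_ifs, reverse=True)
    # Use at least one journal as the benchmark, even in a very small field.
    top_n = max(1, round(len(ranked) * top_fraction))
    benchmark = sum(ranked[:top_n]) / top_n
    return journal_if / benchmark


# Hypothetical impact factors for the journals in two fields.
pure_mathematics = [0.5, 0.8, 1.0, 1.2, 2.0]
biology = [5.0, 10.0, 20.0, 30.0, 40.0]

print(adjusted_index(1.0, pure_mathematics))  # 0.5
print(adjusted_index(20.0, biology))          # 0.5
```

In this invented example, a mathematics journal with an impact factor of 1 and a biology journal with an impact factor of 20 receive the same adjusted index, because each is measured against the leading journals of its own field rather than against each other.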

 
