A world map of esophagus cancer research: a critical accounting

Background Esophageal cancer (EC) is one of the deadliest cancers worldwide. The contemporary strong increase of the adenocarcinomas in Western countries and the high mortality rates require the intensification of prospective multinational studies. Methods Therefore, this global health issue has been chosen for the bibliometric review of the global publication output. As source for meta and citation data, the Web of Science has been used and Density Equalizing Maps were applied for visualization. Results 17,387 articles on EC could be identified. The years with publication and citation maxima correspond to the appearance of the most prolific articles. China is the most publishing country, followed by Japan and the USA. Germany and the UK ranked 4th and 5th. The analysis of the ratios articles and socio-economic parameters emphasizes the leading position of the Scandinavian countries and Japan. Here, the high-income countries come out on top. The high incidence regions are mainly represented by Chinese and Japanese research. The association of the publication output and the overall research funding could be shown. Conclusions A strengthened international network increasingly consisting of the scientifically best positioned countries as well as more of the high incidence countries worldwide is mandatory for future research. The findings deliver scientists, clinicians and decision makers backgrounds for future decisions all over the world.

Asia on the other hand, the rate of ESCC increased considerably [8].
The major risk factors are remaining unclear. Even so, smoking, alcohol abuse, deficient or extremely salty diet [10], obesity and Barrett's esophagus or GERD can be declared as the main risk factors for the development of EC [1,2]. In some populations, the interaction of poverty or low socioeconomic status [11], nutritionally and social habits seem to increase the risk. Further risk factors of EC are the consumption of very hot liquids [12], the lack of fruits and vegetables [13], the drinking of mate tea through very hot metal tubes, the eating of residues form opium pipes, or chewing of betel nuts [14], to name just a few [1]. The reduced risk by the consumption of acetylsalicylic acid [15] may be explained by the rates of stem cell divisions. Here, the negative effects of acetylsalicylic acid intake must be mentioned as well [16].
The familial accumulation of EC has been shown in high incidence regions such as China [17]. If this is in fact due to a hereditary context is still not resolved until now.
Regarding the estimation of risk factors a differentiation between the ESCC and the EAC is absolutely necessary. Not every risk factor increases the likelihood of getting both types. Alcohol and hot liquids for instance are not linked to EAC, and the reflux diseases do not increase ESCC risk [2].
In contrast to the epidemiologic research on other cancer types [18,19], there are still many unknown circumstances or uncertainties regarding the epidemiology of EC, e.g. the association between EC and Helicobacter pylori is still unclear [20].
The increasing incidence rates all around the world has not reached the zenith yet. The high mortality rate manifests the need to strengthen the research efforts to meliorate the prevention and the treatment of EC. The importance for better monitoring systems has been highlighted in other studies as well [21].
Therefore, the working group took the opportunity to evaluate the research output to depict a new map of the scientific approaches all over the world. The focus has been laid on the international networks, and the development of main topics. In this comprehensive survey the question of what are the most important influences of the research history has been addressed and answered as well as an outlook was given. This is not only important for the scientist but also for the planners, fund raisers and decision makers worldwide.

Methodological platform and data source
This study on the global research output on EC is part of the scientometric platform NewQIS (New Quality and Quantity Indices in Science) that has been generated by Groneberg-Kloft in 2009 [22] to provide widespread information on a multitude of biomedical issues.
As data base, the Web of Science of Clarivate Analytics has been used. Its Core Collection supplies the interested and scientific users with publications and the respective bibliometric data. Additionally, it provides the associated citation numbers, so that the analysis of semi-qualitative aspects of the publication output can be carried out. The socio-economic data has been collected from the World Factbook [23].

Search strategy
The search term has been generated according to the principle of an extensive and representative literature study using the Entry Terms of the MesH Database (Medical Subject Headings) of the US National Library of Medicine (National Institutes of Health) [24]. Since these Entry Terms functions as a thesaurus for cataloguing the diseases, the completeness of the search query is guaranteed. The used search term was: "?esophag* AND (*cancer* OR *carcinom* OR *neoplasm*)". The asterisks replace an indefinite number of characters and has been applied to find different variations of the individual terms. The title search function was used. To include only the original articles in the investigation, the corresponding filter function of WoS was applied.

Data analysis
The resulting data pool has been integrated in a data base according to its bibliometric information and served as analysis basis. In the focus stood the development of the publication performance regarding EC and its global distribution. The influences over time and the impact of socio-economic features were evaluated too. Additionally, the citation analysis allowed qualitative insights of the scientific efforts and their backdrops. Here, the h-index in a modified version (applied to the performance of countries) and the average citation rate were calculated and analyzed. Usually, the h-index is used as key indicator for the global reputation of an author in the scientific community. It is calculated by the number of publication that at least received the same number of citations each [25]. For further geographical evaluations, the number of articles was put into relation to the gross domestic product (GDP) and the population size of each country. The association between the number of articles and expenditures for research funding, respectively epidemiological conditions set another focus to this study by using linear regression [26,27]. Furthermore, an analysis of the research areas and author's keywords has been carried out that permit statements on the main issues, chronologically and geographically.

Illustration of findings
In part, the findings have been displayed by means of density equalizing map projections (DEMP). This technique represents a method to grab complex global circumstances at one glance by distorting the country sizes. An anamorphic map is the result of the compensation of the osmotic density gradient of the evaluated parameter (e.g. publication number, citation number) [28].

Chronological analyses
All in all, the bibliometric data of 17,387 original articles (n) was retrieved. Out of this pool, 16,230 articles were written in English (93.3%). The second most used language was French with 2.47% (n = 430), followed by German (n = 395) and Russian (n = 172).
The number of articles on EC were initially remaining below n = 10 until the end of the Second World War 1945. Afterwards, a moderate but steady increase could be observed until 1980.

Geographical analyses
The information of the country of origin was given in the field tags of 16,686 articles (95.9%). This pool was the source of the geographical analyses and represents the timespan between 1973 and 2017. Previous to this time frame, the geographical data is non-existent, respectively very limited.
The evaluation of absolute numbers (Fig. 1a) shows that China is the most publishing country with n = 4448 articles on EC, followed by Japan (n = 3828) and the USA (n = 3125). Following behind with some distance, Germany and the UK reached n = 999 and n = 952, respectively.
The chronological development of the relative proportion of the most publishing countries is shown in Fig. 1b The USA received the most citations (c = 103,833), followed by Japan (c = 81,099), China (c = 61,726), and also at some distance by the UK (c = 27,607) and Germany (c = 26,095) (Fig. 2a).
To give consideration to the differences regarding the number of inhabitants and the economic strength (threshold ≥ 30 articles), the ratios of the number of articles/population in mill. (R POP ) and the number of articles/GDP in 1000 bn USD (R GDP ) were analyzed (Fig. 3). It is notable that High-income (HI) countries were ranked higher in principle regarding both parameters, especially Japan (R POP = 30.21, R GDP = 776.15), Netherlands (R POP = 29.97, R GDP = 588.98), and Sweden (R POP = 29.55, R GDP = 586.23). In this analysis, the countries with a lower income level are set back. Especially China-ranking first regarding the absolute evaluation numbers-has been thrown back.

Influence of research funding and incidence rates
Looking at the association between the publication output of the countries and the overall expenditures on Research and Development in mill. US-Dollars (R&D), [26] respectively the countries' epidemiological burdens [27] (OECD countries), differences regarding the adequacy of individual research endeavors reveals. For the linear regression of the numbers of articles and the R&D expenditures a coefficient of determination (r 2 ) = 0.79 can be calculated. The correlation is strongly significant (Spearman p < 0.0001), while the correlation between the number of articles and the incidence crude rate (ICR) with r 2 = 0.44 and p = 0.0122 shows only a weak significant association. Therefore, the influence of the funding seems to be stronger that the disease burden or the associated expenses.
Despite the significant correlation of the funding expenditures and the publication output, the individual countries showed a very different publication behavior (Fig. 4). To analyze the deviations from the regression line, the residuals were calculated and compared (Fig. 5). Striking is the negative position of the USA (Fig. 4a). Analyzing the publication output regarding the epidemiological EC burden of the OECD countries measured by the ICR, the picture is different (Fig. 4b). Here, UK (ICR = 14) and Netherlands (ICR = 12.5) with a very high rate published only little, while the USA a higher contribution made referring to the values of 2012 [27]. Also, Ireland (ICR = 9.3) and Belgium (ICR = 9) following regarding their ICR were participating very little EC articles. With even higher ICRs, China (ICR = 16.4) and Japan (ICR = 15.6) worked more on EC, although on slightly negative positions.

Collaboration analyses
The international collaboration network is wide ranging (Fig. 6), but the highest number of cooperation articles were worked out under participation of China and USA with 513 common articles. In addition, the cooperation between Japan and USA with 129 collaboration articles is worth mentioning. China produced 783 of its overall 4448 articles on OC together with another country (17.60%). In contrast, Japan wrote only 7% (n = 268 of 3828) in international cooperation. With 33.73% the USA published more than one-third of its overall articles together with other countries (n = 1054 of 3125), whereof nearly half has been worked out with China (48.87%).

Research areas
The Looking at the alteration in the relative distribution of the ten most assigned subject areas in 5-year intervals from 1968 until 2017 (Fig. 7a), it can be stated that Oncology und Gastroenterology & Hepatology relatively increased from 10.16 to 42.13%, while the proportion of Surgery decreased from 28.34 to 14.95%. Only since 1993, Biochemistry & Molecular Biology appeared among the ten best with only 0.13%, but increased on a moderate level until 2017 (3.21%). The comparison of the most publishing countries (Fig. 7b) revealed differences regarding the proportion of Surgery among the ten most assigned subject areas. In China, this area is clearly underrepresented (3.27%), whereas here Biochemistry & Molecular Biology is more noticeable than in the other countries.

Methodical strength and limitations
The enormous amount of scientific publications and the increasing range of journals makes it absolutely necessary to separate the wheat from the chaff. Not only for the individual scientists, but also for planners and funders it is getting more and more difficult to assess the available literature as well as the scientific environment. A long-sighted and globally adapted research is in demand, especially regarding medical issues. So, the NewQIS platform select important medical topics to carry out comprehensive bibliometric reviews. All bibliometric studies depend on the representative status of the database. The WoS is certainly one of the most prolific literature and citation data bases worldwide, but nevertheless, some disadvantages have to be discussed. The English-bias has already been shown in a variety of articles [29,30]. Even so, possible faulty citing [31,32] leads to methodological limitation, since the citation numbers are underlying all citation parameters. Therefore, the interplay of the applied citation parameters seems to provide the highest benefit. Likewise, the generation of the applied search term is controversial. The more extensive the data base, the more faultily integrated entries can disturb its quality, and the resulting validity can be brought into question. On the other hand, a data base should be as complete as possible because otherwise it harbours the risk of neglecting important data. Hence, it is a tightrope walk to elaborate a scientifically adequate term and the appropriate strategy.

Discussion of results
The development of the number of publications follows a known pattern. It has been proofed earlier that the increase of articles number show an exponential progression (Table 1) [33]. Nevertheless, the visible maxima of average citation numbers per year illustrate an association with the most prolific articles (Table 2) [34][35][36][37]. Additionally, the incidence of EAC in the Western countries has increased tremendously in the last decades [8], so that the significant increase of publications on EC in the last years seems reasonable [38]. The high incidence and prevalence rates of the so called Esophageal Cancer Belt leading from Iran, Central Asia to North China cause the high participation of China ranking first. Nearly half of the new OC cases worldwide occurred in China in 2012 [6]. Here, OC ranked 4th regarding the incidence rate after lung, stomach and liver cancer [38]  and also regarding the mortality rate [39]. After all, Iran still ranks 15th with 229 articles. Other belt-countries can be neglected. The publication output regarding the ratios of the socio-economic factors mat the findings of Man et al. which emphasize the dominancy e.g. of Scandinavian countries, certainly because of the National Cancer Registries that are delivering the clinical data [40][41][42]. The main difference of both study results is the position of Japan. Despite the low incidence and mortality rate in Japan [43] the research effort on EC is remarkable. Man et al. [44] positioned Japan in the lower part of the ranking regarding the R POP value. In our review, Japan ranks second regarding the absolute numbers of the publication output, and ranks first regarding both socio-economic features (R POP , R GDP ) ( Table 2). Regarding the influence of the Total Research Spending, Man et al. [44] found Japan equally high ranked as in our study. This was confirmed by our findings that show that Japan position regarding their expenditures on R&D was extremely positive. It showed a slightly discarded position in comparison to their high ICR. Japan already established several institutions, working groups, and studies with the focus on EC. The Japan Esophageal Society for example, publishing the Esophagus Journal, generated the Comprehensive registry of EC. Additionally, the Japan Esophageal Cancer Group has been created as one of the first two Groups of the Japan Clinical Oncology Group (JCOG) in 1978. However, with only 7% international collaborations, for Japan can be calculated the lowest cooperation percentage of all HI-countries publishing on EC.
Despite their ranking among the best five, the publication performance on EC of the USA and the UK is relatively low compared with other studies [45][46][47]. An analysis of research funding on cancer burden measured by estimated medical costs and years life lost (YLL) stated the underfunding of EC research in both countries [48]. The correlation analysis of this study regarding the connection of the article numbers and the ICR showed a positive participation of US-American scientists based on the Globocan numbers of 2012 [27]. However, there has recently been an extreme accumulation of  number of articles on EC in comparison to their overall R&D expenditures last among the OECD countries.
After the findings of this study, the UK positioned itself with a high ICR of 14 and n = 952 article behind. However, the UK works on EC seems to be in line when looking at their overall R&D expenditures. Also, Carter et al.
[49] found that the under-funded research on EC, shifted somewhat towards an improved funding from 2000 to 2010. This may be due to the largely nationalized medical system, while in the US only a scarce state health system is established. This led to more prevention measures, more check-ups and more available medical data in UK [49].
Comparing the publication output of other European Countries, a back lying can be stated when looking at the high ICR in 2012. Here, the Netherland, Ireland, Belgium has to be highlighted. Especially the most affected countries in Europa are publishing comparatively little on EC. Looking at the highly affected countries worldwide with ICR over 15, only China and Japan show a justified research endeavor on EC. Other highly affected countries-especially developing countries-did not play a role in the research landscape of EC. Therefore, the targeted promotion of established science systems and their scientists under the current conditions is highly required and the most affected countries without financial recourses for adequate research efforts have to be supported and included in the global scientific network. Here, a rethink must take place, which leads to a redistribution of resources.

Conclusions
The unclear epidemiological correlations and the involved bad basis for evaluation complicates the possibility to develop successful prevention measures for EC. Additionally, the extremely geographically and histologically varying incidence rates make the generation of solution approaches very difficult. Therefore, the need for further research seems essential. The increase of new cases in Western countries and the under-funding of EC research shows a significant demand for decision makers, funders and scientists. The low collaboration preparedness of the most acting countries and the weak participation of highly affected countries elucidate the support of countries with an elaborated scientific basis. The challenge in this respect is to optimize research and research funding in accordance with the cancer burden. The establishing of functional health systems can improve the access to reliable data so that adequate strategies can be developed. There is a large demand for multidisciplinary approaches to fulfil the complex scientific issues related to EC.