Using association rules mining to explore pattern of Chinese medicinal formulae (prescription) in treating and preventing breast cancer recurrence and metastasis

Background Chinese herbal medicine is increasingly used as a complementary approach for the control of breast cancer recurrence and metastasis. In this paper, we examined the implicit prescription patterns behind Chinese medicinal formulae in order to explore the Chinese medicinal compatibility patterns or rules in the treatment or control of breast cancer recurrence and metastasis.
Methods This study was based on the herbs recorded in the Pharmacopoeia of the People’s Republic of China and on literature from the Chinese Journal Net and the China Master Dissertations Full-text Database (1990 – 2010), which were used to analyze the compatibility rules of the prescriptions. Each Chinese herb was listed according to the selected medicinal formulae, and the added information was organized to establish a database. The frequencies and the association rules of the prescription patterns were analyzed using the SPSS Clementine Data Mining System. An initial statistical analysis was carried out to categorize the herbs according to their medicinal types and dosage, natures, flavors, channel tropism, and functions. Based on the categorization, the frequencies of occurrence were computed.
Results The main prescriptive features of the selected formulae are: (1) warm or cold herbs in the Five Properties category, sweet or bitter herbs in the Five Flavors category, and herbs with affinity to the liver meridian are the most frequently prescribed in the 96 medicinal formulae; (2) herbs with tonifying and replenishing, blood-activating and stasis-resolving, spleen-strengthening and dampness-resolving, or heat-clearing and detoxicating functions are frequently prescribed; (3) herbs with blood-tonifying, yin-tonifying, spleen-strengthening and dampness-resolving, heat-clearing and detoxicating, and blood-activating and stasis-resolving functions are interrelated and prescribed in combination with qi-tonifying herbs.
Conclusions The results indicate a close relationship between recurrence and metastasis of breast cancer and liver dysfunction. These prescriptions focus on herbs for nourishing the yin-blood and for emolliating and regulating the liver, which seems to be the key element in the treatment process. Meanwhile, the use of qi-tonifying and spleen-strengthening herbs also forms the basis of the prescription patterns.


Background
Breast cancer is one of the most common malignant tumors among women, and its incidence increases every year in both developed and developing countries [1]. Every year, of the 1.2 million women diagnosed with breast cancer worldwide, 500 thousand die of the disease. Along with a sharp increase in life expectancy, expanding urbanization and the adoption of Western lifestyles, the increase in incidence rates is even more obvious in developing countries [2][3][4][5]. In China, the number of cases increased by 38.5% from 2000 to 2005. Compared with the early surveys in the 1990s, breast cancer accounted for the largest increase in mortality rates in 2005 [6].
Today, the standard therapies for breast cancer include surgery, chemotherapy, radiation therapy, and hormonal therapy. However, even when patients receive systemic treatment, there is still a 10% to 30% chance of recurrence and metastasis. Among the patients with local recurrence, 75% to 93% will eventually develop distant metastasis, with an extremely low 5-year survival rate [7,8]. Visceral metastasis is the main reason for treatment failure and the main cause of death. Lung, bone, liver and brain are the most common sites of distant spread of breast cancer [9,10]. Since metastasis is the main reason for cancer treatment failure, management of metastasis is the key factor in determining the prognosis of patients [11].
Recently, the use of natural Chinese herbal medicine with anti-tumor effects has received increasing public attention [12]. In traditional Chinese medicine (TCM), the treatment and prevention of breast cancer recurrence and metastasis take a holistic approach through multi-level, multi-target and multi-channel control. TCM differs from Western medicine, which tends to block a single step in a particular process. In comparison, Chinese medicine adopts an overall therapeutic approach to treat and prevent recurrence and metastasis, to improve the immune system of patients, and to strengthen the body's resistance to diseases. Meanwhile, Chinese medicine also aims at reducing the side effects of radiotherapy and chemotherapy, reversing drug resistance, and improving quality of life and survival for patients. These advantages have gradually made the Chinese medicinal approach to combating breast cancer recurrence and metastasis a research focus of both local and overseas scholars [13,14].
In Chinese medicinal therapy, experienced Chinese medical practitioners prescribe a medicinal formula (a combination of various single herbs) for the treatment of ailments. According to TCM theories, pharmacological and pharmacodynamic relationships exist among herbs, which are referred to as Chinese medicinal compatibility. The compatibility of Chinese herbal medicine has particular rules and patterns. In Chinese medicinal databases, there are over ten thousand medicinal formulae, which contain complicated information. However, a well-established and orderly system for organizing the information of Chinese medicinal formulae does not exist. This implies that a large number of implicit prescription patterns behind the formulae have not been fully disclosed [15,16].
Association rules mining is one of the methods for discovering meaningful associations or correlations between variables in large databases. It identifies frequent item sets in the data and then uses these frequent item sets to form association rules. To select meaningful rules from the set of all possible rules, minimum thresholds on support and confidence are the two important constraints. An association rule has the form LHS⇒RHS, where LHS and RHS are sets of items, and the RHS set is likely to occur whenever the LHS set occurs. One of the applications of association rules mining is to mine association rules in medical record data [17,18]. Since association rules mining is a popular and well-researched method, it can be used to investigate the compatibility patterns of Chinese herbal medicine and to reflect the interdependence and relationships between the variables. It can therefore provide scientific evidence for clinical applications of Chinese medicine, and thereby inform the integration of Chinese medicinal therapy with modern Western medical therapies for better treatment or prevention of breast cancer recurrence and metastasis [19]. The support supp(X) of an item set X is defined as the proportion of transactions in the data set that contain the item set. It is used to evaluate the potential usefulness of the rules. The confidence of a rule is defined as conf(X⇒Y) = supp(X∪Y) / supp(X), which can be interpreted as an estimate of the probability P(Y|X) [20].
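The two measures defined above can be illustrated with a minimal Python sketch. It assumes that each medicinal formula is treated as one transaction, that is, a set of herb names. The toy formulae and the function names `support` and `confidence` are hypothetical and are not taken from the study database.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy data: each formula is one transaction (a set of herb names).
FORMULAE: List[FrozenSet[str]] = [
    frozenset({"Huang Qi", "Bai Zhu", "Fu Ling"}),
    frozenset({"Huang Qi", "Bai Zhu"}),
    frozenset({"Bai Zhu", "Fu Ling"}),
    frozenset({"Huang Qi", "Dang Gui"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Proportion of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    lhs: Iterable[str], rhs: Iterable[str], transactions: List[FrozenSet[str]]
) -> float:
    """Estimate of P(RHS | LHS), computed as supp(LHS ∪ RHS) / supp(LHS)."""
    lhs_support = support(lhs, transactions)
    if lhs_support == 0:
        raise ValueError("LHS never occurs, so confidence is undefined")
    return support(frozenset(lhs) | frozenset(rhs), transactions) / lhs_support


if __name__ == "__main__":
    print(support({"Huang Qi", "Bai Zhu"}, FORMULAE))
    print(confidence({"Huang Qi"}, {"Bai Zhu"}, FORMULAE))
```

In this toy data, Huang Qi appears in three of the four formulae and two of those also contain Bai Zhu, so the rule Huang Qi⇒Bai Zhu has a confidence of two thirds.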

Sources of literature
This study used the herbs recorded in the Pharmacopoeia of the People's Republic of China [21] to investigate the prescription patterns of using Chinese medicine for the treatment and prevention of breast cancer recurrence and metastasis. The sources of literature included the Chinese Journal Net and the China Master Dissertations Full-text Database (1990 - 2010) (Table 1). The name of each herb was used as a keyword to obtain the relevant literature, and only literature that focused on "breast cancer", "advanced stage of breast cancer" and/or "post-operation of breast cancer" was eligible for inclusion. According to the following inclusion and exclusion criteria, a total of 131 papers describing various medicinal formulae for clinical applications were included (96 medicinal formulae with a total of 180 Chinese herbal medicines (herbs); the 180 herbs appeared in the 96 formulae a cumulative total of 1001 times). The terminology used in this article follows the 'WHO International Standard Terminologies on Traditional Medicine in the Western Pacific Region', which documents the common technical terms used in traditional medicine.

Inclusion criteria
Five types of literature were included: literature (1) related to clinical research on using Chinese medicine for the prevention and treatment of breast cancer recurrence and metastasis; (2) related to clinical research on using Chinese medicine for the treatment of advanced stage breast cancer; (3) related to clinical research on using Chinese medicine for the prevention of postoperative breast cancer recurrence and metastasis (especially at stage III or later, when metastasis had occurred); (4) with randomized controlled trials as the study design; and (5) in which the clinical study aimed to prove the efficacy of an experimental group receiving Chinese medicinal treatment over a control group.

Exclusion criteria
Literature meeting any of the following criteria was excluded: (1) small-sample-sized studies with fewer than 20 cases; (2) studies which primarily aimed to treat complications of operations or to reduce the side effects of chemotherapy; (3) studies without investigation into the use of Chinese medicine for the treatment and prevention of breast cancer recurrence and metastasis; (4) studies which provided only the names of formulae without descriptions of the herbal ingredients; (5) duplicate publications reporting the same group of participants; and (6) literature in which the clinical trial received a Jadad score of less than 2.
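The exclusion criteria above can be read as a screening filter over candidate studies. The following Python sketch shows one possible encoding. The `StudyRecord` class and all of its field names are hypothetical, since the article does not describe how the screening was recorded or whether it was automated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StudyRecord:
    # All field names are hypothetical; each mirrors one exclusion criterion.
    title: str
    sample_size: int
    treats_complications_or_side_effects_only: bool
    addresses_recurrence_or_metastasis: bool
    lists_herbal_ingredients: bool
    is_duplicate_report: bool
    jadad_score: int


def exclusion_reasons(record: StudyRecord) -> List[str]:
    """Return every exclusion criterion that the record triggers."""
    reasons: List[str] = []
    if record.sample_size < 20:
        reasons.append("fewer than 20 cases")
    if record.treats_complications_or_side_effects_only:
        reasons.append("targets complications or chemotherapy side effects")
    if not record.addresses_recurrence_or_metastasis:
        reasons.append("does not address recurrence or metastasis")
    if not record.lists_herbal_ingredients:
        reasons.append("formula named without herbal ingredients")
    if record.is_duplicate_report:
        reasons.append("duplicate publication")
    if record.jadad_score < 2:
        reasons.append("Jadad score below 2")
    return reasons


def screen(records: List[StudyRecord]) -> List[StudyRecord]:
    """Keep only the records that trigger no exclusion criterion."""
    return [r for r in records if not exclusion_reasons(r)]
```

Returning the list of reasons, rather than a single yes or no, makes it possible to report why each paper was dropped.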

Statistical analysis
Association rules mining is a popular and well-researched method for discovering interesting relations between variables in large databases [22]. We used the following definitions for item sets and association rules.
An association rule has the form LHS⇒RHS, where LHS and RHS are sets of items and the RHS set is likely to occur whenever the LHS set occurs [23].
Two parameters, support and confidence, are essential in association rules mining. The user sets a minimum support (min-sup) and a minimum confidence (min-conf) as critical values that provide the baselines for discovery, and only the combinations that satisfy both thresholds are considered meaningful rules. Selecting these thresholds is always an issue: if the minimum confidence is set too high, a lot of useful information will be missed. In searching for an effective drug compatibility mode, we found that the central tendency of the association rules was more obvious at a support of 0.1 and a confidence of 0.6 in the two correlation analyses, namely the pairs of couplet herbs and the pairs of herbal functions. Therefore, a minimum support of 0.1 and a minimum confidence of 0.6 were specified in this study.
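As an illustration of how the two thresholds act as filters, the following Python sketch enumerates single-herb rules of the form X⇒Y and keeps only those that reach a minimum support of 0.1 and a minimum confidence of 0.6. It is not the SPSS Clementine procedure used in the study. The function name `mine_pair_rules` and the toy formulae are hypothetical, and support is computed here on the full pair, which is an assumption because mining tools differ on whether support refers to the antecedent alone or to the whole rule.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Tuple

MIN_SUP = 0.1   # minimum support specified in the study
MIN_CONF = 0.6  # minimum confidence specified in the study

Rule = Tuple[str, str, float, float]  # (lhs, rhs, support, confidence)


def mine_pair_rules(
    transactions: List[FrozenSet[str]],
    min_sup: float = MIN_SUP,
    min_conf: float = MIN_CONF,
) -> List[Rule]:
    """Return rules lhs => rhs between single items that pass both thresholds."""
    n = len(transactions)
    item_count: Dict[str, int] = {}
    pair_count: Dict[Tuple[str, str], int] = {}
    for t in transactions:
        for item in t:
            item_count[item] = item_count.get(item, 0) + 1
        # Ordered pairs, so that both directions of a rule are counted.
        for a, b in permutations(sorted(t), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules: List[Rule] = []
    for (lhs, rhs), both in pair_count.items():
        sup = both / n                 # support of the pair {lhs, rhs}
        conf = both / item_count[lhs]  # estimate of P(rhs | lhs)
        if sup >= min_sup and conf >= min_conf:
            rules.append((lhs, rhs, sup, conf))
    # Highest confidence first, then highest support, then alphabetical.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Hypothetical toy formulae, not the study data.
TOY_FORMULAE: List[FrozenSet[str]] = [
    frozenset({"Huang Qi", "Bai Zhu", "Fu Ling"}),
    frozenset({"Huang Qi", "Bai Zhu"}),
    frozenset({"Bai Zhu", "Fu Ling"}),
    frozenset({"Huang Qi", "Dang Gui"}),
    frozenset({"Tai Zi Shen", "Bai Zhu"}),
]

if __name__ == "__main__":
    for lhs, rhs, sup, conf in mine_pair_rules(TOY_FORMULAE):
        print(f"{lhs} => {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Raising `min_sup` or `min_conf` shrinks the rule list, which is the trade-off the threshold selection has to balance.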
Based on the Pharmacopoeia of the People's Republic of China, the herbal ingredients were listed according to the selected medicinal formulae and organized to establish a database. Microsoft Access was used as the storage tool, and the SPSS Clementine Data Mining System was then used as the platform to analyze the frequencies and the association rules of the prescription patterns. An initial statistical analysis of the database was carried out to categorize the herbs according to their medicinal types and dosage, natures, flavors, channel tropism, and functions. The frequencies of occurrence and use were then computed based on the categorization. In addition, the associations between different functions of Chinese herbs from the formulae were examined using association rules mining.
[Table 1. Search terms: "breast cancer" and/or "advanced stage" and/or "post-operation"; "clinical research"; "TCM"; "prevention and treatment of breast cancer recurrence and metastasis"; literature also had to be eligible under the selection criteria.]

Associations between Five Properties and Five Flavors from 180 herbs prescribed in 96 formulae
The 180 herbs were categorized according to the Five Properties and Five Flavors (

Frequency distribution of categorized herbs according to their functions
Herbs with tonifying and replenishing (qi-tonifying, blood-tonifying, yin-tonifying and yang-tonifying), blood-activating and stasis-resolving, spleen-fortifying and dampness-resolving, or heat-clearing and detoxicating functions appeared to be the most frequently prescribed for the treatment and prevention of breast cancer recurrence and metastasis (Table 6). The top three functions were qi-tonifying, heat-clearing and detoxicating, and blood-activating and stasis-resolving.

Associations between pairs of herb functions from the formulae
Association rules mining was applied to investigate the associations between pairs of herb functions from the formulae, and to examine the Chinese medicinal compatibility patterns (Table 7). A minimum support of 0.1 and a minimum confidence of 0.6 were specified.
The top three pairs of herbal functions with the highest confidence included the blood-tonifying paired with qi-tonifying functions (93.18%), the qi-regulating paired with qi-tonifying functions (93.10%) and the yin-tonifying paired with qi-tonifying functions (92.50%).
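Function-level rules of this kind require each formula to be re-expressed as a set of herbal functions before the confidence is computed. The Python sketch below illustrates that transformation. The lookup table `HERB_FUNCTION`, the toy formulae and the helper names are hypothetical and serve only as an illustration; the study's own categorization of the 180 herbs is not reproduced here.

```python
from typing import Dict, FrozenSet, List

# Hypothetical herb-to-function lookup, for illustration only.
HERB_FUNCTION: Dict[str, str] = {
    "Huang Qi": "qi-tonifying",
    "Bai Zhu": "qi-tonifying",
    "Dang Gui": "blood-tonifying",
    "Nu Zhen Zi": "yin-tonifying",
}

# Hypothetical toy formulae, not the study data.
TOY_FORMULAE: List[FrozenSet[str]] = [
    frozenset({"Huang Qi", "Dang Gui"}),
    frozenset({"Bai Zhu", "Dang Gui", "Nu Zhen Zi"}),
    frozenset({"Dang Gui"}),
    frozenset({"Huang Qi", "Bai Zhu"}),
]


def to_function_sets(formulae: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Replace each herb by its function; duplicates collapse within a formula."""
    return [
        frozenset(HERB_FUNCTION[h] for h in formula if h in HERB_FUNCTION)
        for formula in formulae
    ]


def pair_confidence(lhs: str, rhs: str, function_sets: List[FrozenSet[str]]) -> float:
    """Share of formulae containing function `lhs` that also contain `rhs`."""
    with_lhs = [s for s in function_sets if lhs in s]
    if not with_lhs:
        raise ValueError(f"function {lhs!r} does not occur in any formula")
    return sum(1 for s in with_lhs if rhs in s) / len(with_lhs)


if __name__ == "__main__":
    sets = to_function_sets(TOY_FORMULAE)
    print(pair_confidence("blood-tonifying", "qi-tonifying", sets))
```

Because several herbs can share one function, a formula with two qi-tonifying herbs still counts only once for the qi-tonifying function.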

Associations between pairs of couplet herbs from the formulae
Couplet herbs are two herbs used in pair to increase the therapeutic effect or reduce the toxic effect. To further examine the compatibility patterns of couplet medicinal prescriptions, we targeted the herbs for healthy-qi reinforcement (including qi-tonifying, yin-tonifying, blood-tonifying, yang-tonifying and spleen-fortifying and dampness-resolving), and the herbs for pathogenic-factor elimination (including heat-clearing and detoxicating, blood-activating and stasis-resolving, and qi-regulating), which were frequently prescribed for the treatment and prevention of breast cancer recurrence and metastasis (Table 8). The minimum support of 0.1 and the minimum confidence of 0.6 were specified. The top three pairs of couplet herbs with the highest confidence included the Tai Zi Shen paired with Bai Zhu (86.36%), the Bai Zhu paired with Huang Qi (84.44%), and the Bai Zhu paired with Fu Ling (77.78%).
[Table notes. Occurrence frequency = number of occurrences for the herbs appearing in various formulae / total cumulative occurrences for 180 herbs appearing in 96 formulae (i.e., 1001); Frequency of use = number of formulae recording the use of the herbs / total number of selected formulae.]
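The two frequency measures given in the table notes (frequency of use and occurrence frequency) can be computed as in the following Python sketch. The function name `herb_frequencies` and the toy formulae are hypothetical. In the study itself the denominators are 96 formulae and 1001 cumulative occurrences.

```python
from collections import Counter
from typing import Dict, List


def herb_frequencies(formulae: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Compute frequency of use and occurrence frequency for every herb.

    frequency_of_use     = formulae recording the herb / number of formulae
    occurrence_frequency = occurrences of the herb / all herb occurrences
    """
    n_formulae = len(formulae)
    formula_counts: Counter = Counter()
    occurrence_counts: Counter = Counter()
    for herbs in formulae:
        occurrence_counts.update(herbs)    # every listing counts
        formula_counts.update(set(herbs))  # each formula counts once per herb
    total_occurrences = sum(occurrence_counts.values())
    return {
        herb: {
            "frequency_of_use": formula_counts[herb] / n_formulae,
            "occurrence_frequency": occurrence_counts[herb] / total_occurrences,
        }
        for herb in occurrence_counts
    }


# Hypothetical toy formulae, not the study data.
TOY_FORMULAE: List[List[str]] = [
    ["Huang Qi", "Bai Zhu", "Fu Ling"],
    ["Huang Qi", "Bai Zhu"],
    ["Bai Zhu", "Dang Gui"],
]

if __name__ == "__main__":
    for herb, stats in herb_frequencies(TOY_FORMULAE).items():
        print(herb, stats)
```

The two measures use different denominators, so a herb that appears in every formula has a frequency of use of 1 while its occurrence frequency stays well below 1.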

Discussion
From the herbal perspective, breast cancer is the local manifestation of a whole-body disease, referred to as an intrinsically deficient but extrinsically excessive syndrome. Based on TCM theories, deficiency of spleen qi, an inadequate source of engendering transformation, deficiency of qi and blood, and excess of phlegm-dampness are believed to be the main mechanisms responsible for the development of breast cancer [24,25].

Medicinal formulae often include herbs that are sweet or bitter
The 180 herbs were classified according to the Five Flavors, and herbs that were sweet or bitter were the top two most frequently prescribed herbs in the formulae. In TCM theories, herbs that taste sweet can be used for supplementation, moderation and harmonization, referred to as tonifying and replenishing herbs. Herbs that taste bitter can be used for discharging and downbearing, referred to as heat-clearing and detoxicating herbs. However, sweet tasting herbs with spleen-strengthening functions were prescribed and used more frequently than herbs with a bitter taste for clearing heat.
There is a close relationship between recurrence and metastasis of breast cancer and liver, and herbs for nourishing the yin-blood, emolliating and soothing the liver, and smoothing the meridians are the keys to breast cancer treatment
Breast cancer is different from other cancer types, as the onset of this disease usually peaks at the menopausal period [26]. The pathological characteristic of this period is marked by exhaustion of heavenly tenth. During this period, the body suffers from yin-blood deficiency and liver-kidney depletion. The liver is the organ for storing blood. The liver functions in free coursing, and this function depends on the sufficiency of the yin-blood stored in the liver. Therefore, not only herbs for soothing the liver and regulating qi, but also herbs for emolliating the liver blood are essential for the treatment and prevention of breast cancer recurrence and metastasis. From the association rules mining, herbs such as Shao Yao, Wu Wei Zi, Ji Xue Teng, Sheng Shu Di, Gou Qi Zi, Nu Zhen Zi, and Dang Gui are used directly for blood-tonifying and liver-emolliating in the treatment of breast cancer. In general, herbs for nourishing the yin-blood, emolliating the liver, soothing the liver and smoothing the meridians play a key role in breast cancer treatment.
Ample clinical research has examined Chinese formulae that reinforce the spleen to regulate qi and soothe the liver to alleviate pain. These formulae not only resist tumors and strengthen the body, but also have anti-cancer effects on metastatic breast cancer [27,28].
The use of herbs for reinforcement of healthy qi and elimination of pathogenic factors is a common Chinese medicinal combination
From the TCM perspective, the etiology of breast cancer is due to deficiency of the healthy qi, which is related to spleen qi deficiency, and liver-kidney depletion. This deficiency will result in malfunctioning of spleen, liver and kidney for transportation and transformation, and free coursing. Without the proper functioning, stagnation and obstruction of the breast collaterals will ultimately be developed and transformed into breast cancer [29].
The use of qi-tonifying and spleen-fortifying herbs is the basis of prescription patterns for preventing breast cancer recurrence and metastasis
Restoration of healthy qi is an effective way to treat diseases and to prevent further progression. Qi-tonifying and spleen-fortifying herbs are used to replenish the source of engendering transformation for qi and blood, and to achieve qi-tonifying, blood-replenishing and harmony of the five visceral functions. This is particularly essential for nourishing the liver and smoothing the qi movement. At the same time, spleen-strengthening and qi-replenishing herbs also have the functions of resolving dampness and dispelling phlegm. Therefore, the formulae prescribed herbs such as Huang Qi, Bai Zhu, and Fu Ling, among others. The association rules mining showed that the combination of herbs should also focus on qi-tonifying functions. The couplet herbs Huang Qi and Bai Zhu are used to achieve the effects of spleen-strengthening and qi-replenishing, and dampness-drying and water-draining; the couplet herbs Bai Zhu and Tai Zi Shen are used to achieve the effects of fluid-engendering and lung-moistening; and the couplet herbs Bai Zhu and Fu Ling are used to achieve the effect of dampness-resolving. The effectiveness of these tonifying and replenishing herbs on tumor resistance and immunity enhancement has also been reported in clinical studies [30,31].

Conclusions
The results showed that recurrence and metastasis of breast cancer are considered to have a close relationship with liver dysfunction. These prescriptions focus on herbs for nourishing the yin-blood and for emolliating and regulating the liver. Strengthening of liver function seems to be the key to successful treatment. Meanwhile, the use of qi-tonifying and spleen-strengthening herbs also forms the basis of the prescription patterns. It is also noteworthy that liver function is promoted by strengthening the spleen.