Healthy women and women with CFS who met the Fukuda case definition were recruited from an existing, prescreened database maintained by Stanford’s CFS Research Team. During an initial phone screen, we identified potential participants between the ages of 18 and 62 who were fluent in English, capable of transporting themselves to Stanford daily, and able to receive daily venous blood draws for 25 consecutive days. Exclusionary criteria included untreated or uncontrolled significant psychological comorbidities, blood or clotting disorders, rheumatologic disease, autoimmune disease, Lyme disease, elevated viral load, active infection, smoking, pregnancy or plans to become pregnant, and the use of blood-thinning medications or antibiotics. Because inflammation increases with age, and circulating leptin increases with BMI, healthy control participants were age-matched and BMI-matched to the participants with CFS. During the phone screen, 33 patients were excluded from participation due to conflicts with daily transportation to Stanford (16), Lyme disease (6), Hashimoto’s disease (3), not meeting diagnostic criteria (3), resolved symptoms (2), sex (2), and age (1). Twelve healthy controls were excluded for pain conditions (4), thyroid conditions (3), conflicts with daily transportation to Stanford (2), age- and BMI-matching (2), and smoking (1).
Participants were screened again during a site visit at Stanford’s Adult and Pediatric Pain Laboratory. Each participant completed screening questionnaires and blood tests. The questionnaires included a demographic form, the Hospital Anxiety and Depression Scale (HADS), and the Fibromyalgia Assessment Form. A score of 16 or higher on the HADS Depression subscale was exclusionary. Participants were also excluded if the blood tests showed erythrocyte sedimentation rate (ESR) greater than 60 mm/hr, thyroid stimulating hormone (TSH) outside the range of 0.4 – 4 mIU/L, C-reactive protein (CRP) over 0.9 mg/dL, positive rheumatoid factor, or positive anti-nuclear antibody (ANA).
Twelve patients and eleven controls met the study inclusion/exclusion criteria, and each provided written informed consent for this study in accordance with a protocol approved by the Stanford University Institutional Review Board. One patient withdrew participation for reasons unrelated to the study, and one for privacy concerns. One control withdrew participation due to an unrelated accident affecting her ability to drive to Stanford for daily blood draws. Results for the twenty completers are reported. The study was run between September 2011 and October 2012.
The study protocol consisted of a two-week observational baseline period followed by 25 consecutive days of blood draws. At the site screening visit, participants were given an Android-based device equipped with software (Dooblo’s SurveyToGo, Kefar Sava, Israel) to assess the severity of their CFS symptoms on a visual analogue scale (VAS). The primary outcome, daily fatigue severity, was assessed by asking, “Overall, how severe has your fatigue been today?” The far left of the scale was anchored at “no fatigue” and the far right was anchored at “severe fatigue”. Similar questions about the severity of muscle and joint pain, as well as the quality of sleep, were included in the assessment. Participants completed the measures twice per day, once in the morning and once at night. Following the two-week observational phase, participants underwent 25 consecutive daily visits to Stanford’s Clinical and Translational Research Unit (CTRU) for their blood draws, and continued completing the symptom severity surveys twice a day throughout this period. During the CTRU visits, participants’ vital signs (blood pressure, heart rate, and body temperature) were also taken to assess general health and to screen for acute infection.
Blood was drawn by trained phlebotomists or research nurses with a 23-gauge butterfly needle into two 4 cc clot-activator tubes. The site of the blood draw was rotated daily to minimize participant discomfort and maintain vein integrity. After 30 minutes at room temperature, the blood samples were centrifuged at 350 × g for 15 minutes, and the serum was extracted, divided into four cryovials, and stored in a -80 °C freezer for later testing. For each participant, CTRU visits were held within a two-hour window (or narrower) throughout the protocol to control for known diurnal fluctuations in cytokines [16, 17]. Individual schedule conflicts prevented routine appointments in only a few cases. If unforeseen events prevented participants from keeping an appointment (one patient and two healthy controls missed between one and three appointments each), a blood draw for each missed appointment was added to the end of the 25-day protocol so that a total of 25 blood samples were collected for each of the twenty participants.
Serum samples were processed by Stanford’s Human Immune Monitoring Center. Human 51-plex Luminex kits were purchased from Affymetrix and used according to the manufacturer’s recommendations, with modifications as described below. Briefly, samples were mixed with antibody-linked polystyrene beads on 96-well filter-bottom plates and incubated at room temperature for 2 hours, followed by overnight incubation at 4 °C. Plates were vacuum filtered and washed twice with wash buffer, then incubated with biotinylated detection antibody for 2 hours at room temperature. Samples were then filtered and washed twice as above and resuspended in streptavidin-PE. After incubation for 40 minutes at room temperature, two additional vacuum washes were performed, and the samples were resuspended in Reading Buffer. Each sample was measured in singlet. Plates were read using a Luminex 200 instrument with a lower bound of 100 beads per sample per cytokine. Median fluorescence intensity (MFI) values were reported for each cytokine, using MasterPlex software (Hitachi Corp.), after data quality control to remove samples with low bead counts or other technical abnormalities.
All data were processed and analyzed in SPSS Statistics 20 (Armonk, NY: IBM Corp) unless otherwise noted. Values of cytokine MFI and fatigue severity were first subject-centered using z-score transformation. The z-scores served two purposes. First, they allowed cytokine levels and fatigue severity to be plotted on the same scale in order to visualize trends. Second, they allowed analyses to be run between subjects, without being adversely affected by large baseline between-subject variability.
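The subject-centering step can be illustrated with a minimal Python sketch (the analysis itself was run in SPSS). The participant labels and MFI values below are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption, since the text does not specify which form was used.

```python
from statistics import mean, stdev

def subject_zscores(values):
    """Z-score one participant's series against that participant's own mean and SD."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1); an assumption, not stated in the text
    if sd == 0:
        # A constant series (e.g. fatigue rated 0 every day) has no variability to scale.
        return [0.0 for _ in values]
    return [(v - m) / sd for v in values]

# Hypothetical MFI values for two participants with very different baselines.
mfi_by_subject = {
    "P01": [1200.0, 1350.0, 1100.0, 1500.0, 1250.0],
    "P02": [200.0, 260.0, 180.0, 310.0, 220.0],
}

# Each participant is transformed separately, so baseline differences drop out.
z_by_subject = {sid: subject_zscores(series) for sid, series in mfi_by_subject.items()}
```

After the transformation, every participant's series has mean 0 and unit standard deviation, which is what allows cytokines and fatigue to share one axis and lets data be pooled across participants.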
Daily data (cytokines and fatigue) were temporally smoothed with a 3-day moving average. The moving average is a procedure commonly employed with time series data to remove high-frequency variability and minimize the impact of random noise on detecting significant trends. Temporally smoothed data were used in all analyses except for cytokine network mapping.
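A 3-day moving average can be sketched as follows. The text does not state whether the window was centered or trailing, or how the first and last days were handled; this sketch assumes a centered window and averages the two available days at each end. The input series is hypothetical.

```python
def moving_average_3day(series):
    """Centered 3-day moving average; end points average the two available days."""
    smoothed = []
    n = len(series)
    for i in range(n):
        # Window covers the previous, current, and next day, clipped at the ends.
        window = series[max(0, i - 1):min(n, i + 2)]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical z-scored daily fatigue that alternates sharply from day to day.
daily_fatigue_z = [0.0, 3.0, 0.0, 3.0, 0.0, 3.0]
smoothed = moving_average_3day(daily_fatigue_z)
```

In this example the raw values swing across a range of 3, while the smoothed values stay between 1.0 and 2.0, showing how day-to-day noise is damped.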
NodeXL was used to construct and depict a network diagram of fatigue and 51 cytokines, creating a visual display of the relationships among variables for both the CFS and control groups. Group bivariate correlations were fed into a spring-embedded Fruchterman-Reingold algorithm in NodeXL. Variables (fatigue and cytokines) are represented by squares or “nodes,” and variables that are significantly correlated are connected by lines or “edges.” The p-value was adjusted for multiple comparisons using a 0.01 false-discovery rate, which yielded a corrected p < 0.0012 statistical threshold for each of the links displayed in the network diagram. For visualization in both diagrams, fatigue and leptin are highlighted in red. In the CFS diagram, the relationships between leptin and other cytokines that are statistically significant are also highlighted by red edges. The multitude of edges within the network diagrams illustrates the degree to which inflammatory factors fluctuate together.
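To show how a false-discovery rate translates into a single p-value cutoff, the sketch below uses the Benjamini-Hochberg step-up procedure. This is an assumption: the text gives the 0.01 rate and the resulting threshold but does not name the procedure. The p-values are hypothetical and the function name is illustrative.

```python
def bh_threshold(p_values, q=0.01):
    """Return the largest p-value declared significant by Benjamini-Hochberg, or None."""
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # Step-up rule: keep the largest ranked p-value that is <= (rank / m) * q.
        if p <= rank / m * q:
            threshold = p
    return threshold

# Fatigue plus 51 cytokines gives 52 variables and this many pairwise correlations.
n_variables = 52
n_pairs = n_variables * (n_variables - 1) // 2

# Hypothetical p-values for ten correlations (not study data).
p_values = [0.0001, 0.0015, 0.0025, 0.02, 0.2, 0.6, 0.74, 0.9, 0.95, 0.99]
cutoff = bh_threshold(p_values, q=0.01)
```

Only correlations with p-values at or below the cutoff would be drawn as links. With 1326 candidate pairs, the cutoff depends on how many small p-values the data contain, which is why it is reported as a result rather than fixed in advance.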
In a separate analysis, as a post-hoc, proof-of-concept test, we utilized a machine learning algorithm in Weka to test the ability of cytokines to distinguish high from low fatigue days. Our goal was to determine whether cytokines alone could accurately predict daily fatigue severity in participants with CFS. Machine learning algorithms use multivariate approaches to achieve greater sensitivity than mass univariate tests for identifying complex predictor-outcome relationships. For each of the ten participants with CFS, the dataset included the nine most severe fatigue days and nine least severe fatigue days, for a total of 180 cases. We used Weka’s LibLINEAR support vector machine algorithm with a cost parameter C = 1 and 10-fold cross-validation.
The model built for the participants with CFS was also tested on the control group to see if this trained model could also predict fatigue in healthy individuals. For the healthy controls, the data were likewise divided into the nine most severe and nine least severe days, but fatigue scores that were the same for both high and low fatigue days were excluded. For example, two participants rated their fatigue as “0” for each of the 25 days, so their data were excluded from this analysis. A total of 136 cases were used for the controls.
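The case-selection rule can be sketched as below. This is one reading of the exclusion rule, namely that any fatigue score appearing among both the nine highest and nine lowest days is dropped; the function name and example data are hypothetical, and the sketch assumes at least 18 days per participant.

```python
def select_extreme_days(fatigue_by_day, n_each=9):
    """Label the n_each lowest and highest fatigue days, dropping ambiguous scores."""
    # Sort days by fatigue score; ties keep their original day order.
    ranked = sorted(fatigue_by_day.items(), key=lambda item: item[1])
    low = ranked[:n_each]
    high = ranked[-n_each:]
    # A score present in both sets cannot separate "high" from "low" days.
    shared = {score for _, score in low} & {score for _, score in high}
    cases = [(day, "low") for day, score in low if score not in shared]
    cases += [(day, "high") for day, score in high if score not in shared]
    return cases

# Hypothetical 25-day fatigue records (not study data).
constant = {day: 0 for day in range(1, 26)}    # rated 0 every day
distinct = {day: day for day in range(1, 26)}  # a different score each day
tied = {day: (0 if day <= 8 else 5 if day <= 17 else 10) for day in range(1, 26)}

n_constant = len(select_extreme_days(constant))
n_distinct = len(select_extreme_days(distinct))
n_tied = len(select_extreme_days(tied))
```

Under this reading, a participant who rates fatigue identically every day contributes no cases, a participant with no overlapping scores contributes all 18, and partial ties reduce the count, which is consistent with the control group yielding fewer than 180 cases.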