Fig. 1 | Journal of Translational Medicine

Fig. 1

From: Language and cultural bias in AI: comparing the performance of large language models developed in different countries on Traditional Chinese Medicine highlights the need for localized models

Performance of 8 large language models (LLMs) on Traditional Chinese Medicine (TCM)-related questions. (A) Accuracy rates of different models. Horizontal lines represent the accuracy rates of Chinese-developed and Western-developed LLMs on TCM questions, respectively. P values comparing the accuracy rates of Chinese-developed and Western-developed LLMs were calculated using the Wilcoxon test. (B) Heatmap showing the P values for comparisons of accuracy between each Chinese-developed LLM and each Western-developed LLM. P values were calculated using McNemar’s test with Bonferroni correction. ns, not significant; *P < 0.05, **P < 0.01, ***P < 0.001.
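
As a rough illustration of the pairwise comparison behind panel B, the sketch below computes a two-sided exact (binomial) McNemar P value from per-question correctness of two models, applies a Bonferroni correction, and maps the result to the significance labels used in the caption. The caption does not say whether the exact or the chi-square form of McNemar's test was used, how many comparisons entered the correction, or how the data were stored, so the exact form, the function names, and the input layout are all assumptions made for this example.

```python
from math import comb


def discordant_counts(correct_a, correct_b):
    """Count questions where exactly one of the two models was correct.

    correct_a, correct_b: equal-length sequences of booleans (hypothetical
    layout: one entry per question, True if the model answered correctly).
    Returns (b, c): b = only model A correct, c = only model B correct.
    """
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the discordant counts.

    Under the null hypothesis, each discordant question is equally likely
    to favour either model, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significance_label(p):
    """Map a P value to the labels used in the figure caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"
```

In use, one would call `discordant_counts` and `mcnemar_exact` once per Chinese-developed/Western-developed model pair, pass the full list of P values to `bonferroni`, and label each heatmap cell with `significance_label`.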
