The Emergence of General AI for Medicine: Medical Applications of ChatGPT
The International Center for Genetic Disease (iCGD) is a platform that studies patients and healthy subjects from different parts of the world to research the causes, consequences, prevention, and treatment of disease. Recently, the iCGD invited Peter Lee, Ph.D., Corporate Vice President, Research and Incubations at Microsoft, to present “The Emergence of General AI for Medicine” as part of its keynote series. He showcased examples of how general AI (e.g., ChatGPT) can be used in health care and medicine, and then discussed the implications for the future as these systems continue to evolve and become increasingly intelligent and capable.
GPT-4 Represents a Major Milestone in AI Development
Unlike its predecessor, GPT-3.5, GPT-4, the latest model released by OpenAI, has been specifically designed to acquire general cognitive intelligence. This advancement aims to enhance the model’s ability to comprehend, reason, and respond to complex queries in a way that mimics human-like intelligence.
In the speech, Peter explored the potential applications and impact of GPT-4 in the field of medicine, including its use in medical examinations, patient care, medical education, and clinical documentation, and discussed its limitations and challenges.
When It Comes to Medicine, What Can ChatGPT Do?
About six months before the release of GPT-4, Microsoft and OpenAI wanted to understand the potential implications of this new language model for areas of direct human benefit, such as medicine. That became Peter’s project in the run-up to GPT-4’s release in March.
Subjecting a system to a professional certification exam has become almost standard practice in artificial intelligence in recent years. A typical example is the United States Medical Licensing Examination (USMLE), a three-step examination for medical licensure in the United States. When prompted, GPT-4 can answer these questions correctly and give the reasoning behind its answers. In explaining that reasoning, GPT-4 appears to draw causal inferences and to give reasons for eliminating the other choices in the multiple-choice test.
GPT-4, a State-of-the-Art General-Purpose Model, Performs Well in the Medical Field
Scientists at Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) have examined typical curbside consultations and discovered that GPT-4 gives accurate answers to 93% of them. Also, a recent research paper published in JAMA Internal Medicine, from the team led by Dr. John W. Ayers from the Qualcomm Institute at the University of California San Diego, sheds light on the potential of AI assistants in the field of medicine. The study aimed to compare the written responses of physicians with those generated by ChatGPT in response to real-world health questions. Remarkably, a panel of licensed healthcare professionals favored ChatGPT’s responses 79% of the time, rating them as superior in terms of quality and empathy. “It’s a little bit mysterious because GPT-4 had no specialized medical training. The GPT program so far has been focused entirely on acquiring general cognitive intelligence,” said Peter.
In the speech, Peter gave an example in which GPT-4 was asked what a girl named Kim, who was describing a problem, might be thinking and feeling. GPT-4 opened with a brief disclaimer stating that it is merely a language model and cannot assess a person’s feelings or thoughts. It then went into some detail, suggesting that Kim might be feeling worried or scared and might need some words of assurance. Caregivers may also find this useful in their daily work. For instance, when asked, “If I’m Kim’s doctor, what could I say to her to provide comfort and support?” GPT-4 provides words of support and advice that not only help the doctor engage with a scared or desperate patient, but also connect well with that patient’s state of mind.
According to Peter, GPT-4 could also have implications for medical education. For a doctor in training, GPT-4 can play the character of Kim and hold a conversation as if she were a patient visiting. In these interactions, GPT-4 demonstrates an impressive ability to carry out its role as instructed. It displays common sense and a profound understanding of the world, even acknowledging the presence of an examination table for the patient to sit on. Once the conversation concludes, GPT-4 can provide a detailed evaluation.
Peter highlighted that not only Microsoft and Nuance but also numerous other companies are now active in medical documentation services. GPT-4 can play an active role in automating tasks and easing the burden of clinical documentation. For instance, it can be used to analyze open-source transcripts and generate comprehensive clinical notes. GPT-4 can also read personal lab test results and provide insights on any potential concerns. These applications show how GPT-4 can relieve workload and streamline processes in the medical field.
The Three Limitations of GPT-4
Regarding the current limitations of GPT-4, three key aspects deserve public attention. First, the well-known issue of hallucinations. Second, the peculiar behavior observed in math and logic. Last, an almost existential question that, although less significant in the context of medicine, sparks curiosity and concerns about whether the system truly comprehends its actions.
Hallucination is an intriguing aspect of GPT-4. When asked for citations or specific information, GPT-4 may generate answers with fabricated elements rather than reproducing its training data directly. This behavior stems from GPT-4’s design as a reasoning engine that aims to avoid simply reproducing its training data. However, when used in conjunction with the Bing search engine, GPT-4 can provide accurate and grounded responses by searching the web for relevant information. This retrieval-augmented generation approach, in which GPT-4 is guided by external tools, holds great potential, as its application in the medical field demonstrates.
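The retrieval-augmented approach described above can be illustrated with a toy sketch. This is not how Bing or GPT-4 is implemented: the corpus, the function names, the word-overlap scoring, and the prompt wording are all hypothetical, and no model is actually called. The sketch only shows the general idea of fetching relevant passages first and asking the model to answer from them.

```python
# Toy illustration of retrieval-augmented generation (RAG).
# Everything here (corpus, function names, prompt wording) is hypothetical;
# a real system would use a search engine or vector index and then send
# the resulting prompt to a language model.
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, k=2):
    """Return up to k passages that share words with the query, best first."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_grounded_prompt(query, passages):
    """Build a prompt that asks the model to answer only from the passages."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered sources below and cite them by number. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {query}"
    )


# Placeholder passages; the first two restate facts from this article.
CORPUS = [
    "The USMLE is a three-step examination for medical licensure in the United States.",
    "GPT-4 was released in March.",
    "A Sudoku puzzle is played on a nine-by-nine grid.",
]

if __name__ == "__main__":
    question = "How many steps does the USMLE have?"
    print(build_grounded_prompt(question, retrieve(question, CORPUS, k=1)))
```

Because the model is asked to cite numbered sources, a reader can check each claim against the retrieved text instead of trusting a citation the model might have made up.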
For instance, GPT-4 can assist in reading and comprehending research papers, offering informed speculation and insightful discussions on topics that may not have known answers. The ability to summarize papers, establish connections, suggest follow-up studies, and identify precursor studies has proven to be a valuable asset for researchers, enhancing productivity and accelerating the research process. While hallucination remains a challenge, it has evolved into a form of informed speculation within the realm of research, yielding significant advancements in various fields.
GPT-4 also faces challenges when it comes to math and logic, showcasing its alien intelligence in these domains. While it can correctly solve complex problems, it often falters on simpler math tasks, such as calculating correlations. To reduce errors and hallucinations, researchers have found that cross-checking with a second instance of GPT-4 or asking it to show its work step by step can yield better results. Integration of GPT-4 with Microsoft 365 shows promise, allowing users to fill in Word documents and generate reports efficiently. However, GPT-4 struggles with problems that require backtracking, because it is unable to engage in guess-and-verify problem solving. Despite these limitations, GPT-4’s reasoning capabilities can still astound, for example when it writes computer programs that solve Sudoku. The exploration of GPT-4’s limitations and capabilities has spurred intensive efforts in AI ethics and responsible AI practices across organizations like Microsoft, focusing on principles of fairness, safety, privacy, inclusiveness, transparency, and accountability.
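To make the guess-and-verify idea concrete, here is a minimal sketch of a backtracking Sudoku solver. It is not the program GPT-4 produced in the talk; the function names and the sample puzzle are chosen here purely for illustration. The solver places a digit (the guess), recurses to see whether the rest of the grid can be completed (the verification), and undoes the digit if it cannot (the backtrack).

```python
# Minimal backtracking Sudoku solver (illustrative sketch only).
# A grid is a list of 9 lists of 9 integers; 0 marks an empty cell.


def is_valid(grid, row, col, digit):
    """Return True if digit can go at (row, col) without a conflict."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    top, left = 3 * (row // 3), 3 * (col // 3)
    for r in range(top, top + 3):
        for c in range(left, left + 3):
            if grid[r][c] == digit:
                return False
    return True


def solve(grid):
    """Fill the grid in place. Return True if a full solution was found."""
    for row in range(9):
        for col in range(9):
            if grid[row][col] == 0:
                for digit in range(1, 10):
                    if is_valid(grid, row, col, digit):
                        grid[row][col] = digit   # guess
                        if solve(grid):          # verify the rest
                            return True
                        grid[row][col] = 0       # backtrack
                return False                     # no digit fits this cell
    return True                                  # no empty cells remain


# Sample puzzle used only for demonstration.
PUZZLE = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]

if __name__ == "__main__":
    grid = [row[:] for row in PUZZLE]  # work on a copy
    if solve(grid):
        for row in grid:
            print(" ".join(str(d) for d in row))
```

The key step is resetting the cell to 0 after a failed guess. A system that produces its answer in a single forward pass has no such step, which is one way to picture why problems of this kind are hard to answer directly but easy to solve by writing a program.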
Lastly, Peter pointed to the existential concerns and personal angst surrounding this new form of artificial intelligence. Noam Chomsky’s insightful Op-ed in the New York Times captures this sentiment well. Chomsky, unlike many others, experimented with ChatGPT and raised valid points about the limitations of its intelligence. He emphasized the importance of not only describing what is, was, and will be, but also what is not and what could or could not be, which he regards as an essential aspect of true intelligence.
Chomsky even presented a specific example involving counterfactual reasoning. The system’s training involves solving fill-in-the-blank problems through feedback loops, which raises questions about gaining true intelligence solely through this approach. However, when the entire Op-ed is fed into GPT-4, it produces a remarkably coherent and convincing rebuttal to Chomsky, showcasing its ability to engage in counterfactual reasoning. This paradoxical outcome continues to baffle us as we strive to understand its mysteries. Nevertheless, GPT-4’s benefits and risks in the medical context cannot be ignored.
Future Interaction Between GPT-4 and Society
In conclusion, the immense publicity surrounding this new form of AI is a positive development. It is encouraging to witness the world taking notice and engaging with these important matters.
Someday, society must make a crucial decision. We can opt to deem the technology too dangerous and restrict or eliminate its use. Alternatively, we can act recklessly and unleash it without proper consideration. The third option involves a deliberate and collaborative effort to explore the synergies between humans and AI, leveraging their combined strengths to achieve remarkable outcomes. This is a significant decision that society has to make together, and it requires a mindful and wise approach to navigating the challenges ahead.