242 PGR LLMs

242.1 Summary

Large Language Medicine: LLMs and academic P/CCM
“Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
How (and why) do LLMs work?
Embeddings: ‘word vectors’
Transformer Models:
What is a large-language model (LLM)?
What does this enable?
Transfer learning: (this is the secret sauce)
Review: How (and why) do LLMs work?
Problem: PHI cannot go to these companies
Local deployments

242.2 Slide outline

242.2.1 Slide 1

Large Language Medicine: LLMs and academic P/CCM
Disclosure: I hold a financial stake in, and advise, a local startup called Mountain Biometrics (focused on processing wearable sensor data). We use machine learning.
BSc in computer science (2009).
(Twitter)

242.2.2 Slide 2

“Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
Do not anthropomorphize
They’re statistical models
They’re tools that require some expertise to use
Goal 1: Demystify how they work
Goal 2: Tips for use
Goal 3: Highlight current uses
“ChatGPT, make me an image that represents the following quote: ‘Any sufficiently advanced technology is indistinguishable from magic’ by Arthur C. Clarke. It’s for a presentation.”

242.2.3 Slide 3

How (and why) do LLMs work?
What is machine learning?
Logistic regression → yes/no
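A minimal sketch of what “logistic regression → yes/no” means in code: a weighted sum of the inputs is squashed into a probability, and a cutoff turns that probability into a yes/no call. The weights, bias, feature values, and 0.5 threshold are made-up illustrative numbers, not fitted to any data.

```python
import math

def predict_probability(features, weights, bias):
    """Logistic regression: weighted sum of inputs passed through a sigmoid."""
    linear_score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-linear_score))

def predict_yes_no(features, weights, bias, threshold=0.5):
    """Turn the probability into a yes/no answer using a cutoff."""
    probability = predict_probability(features, weights, bias)
    return "yes" if probability >= threshold else "no"

# Hypothetical parameters; in practice these are learned from labeled data.
weights = [0.8, -1.2]
bias = -0.5

print(predict_probability([2.0, 0.5], weights, bias))
print(predict_yes_no([2.0, 0.5], weights, bias))
print(predict_yes_no([0.0, 1.0], weights, bias))
```

The “parameters” on the neural network slide play the same role as `weights` and `bias` here; there are just far more of them.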
Linear prediction → man / woman

242.2.4 Slide 4

How (and why) do LLMs work?
Neural Networks
Parameters
https://mlu-explain.github.io/

242.2.5 Slide 5

Embeddings: ‘word vectors’

242.2.6 Slide 6

Transformer Models:
Neural network variant + attention → Transformer.
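A rough sketch of the attention step, assuming the standard scaled dot-product form: each token's query is compared against every key in the window, the scores are turned into weights that sum to 1, and the output is a weighted mix of the values. The 2-dimensional vectors are invented toy numbers; real models use learned vectors with many more dimensions.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, mixed

# Three tokens in the window, each with a toy key and value vector.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, mixed = attention(query, keys, values)
print(weights)  # how much the query "attends" to each token
print(mixed)    # the resulting blend of value vectors
```

The number of key/value rows the query can look at is what the context window limits.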
Context window: how big the attention frame is
Context window size and overall model size are the main metrics of performance.

242.2.7 Slide 7

What is a large-language model (LLM)?
Foundation model trained on an enormous amount of text

242.2.8 Slide 8

What does this enable?
Parallel processing on GPUs (instead of CPUs)
You can make HUGE models trained on huge data sets [e.g. the entire internet]
1. Predict the next token [~word]
2. See if you were right
3. Re-weight parameters to improve
4. Repeat billions of times

242.2.9 Slide 9

Transfer learning: (this is the secret sauce)
Previously: define a task, label a data set to train the model on (expensive), and then evaluate its performance. Repeat. $$$
Limited by the amount of annotated data.
Now: “pre-train” (unsupervised learning) on a huge amount of data to learn general patterns; the result is termed a foundation model. Do this once ($$$)
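Why pre-training needs no annotation: in next-token prediction the text supplies its own labels, because the "answer" for each token is simply the token that follows it. The toy below illustrates this with a count-based bigram model, which is a deliberately simple stand-in and not a transformer; the sentence is invented.

```python
from collections import Counter, defaultdict

def make_training_pairs(tokens):
    """Each token is the input; the token that follows it is the label."""
    return list(zip(tokens[:-1], tokens[1:]))

def train(pairs):
    """Count which token follows which (the 'parameters' of this toy model)."""
    counts = defaultdict(Counter)
    for current, following in pairs:
        counts[current][following] += 1
    return counts

def predict_next(counts, token):
    """Predict the most frequent follower seen in training, or None if unseen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

text = ("the patient is short of breath and the patient is hypoxic "
        "and the patient has rales")
tokens = text.split()
pairs = make_training_pairs(tokens)
model = train(pairs)

print(pairs[:3])
print(predict_next(model, "patient"))
```

A real LLM replaces the counting step with gradient updates to billions of parameters, but the training signal is built the same way, from raw text alone.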
Can do tasks not in the training data (actual learning)

242.2.10 Slide 10

Review: How (and why) do LLMs work?
They are statistical machine learning algorithms called transformers
Flexible machine-learning method built on neural networks, runs on GPUs
Attention mechanism allows it to handle references (context window)
They learn patterns in massive data sets, then apply elsewhere.
Massive scale and transfer learning → good performance without task-specific training (and costs)

242.2.11 Slide 11

Problem: PHI cannot go to these companies
Data leaves the protected environment; this would need a Business Associate Agreement.
Real issue of consent

242.2.12 Slide 12

Local deployments

242.2.13 Slide 13

TODO: No text extracted from this slide.

242.2.14 Slide 14

(Google’s regular LLM)
Differential Diagnosis

242.2.15 Slide 15

Clinical Reasoning

242.2.16 Slide 16

TODO: No text extracted from this slide.

242.2.17 Slide 17

Jesse’s Case
Differential diagnosis?
Consider: your brain is a black box.
Before Scott’s clarification (vs after →)
1. Bronchogenic Carcinoma (35%)
2. Recurrent Coccidioidomycosis (Valley Fever) (25%)
3. Pulmonary Tuberculosis (15%)
4. Pulmonary Vasculitis (e.g., Granulomatosis with Polyangiitis) (15%)
5. Pulmonary Embolism with Infarction (10%)

242.2.18 Slide 18

Prompt Engineering:
Zero-shot prompting: just ask for the answer out of the blue. “The patient has heart failure and rales. Why is she short of breath?”
N-shot (few-shot) prompting: include N worked examples in the prompt and rely on in-context learning. A related multi-turn approach asks a sequence of questions: “What’s your differential?” “Why do you think heart failure is most likely?” “What tests would you order?”
Chain-of-thought prompting: ask for the reasoning step by step. “The patient has xyz medical history and presents with shortness of breath. First, summarize the case, then provide an ordered differential diagnosis. Give pros and cons for each possibility. Then, provide recommendations on which test to order.”

242.2.19 Slide 19

Prompt engineering is hard
Act like a [Specify a role],
I need a [What do you need?],
you will [Enter a task],
in the process, you should [Enter details; often in a chain of steps],
please [Enter exclusion],
input the final result in a [Select a format],
here is an example: [Enter an example].
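One way to make this template reusable is to keep it as a string with named slots and fill it programmatically. This is only a sketch: the slot names (`role`, `need`, `task`, and so on) are labels chosen here for illustration, and the filled values are shortened from the worked example on this slide.

```python
PROMPT_TEMPLATE = (
    "Act like {role},\n"
    "I need {need},\n"
    "you will {task},\n"
    "in the process, you should {details},\n"
    "please {exclusion},\n"
    "input the final result in {output_format},\n"
    "here is an example: {example}."
)

def build_prompt(**slots):
    """Fill every slot; str.format raises KeyError if a slot is missing."""
    return PROMPT_TEMPLATE.format(**slots)

prompt = build_prompt(
    role="an expert diagnostician",
    need="a prioritized differential diagnosis",
    task="list the most likely diagnoses, then order them from most to least likely",
    details="consider the pretest probability of each diagnosis",
    exclusion="prioritize accuracy",
    output_format="a list format with percentage likelihoods",
    example="[Enter an example]",
)
print(prompt)
```

Keeping the template fixed and varying only the slots makes it easier to compare outputs across cases.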
Act like an expert diagnostician,
I need a prioritized differential diagnosis that includes all of the likely diagnoses and all the diagnoses it would be important not to miss,
you will first generate a list of the most likely diagnoses, you will then order them from most to least likely, then you will give a probability for each,
in the process, you should consider all the most likely or dangerous diagnoses. Consider the pretest probability of each diagnosis, then modify it by the cumulative strength of evidence to get a final probability.
please prioritize accuracy,
input the final result in a list format, with percentage likelihoods and a short explanation,
Here is the case: …

242.2.20 Slide 20

TODO: No text extracted from this slide.

242.2.21 Slide 21

Objective: To determine whether an LLM can transform discharge summaries into a format that is more readable and understandable.
→ less accurate
Not a promising use

242.2.22 Slide 22

Clinical Research Informatics

242.2.23 Slide 23

Clinical Research Informatics
Goal: Predict who has hypercapnia
Current: use structured data like lab values in a logistic regression
Ideal: extract unstructured data like symptoms to use as predictors
ED Triage Notes
(Data must be present when the model runs)
Locally-deployed LLM
Possible tasks
Use as predictor in diagnostic model
Tasks:
“Did this patient present for confusion?”
“How likely is it this patient has confusion?”
“How likely is it this patient has hypercapnia?”
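A sketch of how the first task could feed a diagnostic model: wrap each triage note in the question, send it to the model, and parse the reply into a 0/1 predictor. The `ask_llm` argument is a hypothetical placeholder for whatever locally deployed model is used (any function that takes a prompt and returns text); `fake_llm` is a keyword-matching stand-in so the example runs, and the two notes are invented.

```python
def confusion_prompt(triage_note):
    """Wrap one ED triage note in the yes/no extraction question."""
    return (
        "Did this patient present for confusion? Answer only 'yes' or 'no'.\n\n"
        f"Triage note: {triage_note}"
    )

def parse_yes_no(answer):
    """Map a free-text reply to 1 (yes), 0 (no), or None if it is unclear."""
    words = answer.strip().lower().split()
    first = words[0].strip(".,!:;") if words else ""
    if first == "yes":
        return 1
    if first == "no":
        return 0
    return None

def extract_confusion(triage_note, ask_llm):
    """Return a 0/1 predictor for one note; ask_llm is any prompt -> text callable."""
    return parse_yes_no(ask_llm(confusion_prompt(triage_note)))

def fake_llm(prompt):
    """Keyword-matching stand-in so the sketch runs without a real model."""
    note = prompt.split("Triage note:", 1)[1].lower()
    return "Yes" if "confus" in note or "altered" in note else "No"

# Invented example notes (no real patient data).
notes = ["Family reports new confusion and somnolence.", "Ankle pain after a fall."]
print([extract_confusion(note, fake_llm) for note in notes])
```

Returning `None` for unclear replies keeps unparseable answers from silently becoming a "no" in the downstream model.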
Evaluation:
How well does the LLM identify patients with confusion (vs. a human reader)?
Requires annotation of ‘answers’
How much does the LLM improve hypercapnia predictions?
Does not require annotation

242.2.24 Slide 24

Scientific Writing?
Hard and fraught…
Act like the editor for a popular science publisher.
I need feedback on how to make the following sentence easier to read and less ambiguous.
you will restate the main point of the following sentence (it’s OK to do this in several sentences). Then, give 5 reformulations of the sentence. Each should be in a different style, but all should emphasize clarity.
in the process, you should explain what changes you made,
please do not use jargon.
The sentence for you to rewrite is: “…”

242.2.25 Slide 25

Statistical Coding:

242.2.26 Slide 26

Questions?

242.3 Learning objectives

Large Language Medicine: LLMs and academic P/CCM
“Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
How (and why) do LLMs work?
Embeddings: ‘word vectors’
Transformer Models:

242.4 Bottom line / summary

Large Language Medicine: LLMs and academic P/CCM
“Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
How (and why) do LLMs work?
Embeddings: ‘word vectors’
Transformer Models:

242.5 Approach

TODO: Outline the initial assessment or decision point.
TODO: Outline the next diagnostic or management step.
TODO: Outline follow-up or escalation criteria.

242.6 Red flags / when to escalate

TODO: List red flags that require urgent escalation.

242.7 Common pitfalls

TODO: Capture common errors or missed steps.

242.8 References

TODO: Add landmark references or guideline citations.

242.9 Slides and assets

Presentations/PGR LLMs.pptx

242.10 Source materials

Presentations/PGR LLMs.pptx