Draft

242  PGR LLMs

242.1 Summary

  • Large Language Medicine: LLMs and academic P/CCM
  • “Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
  • How (and why) do LLMs work?
  • Embeddings: ‘word vectors’
  • Transformer Models:
  • What is a large-language model (LLM)?
  • What does this enable?
  • Transfer learning: (this is the secret sauce)
  • Review: How (and why) do LLMs work?
  • Problem: PHI cannot go to these companies
  • Local deployments

242.2 Slide outline

242.2.1 Slide 1

  • Large Language Medicine: LLMs and academic P/CCM
  • Disclosure: I hold a financial stake in, and advise, a local startup called Mountain Biometrics (focused on processing wearable sensor data). We use machine learning.
  • BSc in computer science (2009).
  • (Twitter)

242.2.2 Slide 2

  • “Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
  • Do not anthropomorphize
  • They’re statistical models
  • They’re tools that require some expertise to use
  • Goal 1: Demystify how they work
  • Goal 2: Tips for use
  • Goal 3: Highlight current uses
  • “ChatGPT, make me an image that represents the following quote: ‘Any sufficiently advanced technology is indistinguishable from magic’ by Arthur C. Clarke. It’s for a presentation.”

242.2.3 Slide 3

  • How (and why) do LLMs work?
  • What is machine learning?
  • Logistic Regression
  • → yes/no
  • Linear prediction
  • man
  • woman

242.2.4 Slide 4

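The logistic regression on Slide 3 (a linear prediction turned into a yes/no answer) can be sketched in a few lines. This is a minimal illustration, not a fitted model: the two features, `WEIGHTS`, and `BIAS` are hypothetical numbers chosen by hand so the arithmetic is easy to follow.

```python
import math

def predict_probability(features, weights, bias):
    """Linear prediction (bias + weighted sum) squashed into a 0-1 probability."""
    linear = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-linear))

def predict_yes_no(features, weights, bias, threshold=0.5):
    """Turn the probability into the yes/no answer by applying a cutoff."""
    probability = predict_probability(features, weights, bias)
    return "yes" if probability >= threshold else "no"

# Hypothetical parameters, hand-picked for illustration (not learned from data).
WEIGHTS = [1.5, -0.5]
BIAS = -1.0

for example in ([2.0, 1.0], [0.0, 2.0]):
    p = predict_probability(example, WEIGHTS, BIAS)
    print(example, round(p, 3), predict_yes_no(example, WEIGHTS, BIAS))
```

The weights and the bias are the model's parameters. A neural network stacks many such units, which is why its parameter count grows so quickly.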
  • How (and why) do LLMs work?
  • Neural Networks
  • Parameters
  • https://mlu-explain.github.io/

242.2.5 Slide 5

  • Embeddings: ‘word vectors’

242.2.6 Slide 6

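The ‘word vectors’ idea from Slide 5 can be illustrated with a toy example: each word becomes a list of numbers, and words with related meanings point in similar directions. The three-dimensional vectors below are invented for illustration (real embeddings are learned and have hundreds or thousands of dimensions), so treat this as a sketch of the geometry only.

```python
import math

# Hypothetical 3-d embeddings, made up by hand for this example.
EMBEDDINGS = {
    "man":   [0.5, 0.0, 0.1],
    "woman": [-0.5, 0.0, 0.1],
    "king":  [0.5, 2.0, 0.2],
    "queen": [-0.5, 2.0, 0.2],
    "apple": [0.0, -0.1, 1.0],
}

def cosine_similarity(a, b):
    """1.0 means same direction, 0 means unrelated, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_word(vector, exclude=()):
    """Return the vocabulary word whose embedding is most similar to `vector`."""
    candidates = [w for w in EMBEDDINGS if w not in exclude]
    return max(candidates, key=lambda w: cosine_similarity(vector, EMBEDDINGS[w]))

# Vector arithmetic: king - man + woman
target = [
    k - m + w
    for k, m, w in zip(EMBEDDINGS["king"], EMBEDDINGS["man"], EMBEDDINGS["woman"])
]
print(nearest_word(target, exclude=("king", "man", "woman")))
```

In this toy vocabulary the arithmetic lands on "queen" because the first coordinate was constructed to behave like a gender axis and the second like a royalty axis. In a trained model such directions emerge from the data rather than being assigned.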
  • Transformer Models:
  • Variant of neural network + attention → Transformer.
  • Context window: how big the attention frame is (how much text the model can attend to at once)
  • Context-window size and overall model size are the main drivers of performance.

242.2.7 Slide 7

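The attention step and the context window described on Slide 6 can be sketched as scaled dot-product attention for a single query. This is a simplified, assumption-laden version: one attention head, no learned projection matrices, and hand-made two-dimensional vectors (`KEYS`, `VALUES`) standing in for three tokens. The `context_window` argument is a hypothetical name for the number of most recent tokens the query is allowed to see.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values, context_window):
    """Scaled dot-product attention over the last `context_window` tokens."""
    if context_window < 1:
        raise ValueError("context_window must be at least 1")
    keys = keys[-context_window:]      # tokens outside the window are invisible
    values = values[-context_window:]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical vectors for the tokens "patient", "has", "rales".
KEYS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
VALUES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
QUERY = [1.0, 0.0]

full_weights, _ = attend(QUERY, KEYS, VALUES, context_window=3)
short_weights, _ = attend(QUERY, KEYS, VALUES, context_window=2)
print(len(full_weights), len(short_weights))
```

With the window shortened to two tokens, the first token drops out of the calculation entirely, which is the practical meaning of a limited context window.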
  • What is a large-language model (LLM)?
  • Foundation model trained on an enormous amount of text

242.2.8 Slide 8

  • What does this enable?
  • Parallel processing on GPUs (instead of CPUs)
  • You can train HUGE models on huge datasets [e.g. the entire internet]
    1. Predict the next token [~word]
    2. See if you were right
    3. Re-weight parameters to improve
    4. Repeat billions of times

242.2.9 Slide 9

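The four training steps listed on Slide 8 (predict the next token, check, re-weight, repeat) can be shown with a deliberately tiny model. The sketch below is an assumption-heavy stand-in for a real LLM: it only looks at the single previous token, keeps one parameter per token pair, and trains on a ten-word hypothetical corpus. The update rule is ordinary gradient descent on the prediction error.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_next_token_model(tokens, steps=200, learning_rate=0.5):
    vocab = sorted(set(tokens))
    index = {tok: i for i, tok in enumerate(vocab)}
    # One adjustable parameter per (previous token, next token) pair.
    params = [[0.0] * len(vocab) for _ in vocab]
    pairs = [(index[a], index[b]) for a, b in zip(tokens, tokens[1:])]
    for _ in range(steps):                          # 4. repeat
        for prev, actual in pairs:
            probs = softmax(params[prev])           # 1. predict the next token
            for j in range(len(vocab)):
                target = 1.0 if j == actual else 0.0
                error = probs[j] - target           # 2. see if you were right
                params[prev][j] -= learning_rate * error  # 3. re-weight
    return vocab, index, params

def predict_next(token, vocab, index, params):
    probs = softmax(params[index[token]])
    return vocab[probs.index(max(probs))]

# Hypothetical training text.
CORPUS = "the patient has heart failure and the patient has rales".split()
vocab, index, params = train_next_token_model(CORPUS)
print(predict_next("patient", vocab, index, params))
```

A real model replaces the lookup table with a transformer and the ten-word corpus with a very large text collection, but the loop has the same shape.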
  • Transfer learning: (this is the secret sauce)
  • Previously: define a task, label a dataset to train the model on (expensive), then evaluate its performance. Repeat. $$$
  • Limited by the amount of annotated data.
  • Now: “pre-train” (unsupervised learning) on a huge amount of data to learn general patterns; the result is termed a foundation model. Do this once ($$$)
  • Can do tasks not in the training data (actual learning)

242.2.10 Slide 10

  • Review: How (and why) do LLMs work?
  • They are statistical machine learning algorithms called transformers
  • Flexible machine-learning method built on neural networks, runs on GPUs
  • Attention mechanism allows them to handle references across the text (within the context window)
  • They learn patterns in massive data sets, then apply them elsewhere.
  • Massive scale and transfer learning → good performance without task-specific training (and costs)

242.2.11 Slide 11

  • Problem: PHI cannot go to these companies
  • Leaves protected environment: would need a Business Associate Agreement.
  • Real issue of consent

242.2.12 Slide 12

  • Local deployments

242.2.13 Slide 13

  • TODO: No text extracted from this slide.

242.2.14 Slide 14

  • (Google’s regular LLM)
  • Differential Diagnosis

242.2.15 Slide 15

  • Clinical Reasoning

242.2.16 Slide 16

  • TODO: No text extracted from this slide.

242.2.17 Slide 17

  • Jesse’s case: differential diagnosis?
  • Consider: your brain is a black box.
  • Before Scott’s clarification (vs after →)
    1. Bronchogenic Carcinoma (35%)
    2. Recurrent Coccidioidomycosis (Valley Fever) (25%)
    3. Pulmonary Tuberculosis (15%)
    4. Pulmonary Vasculitis (e.g., Granulomatosis with Polyangiitis) (15%)
    5. Pulmonary Embolism with Infarction (10%)

242.2.18 Slide 18

  • Prompt Engineering:
  • Zero-shot prompting: just ask for the answer out of the blue. “The patient has heart failure and rales. Why is she short of breath?”
  • N-shot prompting: ask a sequence of questions and rely on in-context learning (strictly, “N-shot” means putting N worked examples in the prompt). “What’s your differential?” “Why do you think heart failure is most likely?” “What tests would you order?”
  • Chain-of-thought prompting: “The patient has xyz medical history and presents with shortness of breath. First, summarize the case, then provide an ordered differential diagnosis. Give pros and cons for each possibility. Then, provide recommendations on which test to order.”

242.2.19 Slide 19

  • Prompt engineering is hard
  • Act like a [Specify a role],
  • I need a [What do you need?],
  • you will [Enter a task],
  • in the process, you should [Enter details; often in a chain of steps],
  • please [Enter exclusion],
  • input the final result in a [Select a format],
  • here is an example: [Enter an example].
  • Act like an expert diagnostician,
  • I need a prioritized differential diagnosis that includes all of the likely diagnoses and all the diagnoses it would be important not to miss,
  • you will first generate a list of the most likely diagnoses, you will then order them from most to least likely, then you will give a probability for each,
  • in the process, you should consider all the most likely or dangerous diagnoses. Consider the pretest probability of each diagnosis, then modify it by the cumulative strength of evidence to get a final probability.
  • please prioritize accuracy,
  • input the final result in a list format, with percentage likelihoods and a short explanation,
  • Here is the case: …

242.2.20 Slide 20

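The fill-in-the-blank prompt structure from Slide 19 lends itself to a small helper that assembles the pieces in a fixed order. The sketch below is one possible way to do this; `build_prompt` and its parameter names are hypothetical, and the case text is left as a placeholder because the slide elides it.

```python
def build_prompt(role, need, task, details, exclusion, output_format,
                 example=None, case=None):
    """Assemble a structured prompt from the slots of the Slide 19 template."""
    lines = [
        f"Act like {role},",
        f"I need {need},",
        f"you will {task},",
        f"in the process, you should {details},",
        f"please {exclusion},",
        f"input the final result in {output_format},",
    ]
    if example is not None:
        lines.append(f"here is an example: {example}.")
    if case is not None:
        lines.append(f"Here is the case: {case}")
    return "\n".join(lines)

prompt = build_prompt(
    role="an expert diagnostician",
    need=("a prioritized differential diagnosis that includes all of the likely "
          "diagnoses and all the diagnoses it would be important not to miss"),
    task=("first generate a list of the most likely diagnoses, you will then order "
          "them from most to least likely, then you will give a probability for each"),
    details=("consider all the most likely or dangerous diagnoses. Consider the "
             "pretest probability of each diagnosis, then modify it by the "
             "cumulative strength of evidence to get a final probability"),
    exclusion="prioritize accuracy",
    output_format="a list format, with percentage likelihoods and a short explanation",
    case="[de-identified case text goes here]",  # placeholder; the slide omits the case
)
print(prompt)
```

Keeping the template in code makes it easy to reuse the same structure across cases and to change one slot at a time when comparing prompt variants.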
  • TODO: No text extracted from this slide.

242.2.21 Slide 21

  • Objective: to determine whether an LLM can transform discharge summaries into a format that is more readable and understandable.
  • → less accurate
  • Not a promising use

242.2.22 Slide 22

  • Clinical Research Informatics

242.2.23 Slide 23

  • Clinical Research Informatics
  • Goal: Predict who has hypercapnia
  • Current: use structured data like lab values in a logistic regression
  • Ideal: extract unstructured data like symptoms to use as predictors
  • ED Triage Notes
  • (Data must be present when the model runs)
  • Locally-deployed LLM
  • Possible tasks
  • Use as predictor in diagnostic model
  • Tasks:
  • “Did this patient present for confusion?”
  • “How likely is it this patient has confusion?”
  • “How likely is it this patient has hypercapnia?”
  • Evaluation:
  • How well does the LLM identify patients with confusion (vs. human reader)?
  • Requires annotation of ‘answers’
  • How much does the LLM improve hypercapnia predictions?
  • Does not require annotation

242.2.24 Slide 24

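The two evaluations listed for the hypercapnia project on Slide 23 can be sketched with standard metrics. The first compares LLM answers against human annotations (sensitivity and specificity), and the second compares discrimination of the prediction model with and without the LLM-derived predictor (area under the ROC curve). The choice of metrics is an assumption, since the slide does not name any, and every number below is hypothetical toy data.

```python
def sensitivity_specificity(llm_labels, human_labels):
    """Agreement of LLM yes/no answers with a human reader's annotations."""
    if len(llm_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(llm_labels, human_labels))
    tp = sum(1 for llm, human in pairs if llm and human)
    fn = sum(1 for llm, human in pairs if not llm and human)
    tn = sum(1 for llm, human in pairs if not llm and not human)
    fp = sum(1 for llm, human in pairs if llm and not human)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

def auc(scores, outcomes):
    """Probability that a random case scores higher than a random non-case."""
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Evaluation 1 (needs annotation): "Did this patient present for confusion?"
human = [True, True, True, False, False, False, False, True]   # hypothetical
llm = [True, True, False, False, False, True, False, True]     # hypothetical
print(sensitivity_specificity(llm, human))

# Evaluation 2 (no annotation needed): does the LLM feature improve prediction?
hypercapnia = [1, 1, 0, 0, 1, 0]                    # hypothetical outcomes
baseline_scores = [0.6, 0.4, 0.5, 0.3, 0.7, 0.45]   # structured data only
with_llm_scores = [0.8, 0.6, 0.4, 0.2, 0.9, 0.5]    # plus LLM-derived predictor
print(auc(baseline_scores, hypercapnia), auc(with_llm_scores, hypercapnia))
```

The second evaluation uses only the hypercapnia outcome that is already in the structured data, which is why it avoids the annotation step.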
  • Scientific Writing?
  • Hard and fraught…
  • Act like the editor for a popular science publisher.
  • I need feedback on how to make the following sentence easier to read and less ambiguous.
  • you will restate the main point of the following sentence (it’s OK to do this in several sentences). Then, give 5 reformulations of the sentence. Each should be in a different style, but all should emphasize clarity.
  • in the process, you should explain what changes you made,
  • please do not use jargon.
  • The sentence for you to rewrite is: “…”

242.2.25 Slide 25

  • Statistical Coding:

242.2.26 Slide 26

  • Questions?

242.3 Learning objectives

  • Large Language Medicine: LLMs and academic P/CCM
  • “Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
  • How (and why) do LLMs work?
  • Embeddings: ‘word vectors’
  • Transformer Models:

242.4 Bottom line / summary

  • Large Language Medicine: LLMs and academic P/CCM
  • “Any sufficiently advanced technology is indistinguishable from magic” Arthur C. Clarke
  • How (and why) do LLMs work?
  • Embeddings: ‘word vectors’
  • Transformer Models:

242.5 Approach

  1. TODO: Outline the initial assessment or decision point.
  2. TODO: Outline the next diagnostic or management step.
  3. TODO: Outline follow-up or escalation criteria.

242.6 Red flags / when to escalate

  • TODO: List red flags that require urgent escalation.

242.7 Common pitfalls

  • TODO: Capture common errors or missed steps.

242.8 References

TODO: Add landmark references or guideline citations.

242.9 Slides and assets

242.10 Source materials