101 AI For Housestaff
101.1 Summary
- ARTIFICIAL INTELLIGENCE FOR HOUSESTAFF
- Goal 1: Demystify how LLMs (etc.) work
- Logistic Regression
- Neural Networks
- Embeddings: represent the meaning of text with numbers (‘word vectors’)
- Transformer Models:
- What is a large-language model (LLM)?
- What does this enable?
- Transfer learning: (this is the secret sauce)
- Review: How (and why) do LLMs work?
- Mundane Stuff
- GPT 3-4 vs physicians on their boards (Israel)
101.2 Slide outline
101.2.1 Slide 1
- ARTIFICIAL INTELLIGENCE FOR HOUSESTAFF
### Slide 2
- Goal 1: Demystify how LLMs (etc.) work
- Goal 2: Current uses
- Goal 3: Pitfalls
- BSc in computer science (2009).
- (Twitter)
- Disclosure: I claim equity in Mountain Biometrics, Inc., a local startup using machine learning on continuous sensor data to detect decompensation and withdrawal in lightly monitored settings.
### Slide 3
- Logistic Regression
- man
- woman
- What is machine learning?
- Need to know if M/F (labels)
- Linear Regression
### Slide 4
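Returning to the logistic regression example from Slide 3 (man vs. woman, learned from labels): the sketch below shows, in plain Python, what "learning from labeled data" can look like. It is only an illustration. The heights and labels are invented, and the names `sigmoid`, `train`, and `predict` are hypothetical helpers, not anything from the talk.

```python
import math

# Toy labeled data (invented for illustration): height in cm -> label.
# Label 1 = man, 0 = woman. Real data would be far noisier.
heights = [150, 155, 160, 165, 170, 175, 180, 185]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(xs, ys, lr=0.1, epochs=2000):
    """Fit a one-feature logistic regression by gradient descent."""
    mean = sum(xs) / len(xs)
    scale = max(xs) - min(xs)
    w, b = 0.0, 0.0  # the two parameters the model learns
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            z = (x - mean) / scale          # center and scale the feature
            p = sigmoid(w * z + b)          # predicted probability of label 1
            grad_w += (p - y) * z           # gradient of the log-loss
            grad_b += (p - y)
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b, mean, scale


def predict(x, params):
    """Return the predicted probability that label = 1 for height x."""
    w, b, mean, scale = params
    return sigmoid(w * (x - mean) / scale + b)


params = train(heights, labels)
print(round(predict(185, params), 2), round(predict(150, params), 2))
```

The labels are what make this supervised learning: without knowing M/F for each training example, there is nothing to compute the error against.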
- Neural Networks
- Perceptron
- Parameters
- https://mlu-explain.github.io/
### Slide 5
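To make the perceptron and "parameters" bullets from Slide 4 concrete, here is a minimal sketch of a single perceptron learning the logical AND of two inputs. The parameters are just two weights and a bias. The function names and the choice of the AND task are assumptions made for illustration.

```python
def step(z):
    """Fire (1) if the weighted sum is positive, otherwise stay silent (0)."""
    return 1 if z > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron rule: nudge the parameters after each mistake."""
    w = [0.0, 0.0]  # one weight per input
    b = 0.0         # bias
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = step(w[0] * x1 + w[1] * x2 + b)
            error = target - pred           # 0 if correct, +1 or -1 if wrong
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b


# Truth table for logical AND: output is 1 only when both inputs are 1.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_GATE)
predictions = [step(weights[0] * x1 + weights[1] * x2 + bias)
               for (x1, x2), _ in AND_GATE]
print(weights, bias, predictions)
```

A neural network stacks many such units in layers; a large model simply has vastly more weights and biases to adjust.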
- Embeddings: represent the meaning of text with numbers (‘word vectors’)
### Slide 6
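As a rough illustration of the embeddings idea from Slide 5, the sketch below compares word vectors with cosine similarity. The three-number vectors are made up by hand for this example; real embeddings are learned from data and typically have hundreds or thousands of dimensions.

```python
import math

# Hand-made toy vectors (NOT real embeddings), chosen so that the two
# breathing-related terms point in a similar direction.
word_vectors = {
    "dyspnea": [0.90, 0.80, 0.10],
    "breathless": [0.85, 0.75, 0.20],
    "fracture": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


similar = cosine_similarity(word_vectors["dyspnea"], word_vectors["breathless"])
different = cosine_similarity(word_vectors["dyspnea"], word_vectors["fracture"])
print(round(similar, 3), round(different, 3))
```

Words with related meanings end up close together in this numeric space, which is what lets a model treat "dyspnea" and "breathless" as near-synonyms without being told so.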
- Transformer Models:
- Transformer: a variant of neural network that adds attention
- Context window: how big the attention frame is
- Context window size and overall model size are the main drivers of performance.
### Slide 7
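The attention mechanism from Slide 6 can be sketched as scaled dot-product attention in plain Python. This is a simplified illustration: the token vectors are invented, and a real transformer adds learned projections, multiple heads, and masking, none of which are shown. In this sketch the "context window" is simply how many tokens are passed in at once.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over every token in the window."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How relevant is each token in the window to this query token?
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those relevance weights.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy token vectors (invented); the context window here is 3 tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn_weights = attention(tokens, tokens, tokens)
for row in attn_weights:
    print([round(w, 2) for w in row])
```

Each row of weights shows how much one token "looks at" every other token, which is how the model resolves references to earlier text.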
- What is a large-language model (LLM)?
- Foundational model trained on an enormous amount of text
### Slide 8
- What does this enable?
- Parallel processing on GPUs (instead of CPUs)
- You can train HUGE models on HUGE data [e.g. the entire internet]
- predict the next token [~word]
- See if you were right
- re-weight parameters to improve
- repeat billions of times
### Slide 9
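The training loop listed on Slide 8 (predict the next token, see if you were right, re-weight parameters, repeat) can be sketched with a deliberately tiny model. The assumption here is a lookup table of scores indexed by the previous word, standing in for a transformer with billions of parameters. The training sentence is invented.

```python
import math

# Tiny invented corpus, split into word-level tokens.
text = "the patient has a cough the patient has a fever".split()
vocab = sorted(set(text))
index = {tok: i for i, tok in enumerate(vocab)}
V = len(vocab)

# Parameters: one row of scores (logits) per previous token.
logits = [[0.0] * V for _ in range(V)]


def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=200, lr=0.5):
    for _ in range(steps):                           # repeat
        for prev, nxt in zip(text, text[1:]):
            row = logits[index[prev]]
            probs = softmax(row)                     # predict the next token
            for j in range(V):
                target = 1.0 if j == index[nxt] else 0.0   # see if you were right
                row[j] -= lr * (probs[j] - target)   # re-weight parameters


def predict_next(token):
    """Return the most likely next token after `token`."""
    row = logits[index[token]]
    return vocab[row.index(max(row))]


train()
print(predict_next("the"), predict_next("has"))
```

No one labeled this text: the "answer" for each prediction is just the next word already in the corpus, which is why this kind of training scales to enormous data sets.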
- Transfer learning: (this is the secret sauce)
- Previously: define a task, label a data set to train the model on (expensive), and then evaluate its performance. Repeat. $$$
- limited by the amount of annotated data.
- Now: “pre-train” (unsupervised learning) on a huge amount of data to learn general patterns; the result is termed a foundational model. Do this once ($$$)
- Can do tasks not in the training data (actual learning)
### Slide 10
- Review: How (and why) do LLMs work?
- They are statistical machine learning algorithms called transformers (work very well on text)
- Flexible machine-learning method built on neural networks
- Attention mechanism allows it to handle references (context window)
- Images, time-series, etc. require modified algorithms.
- They learn patterns of co-occurrence in massive data sets and numerical representations of meaning.
- It turns out this works well enough to apply to novel problems (transfer learning)
### Slide 11
- Mundane Stuff
### Slide 12
- TODO: No text extracted from this slide.
### Slide 13
- TODO: No text extracted from this slide.
### Slide 14
- TODO: No text extracted from this slide.
### Slide 15
- GPT 3-4 vs physicians on their boards (Israel)
### Slide 16
- Are we going to be out of a job?
- Pulm/CC Case conference for recurrent hospitalizations with hemoptysis
- Before vs after clarifying a recent pulmonary vein ablation
- Bronchogenic Carcinoma (35%)
- Recurrent Coccidioidomycosis (Valley Fever) (25%)
- Pulmonary Tuberculosis (15%)
- Pulmonary Vasculitis (e.g., Granulomatosis with Polyangiitis) (15%)
- Pulmonary Embolism with Infarction (10%)
### Slide 17
- TODO: No text extracted from this slide.
### Slide 18
- Objective: To determine whether an LLM can transform discharge summaries into a format that is more readable and understandable.
- → less accurate
- Not a promising use
### Slide 19
- TODO: No text extracted from this slide.
### Slide 20
- Statistical Coding:
### Slide 21
- Clinical Research Informatics
### Slide 22
- Clinical Research Informatics
- Goal: Predict who has hypercapnia
- Current: use structured data like lab values in a logistic regression
- Ideal: extract unstructured data like symptoms to use as predictors
- ED Triage Notes
- (Data must be present when the model runs)
- Locally-deployed LLM
- Possible tasks
- Use as predictor in diagnostic model
- Tasks:
- “Did this patient present for confusion?”
- “How likely is it this patient has confusion?”
- “How likely is it this patient has hypercapnia?”
- Evaluation:
- How well does the LLM identify patients with confusion (vs. human reader)
- Requires annotation of ‘answers’
- How much does the LLM improve hypercapnia predictions?
- Does not require annotation
### Slide 23
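For the first evaluation question on Slide 22 (how well the LLM identifies confusion compared with a human reader), a minimal sketch is to score the LLM's yes/no answers against the human annotations. The labels below are invented, and `sensitivity_specificity` is a hypothetical helper name. The second question (how much the LLM improves hypercapnia predictions) would instead compare the diagnostic model with and without the LLM-derived predictor, which is not shown here.

```python
def sensitivity_specificity(human, llm):
    """Score LLM answers against human annotations (1 = confusion present)."""
    tp = sum(1 for h, m in zip(human, llm) if h == 1 and m == 1)
    fn = sum(1 for h, m in zip(human, llm) if h == 1 and m == 0)
    tn = sum(1 for h, m in zip(human, llm) if h == 0 and m == 0)
    fp = sum(1 for h, m in zip(human, llm) if h == 0 and m == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented example: 8 triage notes, annotated by a human reader and by the LLM.
human_labels = [1, 1, 1, 0, 0, 0, 0, 1]
llm_labels = [1, 1, 0, 0, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(human_labels, llm_labels)
print(sens, spec)  # 0.75 0.75
```

This is the step that requires annotated "answers": without the human labels there is no reference standard to score against.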
- Scientific Writing?
- Act like the editor for a popular science publisher.
- I need feedback on how to make the following sentence easier to read and less ambiguous.
- you will restate the main point of the following sentence (it’s OK to do this in several sentences). Then, give 5 reformulations of the sentence. Each should be in a different style, but all should emphasize clarity.
- in the process, you should explain what changes you made,
- please do not use jargon.
- The sentence for you to rewrite is… “
### Slide 24
- TODO: No text extracted from this slide.
### Slide 25
- Clinical Reasoning
### Slide 26
- (Google’s regular LLM)
- Differential Diagnosis
### Slide 27
- Fundamental Theory of Informatics… but is it true?
### Slide 28
- TODO: No text extracted from this slide.
### Slide 29
- TODO: No text extracted from this slide.
### Slide 30
- TODO: No text extracted from this slide.
### Slide 31
- Problem: PHI cannot go to these companies
- Leaves protected environment:
- need a Business Associate Agreement.
- Real issue of consent
### Slide 32
- How deidentified is deidentified enough?
### Slide 33
- Local deployments
### Slide 34
- Prompt Engineering:
- Zero-shot prompting: just ask for the answer out of the blue. “The patient has heart failure and rales. Why is she short of breath?”
- N-shot prompting: give the model N worked examples, or build up context over a sequence of questions, and rely on in-context learning. “What’s your differential?” “Why do you think heart failure is most likely?” “What tests would you order?”
- Chain-of-thought prompting: “The patient has xyz medical history and presents with shortness of breath. First, summarize the case, then provide an ordered differential diagnosis. Give pros and cons for each possibility. Then, provide recommendations on which test to order.”
### Slide 35
- Prompt engineering is hard
- Act like a [Specify a role],
- I need a [What do you need?],
- you will [Enter a task],
- in the process, you should [Enter details; often in a chain of steps],
- please [Enter exclusion],
- input the final result in a [Select a format],
- here is an example: [Enter an example].
- Act like an expert diagnostician,
- I need a prioritized differential diagnosis that includes all of the likely diagnoses and all the diagnoses it would be important not to miss,
- you will first generate a list of the most likely diagnoses, you will then order them from most to least likely, then you will give a probability for each,
- in the process, you should consider all the most likely or dangerous diagnoses. Consider the pretest probability of each diagnosis, then modify it by the cumulative strength of evidence to get a final probability.
- please prioritize accuracy,
- input the final result in a list format, with percentage likelihoods and a short explanation,
- Here is the case: …
### Slide 36
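The fill-in-the-blank prompt structure from Slide 35 lends itself to a reusable template. The sketch below is one possible way to do that in Python. The names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, the case text is a placeholder rather than a real case, and nothing here calls any particular LLM service.

```python
# Each {slot} mirrors one bracketed field from the slide's template.
PROMPT_TEMPLATE = (
    "Act like {role},\n"
    "I need {need},\n"
    "you will {task},\n"
    "in the process, you should {details},\n"
    "please {exclusion},\n"
    "input the final result in {output_format}.\n"
    "Here is the case: {case}"
)


def build_prompt(role, need, task, details, exclusion, output_format, case):
    """Fill the template so every prompt has the same predictable structure."""
    return PROMPT_TEMPLATE.format(
        role=role,
        need=need,
        task=task,
        details=details,
        exclusion=exclusion,
        output_format=output_format,
        case=case,
    )


prompt = build_prompt(
    role="an expert diagnostician",
    need="a prioritized differential diagnosis",
    task="list the most likely diagnoses, order them, then give a probability for each",
    details="consider the pretest probability of each diagnosis, then update it on the evidence",
    exclusion="prioritize accuracy",
    output_format="a list format, with percentage likelihoods and a short explanation",
    case="[de-identified case text goes here]",
)
print(prompt)
```

Keeping the structure in one place makes it easier to change a single element (for example, the output format) and compare results across otherwise identical prompts.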
- Summary?
- You should probably be using these tools
- It takes skill to use them well
- They are not infallible and raise tricky issues
- For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them.
- -Socrates, on writing
### Slide 37
- TODO: No text extracted from this slide.
101.3 Learning objectives
- Goal 1: Demystify how LLMs (etc.) work
- Goal 2: Current uses
- Goal 3: Pitfalls
101.4 Bottom line / summary
- You should probably be using these tools
- It takes skill to use them well
- They are not infallible and raise tricky issues
101.5 Approach
- TODO: Outline the initial assessment or decision point.
- TODO: Outline the next diagnostic or management step.
- TODO: Outline follow-up or escalation criteria.
101.6 Red flags / when to escalate
- TODO: List red flags that require urgent escalation.
101.7 Common pitfalls
- TODO: Capture common errors or missed steps.
101.8 References
TODO: Add landmark references or guideline citations.