RESEARCH ASST II (TEMP)

Updated: 13 days ago
Location: Ann Arbor, MICHIGAN

The U-M School of Nursing, Applied Biostatistics Lab (ABL) seeks a student interested in contributing to a summer interdisciplinary research project at the intersection of translational research and data science. The goal is to develop a large language model (LLM) that extracts text and numeric data from PubMed Central full-text articles. The immediate application is a model of cardiovascular disease risk among individuals with prediabetes.

  • Develop and Fine-tune a Large Language Model (LLM): Utilize your expertise in natural language processing (NLP) and machine learning to fine-tune a robust LLM that extracts data on cardiovascular disease risk factors from the PubMed Central corpus of full-text articles.
  • Data Preprocessing and Annotation: Collaborate with the research team to preprocess existing raw textual data and annotate key information relevant to the study objectives. Implement strategies for data cleaning, normalization, and standardization to ensure the accuracy and consistency of extracted data.
  • Algorithm Implementation and Optimization: Explore prompt engineering to enhance the performance and scalability of the LLM and to optimize its predictive accuracy.
  • Model Evaluation and Validation: Conduct thorough evaluations of the developed or fine-tuned LLM to assess its performance against predefined metrics and benchmarks. Validate the model outputs through rigorous testing and comparison with ground-truth data, identifying areas for improvement and refinement.
  • Documentation and Reporting: Maintain comprehensive documentation of the LLM development process, including codebase, experimental setup, and results analysis. Generate clear and concise reports summarizing key findings, challenges encountered, and future directions for research.
  • Collaboration and Communication: Collaborate effectively with interdisciplinary team members, including biostatisticians, computer scientists, and domain experts, to ensure alignment with research goals and objectives. Communicate progress updates, findings, and insights in regular team meetings and presentations.

