Nahin Khan
Education
2023-2025
ETH Zurich
- Master of Science in Computational Biology and Bioinformatics
- GPA: 5.71 / 6.0
- Relevant Courses:
- Big Data, Probabilistic Artificial Intelligence, Applications of Deep Learning on Graphs, Computational Biology, Automated Software Testing, Large Language Models, Advanced Machine Learning, Image Analysis and Computer Vision, Current Topics in Biophysics
2016-2021
Carnegie Mellon University
- Bachelor of Science in Computer Science
- Bachelor of Science in Biological Sciences
- GPA: 3.94 / 4.0
- Dean’s List High Honors: Fall 2016 - Spring 2021
- Relevant Courses:
- Computer Networks, Computer Security, Natural Language Processing, Complexity Theory, Machine Learning (Coursera), Graph Theory, Molecular Biology, Immunology, Cell Biology
Work Experience
Sept 2024 - Present
Backend Engineer, IVIA Lab, ETH Zurich, Switzerland
- Implemented the backend logic for customized named entity recognition using LLMs
- Contributed ideas for increasing the efficiency of the named entity recognition pipeline
Jan 2022 - Aug 2023
Research Assistant, Qatar Computing Research Institute
- Contributed to bioinformatics-related projects
- Conducted literature searches to survey the current state of research
- Implemented methods and created pipelines to analyze data and gain biological insights
- Automated processes using Makefiles and common command-line tools (awk, sed, xargs, etc.)
- Trained CNN and BERT models to predict protein binding strength using TensorFlow and Hugging Face
- Wrote and submitted papers for publishing with various teams
Oct 2021 - Jan 2022
Full Stack Developer, sKora
- Set up development, staging, and production environments for product development on Linode
- Set up CI/CD pipelines for automated tests and deployments to Kubernetes clusters using GitHub Actions
- Implemented backend API using FastAPI and frontend using ReactJS
- Performed code reviews and gave feedback to team members
Dec 2020 - Jan 2021
Backend Engineer, sKora, Intern
- Scraped millions of records from a football transfer market website
- Populated internal databases after cleaning the data using MySQL-Python integrations
Aug - Dec 2019
Parallel Data Structures and Algorithms Teaching Assistant, CMU Pittsburgh
- Led weekly 50-minute recitation sessions teaching key concepts in SML to students
- Led review sessions to help prepare 250+ students for midterms and exams
Jan - May 2020
Ballroom Dancing Instructor, Carnegie Mellon University Qatar
- Co-instructed a Student-led Course (StuCo) designed to be an introduction to ballroom dancing
- Taught cha cha, waltz, foxtrot, tango, salsa and more
Skills
Coding
- Python, C, Bash, JavaScript, SML, HTML, CSS, MATLAB, R, x86 Assembly
Technologies / Environment
- Kubernetes, Docker, GitHub Actions, Django, FastAPI, MySQL, Nginx, ReactJS, Linux
Machine Learning
- PyTorch, PyTorch Geometric, TensorFlow, Keras, CNNs, GNNs, VAEs, Transformers, BERT models
Research Experience
Oct 2024 - Present
Multi-Modal Modelling of Time-series ICU Data, Biomedical Informatics Group, ETH Zurich
- Built foundation models for ICU data by performing masked pretraining on time-series data
- Incorporated multiple data modalities (vital-sign measurements and text)
Apr - Aug 2023
Genome-wide association study for ECG traits, QCRI
- Identified novel genetic regions that contribute to abnormal ECG patterns
Nov 2022 - Apr 2023
Improving Polygenic Risk Scores, QCRI
- Proposed methods for improving genetic risk scores when combining local population and published data
June - Nov 2022
Multiomics Project, Center for Precision Health and Medicine, QCRI
- Combined genomic and metabolomic data from 3000+ people with over 22,000 features
- Utilized multi-omic data to propose molecular mechanisms leading to coronary heart disease risk
- Proposed genetic and metabolomic risk factors to screen for coronary heart disease risk
Feb - Sept 2022
Antibody Project, Qatar Center for AI, Research Assistant
- Extracted binding strength data between antibody and antigen motif sequences using crystal structure data
- Trained machine learning models for predicting the strength of binding between antibodies and antigens
Jan - June 2020
Honors Thesis: A Bioinformatics Tool for Exploring RNA-Protein Interactions
- Developed a CLI tool that collects and visualizes RNA-Protein interactions (https://rnpfind.com)
- Developed features for studying correlation and overall-binding profiles of RNA binding proteins
- Built with Django and Docker; deployed and managed on Linode hosts
April - Oct 2019
Genetically Engineered Machine Competition, Boston, Team Bioinformatician
- Built a machine capable of detecting recessive genetic disease in carriers in half an hour
- Collaborated with various international teams on molecular modelling of Cpf1, gRNA, and template DNA
June - Aug 2017
Woolford Lab, Mellon College of Science, Pittsburgh, USA, Research Intern
- Investigated the role of Drs1 in ribosome assembly
- Created spotting assays of mutagenized yeast strains and isolated preribosomes for protein analysis
- Constructed models of Drs1 function in assembly
- Presented results at Meeting of the Minds Conference (2018)
Aug - May 2017
Phage Genomics Research Course, Carnegie Mellon University
- Sequenced DNA extracted from isolated phages using an Ion Torrent machine to obtain DNA strand sequences
- Performed computational assembly and annotated the sequenced DNA to generate a gene map
Projects
May 2024
Automated Software Testing of Optimization Solvers
- Implemented consistency and metamorphic testing on popular optimization solvers
- Verified internal consistency of Gurobi and CPLEX across mutated test cases
Dec 2023
Link Prediction on Knowledge Graphs
- Trained and evaluated an RGCN model on the FB15k-237 dataset
- Generated embedding representations for entities in a task-independent manner using contrastive learning
- Achieved mean reciprocal rank of 0.538
Dec 2023
Reinforcement Learning on a Pendulum
- Implemented an off-policy RL algorithm on the Pendulum environment
- Implemented Soft Actor-Critic (SAC) to directly predict Q-values and policy
- Used the SAC variant that automatically tunes the temperature
Nov 2023
Bayesian Optimization
- Optimized an unknown objective function that is time-consuming to evaluate
- Addressed an unknown constraint function defining regions of unsafe evaluation
- Implemented a variant of the GP-UCB algorithm
Nov 2023
Approximate Bayesian Inference in Deep Neural Networks
- Implemented Stochastic Weight Averaging Gaussian (SWAG)
- Trained a deep CNN with posterior distribution estimates for parameters using SWAG
- Implemented histogram binning and temperature scaling to improve calibration
Oct 2023
Weather Prediction with Uncertainty Estimates
- Used Gaussian Process Regressors (GPRs) to estimate the weather based on location
- Improved efficiency by k-means clustering and learning multiple GPRs for each cluster
- Evaluated the model with an asymmetric cost to model realistic city requirements
Oct 2023
Custom Message Passing Layers in GNN
- Investigated the effects of breaking permutation invariance in a GNN message passing layer
- Concluded that some datasets benefit from breaking invariance due to hidden structures in node ordering
March 2021
Mini BitTorrent
- Wrote a peer-to-peer program for sharing files between multiple hosts in C
- Designed and implemented a TCP-like protocol for sharing chunks of data (over UDP)
- Implemented congestion control and the sliding window algorithm
Feb 2021
HTTP 1.1 Compliant Server
- Wrote an RFC 2616-compliant web server in C, supporting GET, HEAD, and POST
- Integrated CGI support
May 2021
Question and Answer Model
- Wrote a program that generates questions from an article, and answers questions regarding the article
- Used word2vec word embeddings to find similarity between sentences in the article and the questions
- Utilized co-reference resolution to increase similarity matches
Nov 2020
Mailpile: Open-source Contribution
- Submitted a bugfix for an open-source mail client
- Resolved several issues by fixing their shared root cause: duplicate email address registration was allowed
- Fixed both front-end and back-end
Oct 2020
Dynamic Memory Allocator
- Wrote a dynamic memory allocator (malloc) in C with high utilization (69%) and throughput (14k KOPS)
- Utilized a segregated free list and engineered a custom region for small-sized allocation requests
Nov 2020
Automated Theorem Prover
- Developed a prover of theorems in intuitionistic logic written in SML and Prolog
Nov 2016
Python Chess A.I.
- Developed an AI chess program in Python for a course and shared it online
- Has been referenced by several repositories on GitHub and has attracted community attention
Open Source Contributions
May 2023
CMplot
- R package that allows users to generate various genomic plots
- Enhanced a particular type of plot by adding the option to view multiple x-axes
Apr 2023
TensorFlow, Missing Semester, langchain, Mephisto, cs228-notes, nginx-proxy, ukemi, bedtools2, Docker Docs, celery, Python Packaging User Guide (pypa)
- Documentation and textual fixes of tutorials / lectures
- Improvement in clarity of topics
Apr 2023
alpaca.cpp
- Fixed a bug that could cause a segfault on some architectures
Mar 2022
atomium
- A Python package useful for parsing PDB (3D molecular) files and manipulating 3D structural data
- Added a feature that allows the user to see the origin (organism) of each protein chain in the PDB file
- For example, an antibody-related PDB file often has human / mouse chains and viral chains together, which this PR makes viewable programmatically
May 2021
TLDR
- Useful tool for looking up cheatsheets for console commands
- Added a page that explained how “bfg”, a tool for removing large files or passwords from Git history, works
Awards and Honors
Oct 2019
Andrew Carnegie Scholar and Qatar Campus Scholar, Awardee
- Selected for showing high standards of academic excellence and leadership in the community
- Awarded to top 2 students out of ~140 during graduation
May 2019
Fifth Year Scholar ($80,000), Awardee
- CMU program that sponsors a student to study for an extra year after graduating
2017-2020
Qatar Foundation Scholarship ($200,000), Recipient
- Competitive merit-based full scholarship given to students in Education City