AI Research Internship Hunt as a CS PhD Student
Tips and thoughts from my relatively successful summer research internship hunt (as a 3rd-year Computer Science PhD student)
Introduction
My Fall 2023 semester was challenging because I had too much on my plate, including (1) my PhD research LexC-Gen (Yong et al., 2024), (2) my voluntary involvement in making the Aya model safer (Üstün et al., 2024), (3) making SEA languages better represented (being a Malay language ambassador for Aya (Singh et al., 2024), helping out with the SEACrowd project, and hosting an AACL tutorial (Aji et al., 2023)), (4) an NIH grant project that funds my PhD, (5) academic responsibilities such as classes, and (6) my summer research internship search. Thankfully, all went well, and I even managed to squeeze out time to travel internationally back home to Malaysia to see my family.
It was quite stressful applying for research internships because there weren’t many resources online on how to apply and prepare for them. Furthermore, given how English-centric and increasingly closed-source AI development has become, I was quite uncertain at the beginning if my search for internships was a good use of my time. I could have spent this coming summer working on my multilingual NLP research agenda, instead of an internship that wouldn’t be aligned with my research direction or wouldn’t allow me to publish papers.
Fortunately, I received multiple offers and chose to spend my upcoming summer interning at FAIR (Meta AI). I hope my post sheds some light on the AI research internship application process. I have also received a lot of help and great advice during this process, so I hope to share it with more people.
General Process and Timeline
In general, the AI research internship process is as follows:
submit applications: This usually means uploading your CV to the open positions. For some companies, you should not submit applications if you are getting referred for certain positions. You should check with the person who refers you whether you need to apply by yourself. I use a two-page CV following this template (without photo).
online assessment or take-home assignments. This typically includes machine learning (ML) or deep learning (DL) coding questions, or data-structure-and-algorithms (DS&A) coding questions. Some also asked multiple-choice questions about ML/DL knowledge.
(multiple) technical interviews with engineers or researchers, which include coding for DS&A (similar to questions on Leetcode) or ML, answering questions about LLMs and deep learning, and presenting my research work.
interviews with the hiring manager, which typically involve talking about my research interests and/or answering common behavioral questions. In my opinion, this is more casual, and I get to know more about the team and projects I will be joining.
Typically, before the online assessments or interviews, recruiters would reach out to inform me of the type of interviews and who I would be interviewing with. Different companies have different structures. For instance, interviews could run from 30 minutes up to 90 minutes. Some interviews were back-to-back, and some were spaced one or more weeks apart. Some companies had technical components in all their interviews, whereas some only had one.
For most companies, it took me 2 to 3 months to go from applying to receiving the internship offer decision. Usually the recruiter would reach out to schedule a call and inform me of the offer details.
Preparation
Here, I will share how I prepared for my interviews after applying. In my experience, surprisingly, there weren’t many overlapping questions asked by different companies during interviews.
A. Data Structures and Algorithms (DS&A)
I used Blind 75 and Neetcode 150 to brush up on my DS&A skills.
For certain companies, I would use the company tag to filter and practice specific DS&A questions on Leetcode.
B. Machine Learning (ML) / Deep Learning (DL)
I reimplemented some basic ML algorithms referring to this repository.
I didn’t revise much for knowledge questions, especially for DL, because my research uses those concepts on a regular basis. I did, however, refresh my memory of common ML concepts such as bias vs. variance, SVMs, precision vs. recall, ROC curves and AUC, etc.
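As an illustration of the kind of refresher mentioned above, here is a minimal sketch of precision vs. recall for binary classification. The function name `precision_recall` and the toy labels are hypothetical, chosen only for this example; they are not taken from any interview or from the repository referenced earlier.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Precision: of everything predicted positive, how much was correct?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much did we find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example (made-up labels): 2 true positives, 2 false positives, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Returning 0.0 when a denominator is zero is one common convention; some libraries raise a warning instead.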
C. Large Language Models (LLMs)
For positions that are related to GenAI, knowledge of LLMs is expected.
I referred to multiple sources, such as the original Transformer paper (“Attention is all you need”) and this blog, to understand how attention works implementation-wise and equation-wise.
I also relied on HuggingFace tutorials and documentation resources as a refresher for everything around LLMs, such as tokenizers, efficient training, etc.
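To make the "implementation-wise and equation-wise" part concrete, here is a minimal sketch of single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, as defined in the Transformer paper. It uses plain Python lists rather than a tensor library so that it runs without dependencies; the function names are my own for illustration, and masking, batching, and multiple heads are deliberately left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Q: n_q x d_k, K: n_k x d_k, V: n_k x d_v (lists of lists). Returns n_q x d_v."""
    d_k = len(K[0])
    d_v = len(V[0])
    outputs = []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        # Attention weights over the keys sum to 1.
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)])
    return outputs


# Toy example: a zero query attends equally to both keys,
# so the output is the mean of the two value vectors.
Q = [[0.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(scaled_dot_product_attention(Q, K, V))
```

In practice this is written with batched matrix multiplications in a framework such as PyTorch or JAX; the loop form here is only meant to mirror the equation term by term.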
D. Behavioral Interview
I used this guide to prepare for the behavioral interview questions.
In my interview, I focused on my unique collaborative research experience at BigScience and Aya. My understanding is that companies are looking for interns who could fully own a project and communicate task items effectively.
E. Research Interview
Usually it took around 20 minutes (or less) to present my work. Some companies even asked me to prepare beforehand to present a paper that I didn’t author.
Most of the time, I didn’t have to prepare slides beforehand unless instructed to do so by email. A format that I used when I presented my work verbally is as follows:
What is the general problem I am solving?
How are people currently solving the problem, and what are the shortcomings of their approach?
Describe my proposed solution in one sentence. (If possible, describe the motivation for such a solution.)
Describe the significance of my work in one sentence.
Then, go into details about my proposed solution, experimental setup, and the results.
The key to verbal research presentation is to keep it brief and hold the interviewer’s attention. Use keywords and short sentences. Allow time for Q&A and the flexibility for the interviewer to stop you midway (so it becomes more of a discussion session).
If I were to present with slides, I would go with how research work is usually presented in conferences.[1]
Personal Thoughts
Here are my random notes about my applications for summer research internships.
Startups (such as Scale.ai) open their applications in early Fall, whereas larger tech companies (such as Google, Amazon, and Meta) open their applications much later. This is useful to know because oftentimes I thought that I was ghosted, and the scarcity mindset kicked in and stirred up more stress. In fact, for companies such as Google or Meta, the interviews only start to take place in December or even later.
Referrals help. I’ve gotten four referrals, and all of them helped me move past the resume screening stage. On the other hand, cold-emailing researchers has had mixed results. I’ve sent emails to researchers whom I want to work with at more than 10 companies, but only one (outside the US) replied and interviewed me.
Twitter (𝕏) hasn’t helped me as much as I thought. I was ghosted when I applied through Google Forms shared by researchers on 𝕏 or when I reached out through direct messaging. I suppose job application-wise, Twitter is just another LinkedIn. However, sending emails after reading researchers’ Tweets about open positions did result in one person being kind enough to respond and refer me (although the reply success rate is very low). Nonetheless, one positive thing about 𝕏 is that it lets me know when certain companies have opened their positions.
Interviews have only focused on one or two recently published works. I originally thought the interviews would cover most of my past work, but all interviewers only asked me to describe my most recent work. In some interviews, I even had the freedom to choose which work to present. This leads me to believe less in becoming a paper-publishing machine during my PhD study. I think this is noteworthy because my experience contradicts claims that companies care a lot about paper count.[2]
Companies prioritize PhD students at a later stage of their education. I did not hear back from any applications in my first two years of study, and this year, I got multiple interviews and offers. My experience corroborates others’ experience that, unless your school has a direct collaboration with companies, research internships are usually offered to students at a later stage of their study.[3] In fact, many applications specify the preferred PhD graduation date of interns: they are looking for students in the third or fourth year of their study.
Many companies aren’t doing multilingual NLP work, but I think they value my GenAI safety and LLM training/adaptation experience. I would say that fewer than 3 of my internship offers are about multilingual NLP, and yet it seems that the rest of the companies value my prior experience in low-resource language adaptation of LLMs (e.g., it can be applied to coding languages) and jailbreaking safety guardrails (e.g., it can help design better guardrails or responsible practices for LLMs). Surprisingly, even though I’ve done research on evaluating cross-lingual ability, my interviews did not touch on anything related to evaluation, even though evaluation of LLMs is hyped up these days. Perhaps they are more interested in particular types of evaluation that are underdeveloped, such as for tool usage, coding, etc.
Concluding Thoughts
I am extremely thankful to the people who have referred me to the internship positions and given me great tips and advice. I hope that my post helps those who are now actively looking for research internships. I also want to highlight that whether you get research internships or not is not reflective of your quality as a researcher. In my opinion, some interviews did not capture that quality,[4] and offers are highly dependent on team matching, headcount, and many factors outside your control.
Do you have similar internship search experiences? Anything you want to learn more about? Feel free to leave a comment below and let me know what you think.
For further job market perspectives, check out the Interconnects post and the NLP News post.
[1] I believe there are a lot of resources and recordings online, so I won’t go into details here.
[2] This might not be true: according to one critique of the AI field, there’s a hiring bar on the number of papers and h-index, and I might have already passed that bar. (See my discussion on Twitter.) Of course, it could also be that I have more papers published under my belt this year.
[4] Funnily enough, one interviewer told me that there’s no point in a DS&A interview when GPT-4 can probably solve the questions better than interviewees, but he still had to give it to me because of company policy.
I need your perspective on my situation. I too wish to get into a research assistant role, but I have neither a PhD nor many papers. I do, however, have 6 years of industry experience in the deep learning field.
What can I do to assure the interviewer that I have the knowledge if I don't wish to pursue a PhD?
What else should I add to my interview prep, given this background?
> Some companies even asked me to prepare beforehand to present a paper that I didn’t author.
Like, you're given someone else's research paper, and you have to understand it, prepare it, and explain it to the panel?