Senior Data Engineer

Location: Boston, Massachusetts

About Northeastern:

Founded in 1898, Northeastern is a global research university and the recognized leader in experience-driven lifelong learning. Our world-renowned experiential approach empowers our students, faculty, alumni, and partners to create impact far beyond the confines of discipline, degree, and campus.

Our locations—in Boston; Charlotte, North Carolina; London; Portland, Maine; San Francisco; Seattle; Silicon Valley; Toronto; Vancouver; and the Massachusetts communities of Burlington and Nahant—are nodes in our growing global university system. Through this network, we expand opportunities for flexible, student-centered learning and collaborative, solutions-focused research.

Northeastern’s comprehensive array of undergraduate and graduate programs, in a variety of on-campus and online formats, leads to degrees through the doctorate in nine colleges and schools. Among these, we offer more than 195 multi-discipline majors and degrees designed to prepare students for purposeful lives and careers.

 

About the Opportunity:

The Media Cloud project (http://mediacloud.org) is seeking a Senior Data Engineer to develop scalable text analysis pipelines, research and implement cutting-edge text classification approaches, and support and collaborate on academic research projects related to media attention, hate speech, and social media platforms. The Media Cloud platform is a set of online tools, and associated research methods, for monitoring and measuring online media.

Responsibilities:

In this grant-funded role, you will wear many hats: exploratory data scientist, text analysis expert, data pipeline engineer, research collaborator, product manager, and more. You will work closely with the principal investigators and a team of media researchers to research, prototype, and develop data analysis workflows that can scale from initial prototypes to corpora of millions of documents. Some of this will rely on skills you already have, but you will have to do significant work learning new skills and exploring cutting-edge supporting technologies and algorithms. This position provides an opportunity for someone to work on leading tools that support critical research into how social mobilization interacts with media and to help make Media Cloud more useful for researchers and non-profits trying to understand the role of media in democratic processes. We expect scholarly and popular press publications to come out of this research.

Given the conditions created by the ongoing pandemic, this position is open to part-time remote status. However, it does require being on site at Northeastern at regular intervals.

 

Primary Duties and Responsibilities

Keep up to date on research in data analysis architectures, text classification, hate speech detection, social media platform policies, machine learning, etc. to inform new functionalities in the tooling and research output.

Work with other team members to establish a technical vision for the project.

Contribute to research papers with planning, writing, and data needs.

Maintain, upgrade, and build new data pipelines with data from existing corpora, APIs, and other sources.

Write code that can scale systems to handle ever-expanding data requirements.

Engage in active collaboration and coordination with the cross-institution research team.

Support the data needs of related projects as required.

Provide budget, logistical, and HR inputs to support grant management.

Contribute to a healthy remote workplace and culture.

 

Qualifications:

Required:

College degree or other domain-specific accreditation, preferably in computer science, data science, or related fields.

2+ years of experience with cross-functional engineering teams.

5+ years of experience working with software and data in some capacity.

Programming fluency in Python and data-related libraries and tools (pandas, Jupyter notebooks, etc.).

Demonstrated ability to iterate quickly through prototypes.

Knowledge of large-scale data collection, processing, and storage systems.

Ability to work productively in a virtual environment with remote team members all over the world.

Interest in working on issues related to media coverage and hate speech, democracy, gender, race, or health.

 

Preferred:

Master’s degree or other domain-specific accreditation, preferably in computer science or data science related fields.

Prior experience as a senior software engineer or in product development leadership.

Hands-on experience with complex technical project management.

Prior work with online media ingestion and storage.

Interest in working in an academic research environment on projects with real-world impact.

 

 

Salary Grade:

12

Additional Information:

Please attach resume and cover letter.

Northeastern University is an equal opportunity employer, seeking to recruit and support a broadly diverse community of faculty and staff. Northeastern values and celebrates diversity in all its forms and strives to foster an inclusive culture built on respect that affirms inter-group relations and builds cohesion.

All qualified applicants are encouraged to apply and will receive consideration for employment without regard to race, religion, color, national origin, age, sex, sexual orientation, disability status, or any other characteristic protected by applicable law.

To learn more about Northeastern University’s commitment to and support of diversity and inclusion, please see www.northeastern.edu/diversity.

 


