Data Engineer

Updated: 20 days ago
Location: Princeton, New Jersey
Job Type: Full-Time

The Accelerator seeks a Data Engineer to work with team members to develop, deploy, and improve data-intensive applications and processes. As part of a small cross-functional team, this individual will participate in product design and iterative development to support the mission of powering policy-relevant research by building shared infrastructure.

 

As someone growing in their expertise, this individual usually plans and executes tasks requiring judgment, adapting standard techniques and sometimes creating new methods to solve problems. With enough experience to be confident in their abilities and a record of completed projects, they typically work independently, receiving instructions on expected outcomes, occasional technical guidance for uncommon issues, and supervisor approval before starting projects. They collaborate with others to resolve important questions and coordinate work, and may use advanced techniques.

 

A remote work arrangement within the United States may be considered for candidates with the appropriate background and experience. University-paid business travel to Princeton, NJ may be required approximately 2-4 times per year. 

 

The term of this appointment is 1 year, with the possibility of renewal based upon satisfactory performance and funding. 



Data Lake Design, Implementation and Maintenance

 

  • Using the latest cloud storage and processing techniques, help design and implement a data lake architecture that allows our project to store and process terabytes of data daily. Store data in Parquet tables using a common storage format, and enable processing with Python and Spark.
  • Help monitor performance and optimize storage and data-transfer bottlenecks for scale and cost-effectiveness. Help onboard new users and document processes on the platform.
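To illustrate the kind of layout work this duty involves, here is a minimal sketch of Hive-style date partitioning, the convention Spark uses to prune Parquet tables when scanning. The bucket, dataset name, and columns are hypothetical examples, not project specifics.

```python
# Build a Hive-style partitioned path (year=/month=/day=) so engines
# like Spark can skip irrelevant partitions when reading Parquet.
from datetime import date

def partition_path(root: str, dataset: str, event_date: date) -> str:
    """Return the storage prefix for one day's slice of a dataset."""
    return (f"{root}/{dataset}/"
            f"year={event_date.year}/month={event_date.month:02d}/"
            f"day={event_date.day:02d}")

print(partition_path("s3://lake", "events", date(2024, 3, 7)))
# s3://lake/events/year=2024/month=03/day=07
```

In practice a Spark writer would produce this layout via `partitionBy`; the helper above just makes the directory convention explicit.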

Data Science and Data Augmentation Analysis

  • With support, analyze data sets within the platform to identify useful insights and patterns through computational modeling with ML libraries such as PyTorch and TensorFlow. These insights will be incorporated into our product designs, and the conceptual model code will then be made ready for use in a production environment.
  • Help test, QA, and document data flows and model definitions to ensure that ongoing support is available.
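As a small, standard-library stand-in for the heavier ML modeling this duty describes, the sketch below flags anomalous values with a z-score; the data and threshold are hypothetical examples of an insight-extraction step.

```python
# Flag values that sit more than `threshold` standard deviations from
# the mean -- a simple pattern-detection step of the kind that a
# PyTorch/TensorFlow model would generalize in production.
from statistics import mean, stdev

def flag_outliers(values, threshold=2.0):
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

readings = [10, 11, 9, 10, 12, 10, 48]
print(flag_outliers(readings))  # [48]
```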

Cloud Based Data Processing

 

  • We plan to use services offered by our cloud-based partners to enhance and expand our data with the help of third-party data sources and insights. To support this effort, assist in creating and documenting processes, contribute insight on offerings from third-party providers, and help develop code that enables our platform to communicate with our partners through REST APIs and/or SDKs.
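A hedged sketch of the REST integration pattern this duty describes, using only the standard library: the endpoint URL, field names, and response shape are hypothetical, not those of any actual partner service.

```python
# Build an authenticated POST request for a (hypothetical) third-party
# enrichment API, and parse its JSON response.
import json
import urllib.request

API_URL = "https://api.example-partner.com/v1/enrich"  # hypothetical

def build_request(record: dict, api_key: str) -> urllib.request.Request:
    body = json.dumps({"record": record}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )

def parse_response(raw: bytes) -> dict:
    """Pull the enriched fields out of a JSON response body."""
    return json.loads(raw).get("enriched", {})

# Offline example of the parsing half:
enriched = parse_response(b'{"enriched": {"score": 0.9}}')
```

A partner SDK would replace `build_request` with its own client, but the request/parse split keeps the integration testable without network access.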

Data Ingestion Pipeline Development

  • Help create pipelines that pull data from various sources, such as websites and APIs, and transform it into our core storage format. Write these pipelines in Python, using tools such as Apache Spark and the data lake for efficient processing.
  • Assist in maintaining pipelines, ensuring that they run efficiently and on schedule. Perform data validation and data quality tasks with some supervision to ensure the accuracy of the data.
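The extract/transform/validate shape of such a pipeline can be sketched in a few lines of standard-library Python; the field names and the data-quality rule (positive, unique ids) are hypothetical examples of the checks described above.

```python
# Tiny extract -> transform -> validate pipeline over CSV input.
import csv
import io

def transform(row: dict) -> dict:
    """Normalize one raw record into the core storage format."""
    return {"id": int(row["id"]), "name": row["name"].strip().lower()}

def validate(rows):
    """Data-quality gate: every id must be positive and unique."""
    seen = set()
    for r in rows:
        assert r["id"] > 0 and r["id"] not in seen, f"bad row: {r}"
        seen.add(r["id"])
    return rows

raw = "id,name\n1, Alice \n2,Bob\n"
rows = validate([transform(r) for r in csv.DictReader(io.StringIO(raw))])
print(rows)  # [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]
```

At scale, the same transform/validate functions would run inside a Spark job writing to the data lake rather than over an in-memory list.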

Web-Based Crawler Development

  • Develop, with supervision, web crawlers to extract HTML from websites using tools such as Python, BeautifulSoup, Selenium, or similar. The crawlers should work on both plain HTML and AJAX-based websites. Provide ongoing support and maintenance for the crawlers, under supervision, to ensure they continue to function properly.
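A minimal extraction sketch using only the standard library's `html.parser`; in practice BeautifulSoup would handle messy real-world HTML and Selenium would render AJAX pages first. The sample markup is hypothetical.

```python
# Collect all hyperlink targets from an HTML document.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every anchor tag that has one.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

page = '<html><body><a href="/a">A</a><a href="/b">B</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/a', '/b']
```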


Essential Qualifications:

  • A combination of relevant internship or work experience and education equal to 1-3 years of relevant experience, with a record of accomplishment.
  • Proficiency in Python.
  • Experience with distributed systems.
  • Strong knowledge of data storage technologies.
  • Familiarity with relational databases and Elasticsearch.
  • Experience tuning data systems for performance and reliability.
  • Development experience with PyTorch and TensorFlow on both CPU and GPU targets.
  • Knowledge of text processing and image processing techniques.
  • Experience extracting data from APIs.
  • Education: Bachelor's degree or equivalent work-related experience.

Preferred Qualifications:

  • Experience in data lakes and data mesh architectures
  • Experience with web scraping

We at the School of Public and International Affairs believe that it is vital to cultivate an environment that embraces and promotes diversity, equity and inclusion - fundamental to the success of our education and research mission. This commitment to diversity informs our efforts in recruitment and hiring as we actively seek colleagues of exceptional ability who represent a broad range of viewpoints, experiences and value systems, and who share Princeton University's dedication to excellence.

 

Princeton University is an Equal Opportunity/Affirmative Action Employer and all qualified applicants will receive consideration for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity or expression, national origin, disability status, protected veteran status, or any other characteristic protected by law. KNOW YOUR RIGHTS




Princeton University job offers are contingent upon the candidate’s successful completion of a background check, reference checks, and pre-employment screening, as applicable.