Postdoctoral Scholar - Data Science – Berkeley Institute for Data Science (BIDS)

We are no longer accepting applications for this recruitment

Description

The Berkeley Institute for Data Science (BIDS) is seeking two postdoctoral scholars to work on the research and development activities funded by the grant titled “Project Jupyter: Computational Narratives as the Engine of Collaborative Data Science”. The Jupyter and IPython Projects are a set of open-source software tools for interactive and exploratory computing, developed at UC Berkeley, Cal Poly, Simula Research Lab and Southampton University, and by industry partners. These software projects support reproducible and collaborative scientific computing and data science across a wide range of programming languages (Python, Julia, R, etc.). The main application offered by Project Jupyter is the Jupyter Notebook (https://try.jupyter.org), a web-based, interactive computing platform that allows users to perform data cleaning, data analysis, statistical modelling, numerical simulation and data visualization. We have over 2 million users worldwide across a wide array of technical fields.

These positions will contribute to the design and implementation of the project’s open-source software in an agile and highly collaborative GitHub-based workflow. This work will include supporting existing features as well as developing new features related to the frontend user interface, collaboration capabilities, documentation, deployment, testing infrastructure, etc. You will also help us build the user/developer community by attending and speaking at local and international conferences, hackathons and workshops in scientific computing and data science. We are looking for candidates with extensive frontend and/or backend software engineering experience and significant open-source experience.

The postdoctoral scholars will work with the project PIs and the rest of the project team on the main research targets of the project, covering three main areas (full details can be found in the project’s grant proposal, available here):

  1. Interactive Computing
    a. Notebooks as interactive applications: expose a view of a notebook that presents it as a responsive application based on selective hiding of code and inputs from metadata annotations.
    b. Modular, reusable UI/UX: refactor the User Interface tools in the project to support easier reuse and composition in different application contexts.
    c. Software engineering with notebooks: support the development of reusable code libraries in the interactive notebook environment.

  2. Computational Narratives
    a. nbconvert: tools for importing and exporting notebooks from and to other document formats, as part of scientific publishing pipelines.
    b. Element filtering: selective display of notebook content depending on usage context.
    c. Documentation: improved project documentation as well as novel tools for documenting scientific computing systems based on the Jupyter Notebook system.

  3. Collaboration
    a. Real-time collaboration: continue developing a generic architecture for real-time synchronization of notebook content that can be used with multiple backends.
    b. JupyterHub: extend the current JupyterHub multiuser design to collaboration models beyond the basic Unix file-permission model.

Duties and responsibilities
● Work with the project PIs and project team on research and implementation of code prototypes that will target the questions listed above.
● Develop and deploy selected prototypes as open source software products in the Project Jupyter organization, following the project’s standards of practice.
● Actively participate in the project’s community, mailing lists, weekly developer meetings and other project-related events.
● Review pull requests and discuss/resolve issues on GitHub.
● Engage the community of Jupyter users by teaching workshops and presenting educational talks about the project.
● Present the results of the research at national and international conferences and co-author scholarly publications.

A PhD or equivalent degree in a relevant field is required by the start date. At the time of application, the basic minimum qualification is the completion of all degree requirements except the dissertation.

Additional qualifications required by start date:
● Experience contributing to open source projects and familiarity with best practices regarding code development, testing, documentation, review and community participation.
● Experience in data science.
● Advanced-level knowledge of the Python programming language.
● Ability to communicate clearly and effectively in spoken and written English.
● Availability to travel to project-related conferences, meetings and workshops 4-6 weeks per year, both in the United States and internationally.

Additional preferred qualifications:
● Knowledge of and contribution to the various projects in the PyData ecosystem (Jupyter, IPython, NumPy, SciPy, Matplotlib, Pandas, Seaborn, sklearn, etc.).
● Experience as a presenter at conferences for data science and scientific computing (SciPy, PyData, Strata).
● Experience in using/developing tools for other languages used in the scientific community, like R and Julia.
● Experience in mentoring others in programming, data science and open source software development.
● Familiarity with HTML/CSS and JavaScript development.
● Refined visual design sense and basic knowledge of how to design UI/UX.
● Experience with a sampling of modern web frameworks (possibly jQuery, less, Bootstrap, React, Angular, etc.) and web technologies (WebSocket, ES6, …).
● Experience developing for parallel environments, such as MPI, IPython.parallel or Apache Spark.

Both positions are one-year, full-time (100%) appointments, renewable for up to three years, with a starting salary dependent on qualifications. To apply, submit your curriculum vitae, a cover letter, material that demonstrates your programming skills (this could be a public source code repository or a programming project included with the application) and contact information for three references to https://aprecruit.berkeley.edu/apply/JPF00899.

Letters of recommendation are not required at this time. If requested, all letters of reference will be treated as confidential per University of California policy and California state law. Please refer potential referees, including when letters are provided via a third party (e.g., dossier service or career center), to the UC Berkeley statement of confidentiality (http://apo.berkeley.edu/evalltr.html).

Applications will continue to be accepted and reviewed until the positions are filled. Please address inquiries to Stacey Dorton (calbear95@berkeley.edu) at the University of California, Berkeley.

The University of California is an equal opportunity/affirmative action employer.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, or protected veteran status. For the complete University of California nondiscrimination and affirmative action policy, see:
http://policy.ucop.edu/doc/4000376/NondiscrimAffirmAct.

The Department is interested in candidates who will contribute to diversity and equal opportunity in higher education through their work.

Requirements

Document requirements
  • Curriculum Vitae
  • Cover Letter
  • Material that demonstrates the applicant’s programming skills (could be a public source code repository or a programming project included with the application)

Reference requirements
  • 3 references required (contact information only)