Research Computing Teams Link Roundup, 5 June 2021
Hi, everyone:
We have several items this week about post-pandemic management and continuing to manage a distributed team.
It's hard to hire and retain excellent staff in research computing, particularly (but not only!) in academia. We can't offer FAANG-type salaries or perks, and if we're even trying to compete on those terms we're doomed. Our only option is to play to every advantage we do have: challenging, meaningful work; flexibility in work (which has always been a perk of academic environments); strong benefit plans; stability of employment; ability to support wide ranges of interesting projects; and clear and up-front staff salary bands (making wide disparities between people hired into the same jobs less likely). And we can play to those strengths while doing our best to attract the kinds of candidates who want to have real impact on research, feel part of a research and education community, value learning new things and working in new areas, and who intend to occupy a job long enough that the benefits we often do offer (like pensions) matter.
Too often we don't make use of those advantages when trying to hire. I'm routinely mystified by the websites of research projects (often projects that mention that they have a hard time hiring) that make it hard to find out if there are jobs open. And when one does finally dig up a job ad (typically hidden under two or three pages of transparently meaningless institutional fluff about "how we work at [huge institution with vastly different cultures across departments and teams]"), rather than a clear statement of priorities and goals for the new hire, the opportunities for growth and new challenges, and transparency around salary bands, I often see just a laundry list of "3 years experience with technology X", "2 years experience with technology Y", etc.
These job ads have been clearly inadequate for years, and fixing job ads and where they're posted is comparatively easy (even at institutions that have a fixed institutional job ad structure filled with boring boilerplate, no one prohibits these projects from advertising those jobs differently elsewhere and linking to the decrepit HR system for applying). So I kind of despair about how we as a community are going to handle the much greater challenge of post-pandemic hybrid work.
Flexibility in work post-pandemic is going to have to include hybrid remote/on-site approaches to work. Even the hospital I work for, an organization that is very much not widely known for its forward-looking, innovative approach to HR, is rolling out an institution-wide set of policies and perfectly serviceable materials for staff and managers in this new hybrid world, with near-site team members working from elsewhere some or all of the time.
We're probably going to have to go further than this. We'll have to allow people to work from away, probably including hiring people who will never ever commute from their distant location to "headquarters," while also making sure space is available on-campus for those who want that (and employees who want to feel part of a research and education community will value that, at least some of the time). Hybrid is much harder than purely remote or purely onsite; the typical failure mode seems to be for people who are on- or near-site to feel "part of the team" in a way that the distributed team members don't, and to slowly lose engagement with the distributed team members until those members eventually leave. That means we have to keep up the practices we've been using during the pandemic, like asynchronous written communications and processes, even when they're not necessary for everyone. It'll be easy to slip.
We're girding ourselves for that here; our working hours currently temporarily span six timezones and two continents, and we're aiming to make that permanent (after what will surely be an epic battle, sung about for generations to come, with the previously mentioned HR department). We're having to really up our game with building consensus by circulating documents around for comments, writing decision documents, and the like. But this is also already really helping us with collaborators or with team members who are only on our project 20% of the time, and with raw material for what will become papers or blog posts.
What is your team thinking for post-pandemic? What are your plans, and what are your concerns? Let me know by hitting reply, or emailing me at jonathan@researchcomputingteams.org.
PS - the newsletter will be taking a break next week; when we come back the following week we'll get back to the regular Friday evening (eastern time) delivery time.
And now, on to the roundup!
Managing Teams
What It Takes to Run a Great Hybrid Meeting - Bob Frisch and Cary Greene, HBR
It’s pretty clear that the safest way to have meetings with both on- and off-site team members is to erase the distinction by having the meetings the way most of us have been doing them for the past 15 months - everyone connecting in separately from their laptops. That way everyone is on an equal footing.
But if there are a lot of people on site, that might not be realistic. Frisch and Greene offer the following suggestions for those who are on-site (interestingly, they assume that the person running the meeting is necessarily in the office):
- Better audio - the old speaker/mic combination in the centre of the table isn’t going to cut it any more
- Better video, especially of the distributed participants for the on-site team, ideally life-sized
- Design the meetings for all attendees - tools for voting, taking notes/whiteboarding, etc
- Have real facilitation, by someone not running the meeting, to make sure everyone is participating
- Have on-site participants advocate for individual off-site participants
Honestly, I find this article a little discouraging. It gives some kind of idea how tough this is going to be - this isn’t even the hard case, where there are multiple in-office teams at different locations as well as some people working individually from elsewhere.
Microaggressions at the office can make remote work even more appealing - Karla Miller, Washington Post
Office spaces aren’t equally welcoming environments to all of us. Here Miller points out that for many potential team members, distributed work can mean less of the constant low-level stream of bullshit they’d normally experience in a predominantly white and male workplace.
This point:
Working at home has largely spared them from having to deal with such incidents as […] being mistaken for another colleague of the same race (a problem solved by having names displayed in video meetings)
apparently really registered with a lot of readers on twitter, and it’s a point that I had literally never thought about.
Reading this, I wonder if hiring in an increasingly distributed manner will also help recruit from groups that experience this sort of tiring nonsense all the time. We typically have small tight-knit teams, and team members from a lot of different demographic groups might well feel concerned about joining the team as “onlies”, the only member of that group. Will offering distributed work de-risk that choice enough that it would help improve diversity of our teams with new recruits (while of course still leaving the work of equity and inclusion to be done)?
Research Software Development
Open Source Communities Need More Safe Spaces and Codes of Conducts. Now. - Jennifer Riggins, The New Stack
Codes of conduct in Open Source Software—for warm and fuzzy feelings or equality in community? - Vandana Singh, Brice Bongiovanni, William Brandon, Software Quality Journal
Riggins walks us through the need for codes of conduct for open source projects, pointing out the rather shocking statistic that women make up less than 3% of open source communities, and that this has been stagnant for two decades. Between higher demands on their time and increased likelihood of being taken less seriously if not outright harassed, they are even less represented in open source than they are in tech generally.
Riggins points to empirical research by Singh et al that includes results such as:
- Some women specifically hide their identities for open-source work
- Women who had a good first open-source experience are much more likely to contribute
- Many women explicitly look for codes of conduct before participating
There are existing codes of conduct like the Contributor Covenant which can be used directly.
Our own project (and team) do not have such CoCs, but as the team grows and our code grows in visibility we’re clearly going to have to adopt them.
Porting ATLAS experiment codes to exascale architectures - Nils Heinonen, ALCF Blog
Heinonen writes about porting two codes - FastCaloSim and MadGraph - necessary for the ATLAS experiment to pre-exascale systems at Argonne, in particular using Kokkos, a DOE lab-supported C++ programming model for applications which can target multiple platforms. Unfortunately, there aren’t a lot of details, except that Kokkos allows interoperability with pure CUDA, allowing for an incremental porting process. On NVIDIA GPUs, Kokkos does nearly as well as the native GPU code, coming within 10% (that’s what I infer from the short post).
One of the lessons learned, “CUDA portability layer concepts translate well, even if the explicit syntaxes differ”, continues to strike me - that the basic CUDA programming model seems to be holding out extremely well across a number of different kinds of accelerators. I wonder when or if that will change?
Introducing the Open Source Insights Project - Google Open Source Blog
The blog post introduces the new Open Source Insights site, deps.dev, which lists networks of dependencies for almost the whole ecosystem of npm, Go, Maven (for Java), and Cargo (for Rust). “Coming Soon” is PyPI, which should be fascinating given the size of that ecosystem.
Research Data Management and Analysis
SQLite Data Starter Packs - Public Affairs Data Journalism Class at Stanford
A great resource for teaching data analysis; a set of 15 small but real and interesting datasets shipped as SQLite files.
JupyterLite - Jeremy Tuloup and Nicholas Bollweg
JupyterLite is a remarkable implementation of Jupyter in WebAssembly, running entirely in the browser and served by static files. An example is here; it works flawlessly for me in Chrome. For demos or even teaching this would be enormously easier than setting up something in Binder or a JupyterHub instance.
High-Performance Python Communication with UCX-Py - Peter Andreas Entschev, NVIDIA Developer Blog
As Entschev demonstrates, it’s getting increasingly easy to run Dask and the rest of the RAPIDS stack over high-speed interconnects like InfiniBand or NVLink via UCX, which started as pieces of MPI runtimes and is now a pretty well-established tool. I’m optimistic about UCX, and UCX-Py, enabling a new generation of parallel data analysis and simulation tools and approaches that need MPI-type performance but aren’t limited by MPI semantics.
Research Computing Systems
Don’t Be a Blockhead: Zoned Namespaces Make Work on Conventional SSDs Obsolete [PDF] - Theano Stavrinos, Daniel S. Berger, Ethan Katz-Bassett, Wyatt Lloyd, HotOS 21
(Presentation videos for this and other papers, as well as those papers, are available from the conference website).
Stavrinos et al. walk us through the advantages of Zoned Namespaces (ZNS) over block storage models. ZNS is an existing and increasingly supported command set for newer SSDs and NVMe devices. Rather than continuing to pretend that SSDs are fast spinning disks, ZNS recognizes that these devices have other concerns, like needing to erase an entire erase unit before data can be rewritten. Using block interfaces that don’t recognize this tends to lead to write amplification (one byte needing to be written causing a cascade of other writes), leading to reduced throughput and faster wear.
The sooner upstream software can take advantage of ZNS interfaces, the sooner we’ll get to take full advantage of these new storage devices.
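To make the write amplification point concrete, here is a toy calculation, not a model of any real device. The erase-unit size is an assumed, illustrative number, and the "in place" case is a deliberate worst case: real flash translation layers remap and garbage-collect, so actual amplification is much lower than this, but the ratio being measured is the same.

```python
"""Toy write-amplification arithmetic; not a simulation of a real SSD."""
from typing import Tuple

# Assumed, illustrative geometry: real devices differ widely.
PAGES_PER_ERASE_UNIT = 256


def write_amplification(logical_pages: int, physical_pages: int) -> float:
    """Pages the flash actually wrote, per page the host asked to write."""
    return physical_pages / logical_pages


def naive_in_place_updates(num_updates: int,
                           pages_per_erase_unit: int) -> Tuple[int, int]:
    """Worst case: each one-page update rewrites its whole erase unit."""
    logical = num_updates
    physical = num_updates * pages_per_erase_unit
    return logical, physical


def sequential_zone_appends(num_updates: int) -> Tuple[int, int]:
    """Append-only zone: each logical page is written exactly once."""
    return num_updates, num_updates


if __name__ == "__main__":
    in_place = naive_in_place_updates(1000, PAGES_PER_ERASE_UNIT)
    appended = sequential_zone_appends(1000)
    print("naive in-place updates:", write_amplification(*in_place))
    print("sequential zone appends:", write_amplification(*appended))
```

The gap between the two ratios is the overhead that a zone-aware interface lets software avoid by writing sequentially within a zone instead of updating pages in place.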
Emerging Technologies and Practices
Building an SRE Team? How to Hire, Assess, & Manage SREs - Emily Arnott
How developers can be their own operations department - Daniel Orner, Stack Overflow Blog
DevOps and SRE are two sides of a similar coin - bridging the gap between systems and developer teams to do better work faster. DevOps topics usually involve speeding release cycles, and SRE topics usually focus on improving automation, resiliency, and handling incidents, but there’s a significant degree of overlap.
Even if you aren’t explicitly building an SRE or DevOps team, you can start hiring for these skills and approaches in your regular ops or dev teams and try to take advantage of some of the improved tooling available and emerging best practices about running resilient systems and speeding release of new software.
Arnott’s article talks about the responsibilities of real-world SREs (automating processes, developing tools and infrastructure, creating runbooks, defining incident response steps) and the skills and knowledge you should be looking for in hires who would take on such roles.
Orner’s article describes how Flipp gave - over a period of five years! - its development team greater tools to push new features or new systems to production by changing its infrastructure team’s focus from gatekeepers to enablers and pushing for continuous deployment.
My Magical Adventure With cloud-init - Christine Dodrill
Cloud-init is a standard cross-platform way to initialize a VM instance, given information from the cloud provider, provided user data, and provided vendor data. But as Dodrill demonstrates, if you’re willing to jump through a few other hoops it can be used programmatically to generate VMs locally for developing and testing on your own box (while still leaving you with configuration files that mean you could also spin up the same instances on Azure or AWS or elsewhere).
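As a rough sketch of what "programmatically" can look like, the snippet below writes the two small files that cloud-init's NoCloud datasource reads, `user-data` and `meta-data`. This is an assumption about one common local route, not necessarily the approach in Dodrill's article; the function name, hostname, instance ID scheme, and package list are all hypothetical. The files would still need to be handed to the VM, for example on a small volume labelled `cidata`.

```python
"""Write a minimal cloud-init NoCloud seed (user-data and meta-data files)."""
import tempfile
from pathlib import Path
from typing import List, Tuple


def write_nocloud_seed(directory: str, hostname: str,
                       packages: List[str]) -> Tuple[Path, Path]:
    """Create user-data and meta-data in `directory` and return their paths."""
    seed = Path(directory)
    seed.mkdir(parents=True, exist_ok=True)

    # user-data must start with the "#cloud-config" header line.
    package_lines = "".join(f"  - {name}\n" for name in packages)
    user_data = (
        "#cloud-config\n"
        "package_update: true\n"
        "packages:\n"
        + package_lines
    )

    # meta-data carries the instance identity; the ID scheme is made up here.
    meta_data = (
        f"instance-id: {hostname}-001\n"
        f"local-hostname: {hostname}\n"
    )

    user_data_path = seed / "user-data"
    meta_data_path = seed / "meta-data"
    user_data_path.write_text(user_data)
    meta_data_path.write_text(meta_data)
    return user_data_path, meta_data_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        user_path, meta_path = write_nocloud_seed(tmp, "devbox", ["git", "htop"])
        print(user_path.read_text())
        print(meta_path.read_text())
```

Because the same `user-data` content can be passed to a cloud provider as instance user data, generating it from code keeps the local test VM and the cloud instance on one configuration.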
Calls for Proposals/Applications
Ocean Hackweek - 3-6 Aug, Applications due 13 June. Virtual and in person at Bigelow Laboratory for Ocean Sciences, Maine
From the website:
OceanHackWeek (OHW) is a 4-day collaborative learning experience aimed at exploring, creating and promoting effective computation and analysis workflows for large and complex oceanographic data. It includes tutorials, data exploration, software development, collaborative projects and community networking.
Faculty Training Workshop: Teaching Heterogeneous Computing - 16-23 July, Applications due 15 June
Sadly only available to faculty members, this covers a newly developed set of teaching modules by the Touch project at Texas State covering GPU programming, hybrid algorithms, and task mapping and scheduling.
Events: Conferences, Training
Tonne of interesting events coming up this roundup:
2nd ELEMENT workshop on Exascale Meshing, 9 June, Free Registration
This workshop covers the full meshing workflow - generation, adapting, partitioning meshes and visualizing results.
Excalibur-SLE Workshop: Data Visualisation and Data Flows - 16-17 June, free over Zoom
From the posting:
Stakeholders at this workshop will discuss the challenges of visualisation and data flow at exascale and topics such as: (i) limits of current workflows given roadmaps of future storage and I/O bandwidth; (ii) prescribed and automated in-situ data extraction; (iii) in-situ dimension reduction techniques; (iv) intelligent data compression; interactive analysis of large ensembles of simulations; and, (v) immersive visualisation using VR and AR
2021 SPDK, PMDK & Intel Performance Analyzers Virtual Forum - 22-24 June, Free Registration required
Overview of the storage performance development kit, persistent memory development kit, and Intel performance tools; last year’s talks and materials are available if you’re unsure if the event is for you.
ISC 2021 - 24 June - 2 July, Registration for full conference 249€ plus 199€ for tutorials; deep discounts available for students
The extremely broad program for the ISC conference, including a wide range of relevant tutorials, will be held online for relatively reasonable prices.
Machine Learning for Planetary Space Physics Monthly Seminar - Next seminar 29 June
This is a monthly seminar (with previous talks on YouTube). From the site:
This seminar series aims to bring together researchers in planetary science, space physics, data science, and other domain applications of data science. We welcome presentations from a broad array of fields including: Earth based and planetary science applications, educational efforts, and basic data science research.
New Directions in Numerical Linear Algebra and HPC: Celebrating the 70th Birthday of Jack Dongarra - 7-8 July; free registration due by 2 July
Talks are TBD, but an extremely heavy-hitting lineup of HPC numerical linear algebra experts will be celebrating the accomplishments of Dongarra.
Random
The NVIDIA blog helpfully breaks out recorded talks and materials from GTC2021 events relevant to those of us in HPC.
A list of sources for project management document templates.
A couple of resources for technical diagrams and images - pikchr is a tool and pic-inspired language for embedding simple diagrams in Markdown, and Inkscape has a long-awaited 1.1 release.
JSON handling in Postgres has gotten a lot better with the Postgres 14 release.
The next 50 years of unix shell programming [PDF].
Windows is becoming an increasingly plausible development environment for research computing?! Yeah, it’s weird to me, too. Here’s what developing Python looks like these days, through the eyes of a Python core developer.
Learning Linux system programming by writing a pseudo-device driver (e.g. /dev/xyz).
“Computer, enhance”: using GPUs and deep learning to up-scale a cosmology simulation.
The Ada/SPARK community is starting RFCs for new language features.
An ambitious effort to complement the entire textbook An Introduction to Statistical Learning with labs involving R and tidymodels.
anchore is a command-line tool for scanning running docker containers for security issues.
That’s it…
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or about research computing teams. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming two weeks with your research computing team! - Jonathan
About This Newsletter
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.
Jobs Leading Research Computing Teams
This week’s new-listing highlights are below; the full listing of 185 jobs is, as ever, available on the job board.
Senior HPC Consultant - Monash University, Clayton AU
This role offers a unique opportunity to make an impact on cutting edge Australian research projects. The successful candidate will work closely with scientists to understand their requirements and to design and implement innovative computing solutions. The responsibilities involve developing new hardware/software solutions, re-architecting or redeploying existing open source or commercial solutions. The position is expected to bring strong expertise and provide technical and team leadership in specific areas.
Senior Manager, Cloud Architecture, Research Computing - Nuance, Burlington MA USA
The role is a management and leadership role that works directly with the R&D Research Computing Engineering and Operations teams, R&D management and technical leads, Security, and IT. You will take ownership of the environment design and deployment, management, and efficiency, with consideration of future needs and trends. You will directly influence and shape the environment that is utilized to develop the latest AI technology to Nuance products, such as Dragon Ambient Experience and Agent AI, among many more.
Manager/Senior Manager, Biostatistics - Irvine CA USA, AbbVie
The position is to lead all statistical aspects for clinical studies or a small-scale project within the Medical Aesthetics therapeutic area. As the project statistician for assigned clinical programs, the position represents statistical science on clinical sub-teams advising team members in drug development strategies, plans all aspects of data analysis for assigned projects, has responsibility for the preparation and maintenance of the Statistical Analysis Plan (SAP), and ensures the quality tables, listings and graphs for the clinical document while adhering to the pre-specified analyses and timelines.
Manager, Statistical Programming - Gilead, Foster City CA USA
Primarily responsible for the business, operational, and compliance aspects of drug discovery, development, and marketed products at Gilead. Statistical programmers work collaboratively with internal colleagues and external vendors to ensure the efficient, high-quality production of analysis datasets and statistical outputs for study reports and integrated summaries in support of Gilead’s regulatory, scientific and business objectives.
Biomedical Informatics Services Manager - UC San Diego, La Jolla CA USA
The incumbent will serve as the expert for the CTMS applications within the CTRI and across Health Sciences. Responsible for implementation of research studies in these CTMS in order to improve the efficiency of conducting studies, improving revenue capture, and enhance the quality of study data. Works closely with the study team to construct study calendars, data collection forms (i.e., case report forms), budgets, reports, and other such tools. Provides training to users of the applications and guides other analysts configuring these applications. Responsible for the identification of issues related to the construction processes and the re-engineering of such processes, including documentation, standard operating procedure (SOP) creation and training, where feasible.
Scientific Priority Leaders - Royal Botanic Gardens, Kew, London UK
The Priority Leader is responsible for developing and implementing a vision for innovation in the identification, naming and classification of fungi and plants, accelerating the description of the world’s biodiversity; driving a paradigm shift in taxonomy to embrace machine-learning, trait research (including genomic, chemical, morphological and ecological) and citizen science; while being guided by our expertise and collections in key families, our knowledge of conservation threats, and a consideration of socio-economic benefits.
Delivery Manager, Data Science - Sun Life, Toronto ON CA
This role will be a member of the Data & Analytics team and will be accountable to initiate, plan, mobilize project teams, and govern them through to execution. She/he is accountable to remove barriers and facilitate solutions to address business partner's needs. The successful candidate is expected to have a passion for data analytics and data science solutions and will need to be able to pick up and understand a wide range of analytics tools and solutions.
Head Computational Biology Research Site Ridgefield & Global TA Head - Boehringer Ingelheim, Ridgefield CT USA
The Global head of Therapeutic Area (Immunology & Respiratory OR Cancer Immunology & Immune Modulation) portfolio will lead and coordinate Computational Biology activities at a major Discovery Research site. Represent Computational Biology in the global and local Leadership Team and equivalent senior committees. Responsibility includes support and line management of Global Data core group and Computational Biology scientists.
Artificial Intelligence - Quantum Computing Research Manager - Ford Motors, Dearborn MI USA
The Artificial Intelligence and Quantum Computing (AIQC) Research Manager is a key role in the Digital Cockpit Technologies & AIQC Organization. This is a research and development role that requires driving research in diverse AI-ML and Quantum Computing domains and delivering research driven AI-ML-QC solutions aligned with product needs. One critical responsibility will be to drive product focused innovative AI-ML solutions in the area of consuming applications/services by leading a cross-functional team that is responsible for delivering it. The AIQC Manager will lead and grow a group of highly talented AI-ML-QC experts in providing cutting edge solutions, that draw from classical and emerging domains (Data science, Machine Learning, Connectivity and QC), to a diverse set of engineering problems with Near-Far value to Ford. In conjunction with Technical Leader and Senior Manager, this role will set the strategy, develop roadmap, and own the AIML technical development for the organization.
Manager, Research Operations - Computational Biomedicine - Cedars Sinai, Los Angeles CA USA
The Manager, Research Operations plans, organizes, manages and controls the daily operations of their area and works closely with departmental leadership, Faculty, Principal Investigators (PIs), staff and students to provide analytical support and project management in fulfilling the established goals and objectives for the department and organization. Strategizes and collaborates with the Director and senior leadership, along with Academic Affairs regarding development and implementation of policies and procedures. Manages and supervises administrative staff; provides leadership coaching, and opportunities for professional development. The Manager, Research Operations facilitates Human Resource (HR) functions for their area by collaborating with the appropriate HR partners for Faculty and department staff recruitment, visa and immigration assistance, employee relations, compensation, and benefits.
Senior Research and Data Curation Manager, Graduate School of Business - Stanford University, Stanford CA USA
The Senior Research and Data Curation Manager will support the creation and development of a standards-based research data ecosystem and work closely with librarians, data scientists, faculty, and doctoral students who acquire and analyze research data. The successful candidate will work with librarians and data specialists who will help re-envision the Library’s role within the research data lifecycle and who can respond to and support new research methods being used by Stanford’s research community. The Senior Research and Data Curation Manager will help to ensure that GSB Library’s acquired collections of qualitative data, quantitative data, text corpora, and software are properly evaluated, structured, described, and accessioned, and that these collections are appropriately secured and accessible for researchers at the University. This position will add value to existing research data services by tying together in a holistic way the GSB Library’s strengths in supporting faculty research grounded in a data lifecycle perspective.
National Digital Research Infrastructure Director - Health Data Research UK, London UK
We are seeking an exceptional individual to lead, in the first instance, the first phase of the DARE-UK programme. This individual will be responsible for leading a team to develop and deliver a design for a novel and sustainable national federated digital infrastructure (including, for example, services, TREs, privacy enhancing technologies, standards and supporting policy and governance frameworks), to be co-developed with users, funders, public, technology providers and data custodians and that provides a range of capabilities that can be flexibly deployed to support different analytical requirements.
Programme Manager, Data Centric Engineering - Alan Turing Institute, London UK
This post holder will manage the Institute’s Data Centric Engineering programme, and coordinate interactions with the programme’s strategic partner, Lloyds Register Foundation, as well as project partners, in collaboration with colleagues in the Partnerships team. The Programme Manager role is responsible for facilitating smooth operational links between the partner universities and research institutes as well as close working within The Alan Turing Institute to deliver high-quality research projects, research programmes, and knowledge exchange in data science and AI.
Lead Data Architect - Healthcare of Ontario Pension Plan, Toronto ON CA
The Lead Data Architect will have a strong analytics mindset, a passion for innovation, and the ability to build and maintain strong partnerships at all levels across our enterprise. We are looking for individuals that thrive within an Agile team-based framework. With our focus on Data management and building out our future analytics platform, we are proactively seeking top talent to join our Pension Solutions Group to help us create and implement our new data warehouse and analytics platform.
Project Lead, Primary Health Care Information, Data Advancement - Canadian Institute for Health Information, Toronto ON CA
The Project Lead, Primary Health Care Information, Data Advancement, works closely with the Manager, PHCi, to close critical PHC data gaps across Canada. They are responsible for leading Electronic Medical Record (EMR) data advancement activities, including the planning, receipt, processing and data quality analysis. The Project Lead also works to support the evolution of the PHC Electronic Medical Record Minimum Data Set and relevant artifacts. They provide effective team leadership and project management, to deliver a robust and evolving program of work. Additionally, the Project Lead supports the development of partnerships with internal and external stakeholders, including clinicians, vendors, policy makers and non-governmental organizations.
Program Manager - Quantum Hardware - Amazon Web Services, Vancouver BC CA
Bachelors degree in Supply Chain Management, Finance, Engineering, Operations, or other field from an accredited university or 2+ years Amazon experience; 8+ years of relevant strategic sourcing experience including vendor negotiations, global contract management, process improvement and supplier relationship management; 8+ years of experience in procurement, supply chain, inventory management and operations
Research Portfolio Development for Computational Biology Program - Berkeley Lab, Berkeley CA USA
The successful candidate will primarily support research program development activities (like projects that combine some of the following elements: imaging, biomanufacturing, environmental biology, sensors and controls for automated experimentation, machine learning, data management, and HPC) and other related topics. The Program Developer will be responsible for a variety of activities, including idea generation and visioning, planning and facilitation of workshops, drafting and editing white papers, and developing ongoing relationships with a variety of internal and external stakeholders. The successful candidate will be building new research capabilities at Berkeley Lab in a growing area of research for the Department of Energy.
Principal Applied Research Manager - Optimization - Microsoft, Redmond WA USA
Our team is collaborative and interdisciplinary. You will be part of a team within Microsoft that brings together mathematicians, physicists, and software engineers, as well as interface directly with subject matter experts at customer organizations. By bringing together a unique combination of knowledge and skillsets, as well as cutting edge hardware, we build novel solutions to the world’s hardest and most impactful computational problems. This role encompasses technical leadership and management of a diverse team dedicated to solving problems in mathematical optimization and advancing the state of the art in optimization algorithms. As a customer facing role, this will be ideal for candidates who enjoy continually learning about new computational problems and new algorithmic techniques.