Research Computing Teams - Quarterly Goal Setting and Link Roundup, 2 April 2021
Hi there - if this is a long weekend for you, I hope you’re enjoying it.
Last time we spoke a little bit about expectations, and routine feedback to team members and peers when those expectations are met or not met. This time, let’s consider quarterly goal-setting - or skip to the roundup.
Feedback is a mechanism to align expectations with your team members in the moment, and to encourage meeting those expectations in the future. Sometimes those expectations were explicit; other times, they were implicit and it’s a helpful way of making them explicit. This is a simple, extremely useful and scandalously underused tool, particularly in research environments. What’s more, your team members want feedback. Do you want more feedback from your manager about how you’re doing? Why do you think your team members feel differently than you do?
On top of the small course-corrections of routine feedback, it’s important to have regular conversations looking back at previous goals, and setting future goals. Here, the expectations are very explicit - you are setting goals, and looking back to see if they’re met.
Our research institutions probably have an annual review process set up for this. They’re usually incredibly poor. What’s more, a year is just an absurdly long time in research computing to set goals. For most of us, our work is changing rapidly; the idea that today we should have a pretty good idea of our work from now until April 2022 is just goofy.
Quarterly is a pretty good cadence to review work, learning, and career development goals with team members. Twelve-ish weeks is long enough to accomplish meaningful work, while being short enough that priorities probably haven’t shifted dramatically in the intervening time. These goals are things you absolutely can and should be talking about in one-on-ones, but setting some time aside every quarter just to have these goal-review and goal-setting conversations helps clarify expectations about goals, ends up with them written down, and gives team members clarity about what their priorities are. The resulting document is also something that can inform one-on-one discussions.
A template for the document I use for such reviews is available as a google doc here; I show an excerpt below. By having it as a google doc (or Office 365/sharepoint document, or whatever tools your team use routinely), it can be kept as a running document (most recent review on top), collaborated on, and frequently reviewed. What will be most useful for you and your team may well be different. I use these reviews as an occasion for a bit of a deeper check-in on how things are going in areas that sometimes get overlooked in the more day-to-day focus of one-on-ones.
The mechanics of these reviews are that we schedule a meeting outside of our usual one-on-ones; an hour is generally enough for a team member who’s done this before, though it might take longer for a team member doing it for the first time. I update the document by adding the review for the new quarter, taking goals set last review and copying them in; then each of us adds starting notes. The document covers:
- Questions for discussion - finding out what they were proud of, struggled with, and learned in the last quarter, and anything they’re excited or anxious about in the next;
- Reviewing past goals, discussing whether they met expectations, and setting next-quarter goals in three areas:
- work outcomes
- career development
- skills development
- Discussing what they need to work on in light of how they did on those goals, or due to other things that have happened in the last quarter - nothing in this discussion should be new or a surprise, it should all have been raised before in routine feedback
- Setting, with their input, goals for the next quarter.
In the meeting, we then discuss the starting notes from myself and the team member, agree on summaries, and commit to future goals. Having their input on these sections is extremely valuable; it increases their commitment to the goals.
The first time a team member goes through this, it can be a little scary - people have had or heard of pretty terrible performance review experiences in the past, and they often don’t realize it’s an opportunity for a conversation about what’s going well, what isn’t, what priorities to work on, and their own learning and career development goals. To make it a little less difficult to start this the first time, when onboarding a new hire we immediately set 30-day goals in the worksheet, then after a month set goals for the remaining 60 days of the quarter. At the end of their first quarter we go through the sheet together for the first time, but it’s at least partially familiar to them so doesn’t seem so daunting.
Do you have a similar process? Have you seen anything similar or that you find works very well? Let me know - hit reply, or email jonathan@researchcomputingteams.org.
For now, on to the roundup!
Managing Teams
The resilience of mixed seniority engineering teams - Tito Sarrionandia
An ongoing if unintended theme of the newsletter is that when managing teams, many useful things - like everything involved in having the team move to distributed work-from-home, giving feedback, having quarterly goal-setting - come down to making things more explicit. That requires a lot of up-front work, more documentation, changes of processes, and a little more discomfort for the manager initially - but it then makes a lot of other things better and easier for everyone.
Sarrionandia talks in this short article about the advantage of having teams with a range of seniority in exactly this light. Having junior staff on the team means that more resources have to be dedicated initially to explaining how things work, documenting processes and tools, etc. But those steps of making things more explicit make things work better for everyone. It makes it easier to bring new people onto the project, junior or senior. Those now explicit steps can be put into playbooks or automation scripts (or conference talks, or papers). More work initially, but that work pays off.
Measures of engineering impact - Will Larson
I’ve mentioned before that as a manager, we measure something to inform a decision or action. We’ve talked about measuring the productivity of technical teams - you have to look at the team level, not the individual, and pick metrics that indicate something getting in the team’s way, something that you can change. The measures inform an action. That’s useful; you can arrange for fewer things to be in your team’s way.
But measuring the impact of the technical teams is really what we want to accomplish. You want your organization to have as much impact as possible. We owe our team members work with meaningful consequences, and we owe the research endeavour as much help in the advance of knowledge as we can offer.
Larson and some of his colleagues discussed this and found that a number of big tech companies use almost comically simple internal measures of impact - measures that are straightforward, centre on the things they care about, and are hard to game:
- Square - new billable features
- Gusto - number of competitive advantages created/improved
- AWS - number of comms-approved press releases
As people working in scholarly research, one of the skills we’ve developed is ways to disprove or provide evidence for the claim “X affects Y” by choosing one or more proxy measurable quantities to observe. This is one of our outsized skills. Choosing simple metrics can be a very effective way of demonstrating impact externally and informing decision making internally.
In research computing, some of our measures take some doing but are inarguably signs of impact - amount of use, papers published, citations, contributions. Which ones make the most sense will depend on what your teams work on; but any of them, or any related metric, is 100x more meaningful than input measures like utilization, lines of code, or data entries.
Managing Your Own Career
One mentor isn’t enough. Here’s how I built a network of mentors - Erika Moore
We’ve talked about assembling a group of mentors before, such as in #60. People by and large are more than happy to give advice and suggestions to others coming up in their field. Here Moore, writing in Science’s careers section, gives very specific and useful steps about how to build a network of people that one can ask for advice:
- Cast a wide net
- Get to the point - send short emails with very specific asks (which requires clarity on your part of what you want from them)
- Come prepared for those who do say yes
- Consider the context - things that worked for them might not work for you, and people may have a lot going on right now and be unable to help
Product Management and Working with Research Communities
Writing in the Sciences (Coursera Course) - Kristin Sainani
Writing is one of those things that many of us got into science or computing to avoid. But written communication, especially to stakeholders and the public, is vital for effective product management in research computing. Sainani has what looks like a pretty good short course on writing both within research communities and for the public:
Topics include: principles of good writing, tricks for writing faster and with less anxiety, the format of a scientific manuscript, peer review, grant writing, ethical issues in scientific publication, and writing for general audiences.
Product Manager Assessment - Sean Sullivan, ProductFix
In research, we typically learn on the job a rough-and-ready form of project management, which can work passably well for research projects.
In managing research computing teams, though (and arguably when you’re running a large enough programme in research) our primary focus is on managing products - software, data, systems, expertise - not projects. This is one of the reasons why research funding, invariably project based, is such an awkward match for research computing. We may very well execute projects as part of our product management, but the product as a whole outlasts any particular project.
In this article, Sullivan walks through a product management assessment with three high-level components - product expertise (including product/market fit), product management and skills (shown below), and people skills and other competencies.
I think this would have to be changed quite a bit for the research computing context - what getting money for our products looks like in research is very different than in the private sector. But in terms of illustrating the breadth of needed skills, and areas to be aware of when managing multiple products, I think this carries over into our world.
What do you think - what skills would a very capable research computing product manager have?
Research Software Development
Why Do Interviewers Ask Linked List Questions? - Hillel Wayne
In a field that values moving fast as much as software development, where the worst thing you can say about a piece of code is that it’s “legacy”, it’s surprising how many practices persist “because that’s the way it’s always been done”.
In this article, Wayne looks into why linked list questions are so common in software development interviews. It’s been that way since the late 70’s to mid-80s. And Wayne’s proposed explanation makes sense - it was basically fizz-buzz for having written programs with non-trivial data structures in C or Pascal (the 2nd most common language at the time) - languages with pointers but without any real standard library of data structures, so you had to roll your own.
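For readers who haven’t run into these questions, here’s the kind of exercise Wayne is talking about - reversing a singly linked list by hand. This is a sketch in Python for brevity; the original interviews would of course have expected C or Pascal, where you manage the node structs and pointers yourself:

```python
class Node:
    """A minimal singly-linked-list node - a stand-in for the hand-rolled
    C/Pascal structs these interview questions originally targeted."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def reverse(head):
    """Reverse a singly linked list in place by re-pointing each node's
    `next` link; O(n) time, O(1) extra space."""
    prev = None
    while head is not None:
        # Save head.next, point the node backwards, and advance
        head.next, prev, head = prev, head, head.next
    return prev

def to_list(head):
    """Collect the list's values into a Python list, for checking."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

# Build 1 -> 2 -> 3, then reverse it
head = Node(1, Node(2, Node(3)))
assert to_list(reverse(head)) == [3, 2, 1]
```

The point of the exercise wasn’t cleverness - it was simply checking that a candidate had ever written code that walks and mutates pointer-linked structures.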
About the Open Source Security Foundation - the OpenSSF
With open source software and dependencies, we’re building on the shoulders of giants - but what if some of those giants are malicious?
There’s been a lot of justified concern about open source software supply chain attacks with malicious or compromised dependencies. The OpenSSF is trying to build a community around security tooling, best practices, vulnerability disclosures, and identity attestation for OSS dependencies. Good to know about, and potentially get involved with.
Research Computing Systems
The Zero-prep Postmortem: How to run your first incident postmortem with no preparation - Jonathan Hall
It’s never too late to start running postmortems on your systems when something goes wrong. It doesn’t have to be an advanced practice, or super complicated. Hall provides a script for your first couple.
I’d suggest that once you have the basic approach down, move away from “root causes” and “mitigations” and more towards “lessons learned”. Those lessons learned can be about the postmortem process itself, too; you can quickly start tailoring it for your team and work.
socat - Cindy Sridharan
Sridharan provides a nice introduction to socat - kind of like netcat but much more flexible - any network or unix domain socket (datagram or stream), file descriptor, sctp, pty, named or unnamed pipes, openssl… Very useful for debugging, testing, or for quickly doing ad-hoc data movement between different kinds of Linux processes.
Emerging Data & Infrastructure Tools
BPF for storage: an exokernel-inspired approach - Yu Jian Wu et al., arXiv
Introducing Amazon S3 Object Lambda – Use Your Code to Process Data as It Is Being Retrieved from S3 - Danilo Poccia, AWS News Blog
Moving processing to the data is increasingly important; these two very recent items describe two very different approaches to it.
In the first article, the authors push some data processing into the kernel, via eBPF. The paper takes inspiration from the networking community, which has long done offloading of some simple processing to the network card or the network stack, as well as from exokernel file systems of the late 90s. Having some of the processing of data happen in the kernel can reduce the number of user-kernel context switches and data boundary crossings. They implement on-disk B+ tree lookup in kernel with eBPF (!) and find that latency can drop by half and IOPS can improve by even more.
In the release by the AWS S3 team, Poccia describes combining lambda functions and S3 access points to automatically process data pulled from S3 buckets. This could support processing of data near the S3 bucket, but can also provide features such as redaction of data, providing “views” of data in different formats, augmenting data with information from other sources, etc.
Calls for Funding
ACM SIGHPC Computational & Data Science Fellowships - Nominations close 30 April
May be of interest for trainees you work with; nominations are now open for this $15,000 USD fellowship (or purchasing power equivalent in other countries). Eligible are graduate students (masters or PhD) early in their program.
Calls for Papers
4th International Workshop on Practical Reproducible Evaluation of Systems - Papers due 20 April; workshop 24 June
As part of the virtual ACM High-Performance Parallel and Distributed Computing 2021 meeting, this workshop focusses on automated and reproducible computational experiments for evaluating systems - simulations, system benchmarks, etc.
The 3rd International Workshop on Parallel Programming Models in High-Performance Cloud (ParaMo 2021) - 10-12 page papers due 7 May; workshop Aug 30-31
This workshop, held as part of Euro-Par 2021, focuses on high performance computing in the cloud - including programming models, frameworks, network storage and memory management, heterogeneous resource management, performance evaluation and configuration.
Workshop on MAthematical performance Modeling and Analysis (MAMA) - Papers due 17 May; Workshop June 14
Performance analysis, modelling, and optimization is the topic of this 1-day workshop held as part of the virtual ACM SIGMETRICS 2021.
Events: Conferences, Training
Developer First Tech Leadership Conference - 6 May 2021, $5-$50
This day-long conference has talks covering people management, asynchronous and remote development, managing technical assets, running effective meetings, dev team metrics that matter, and managing your own time.
May HPC User Forum Meeting, 11-13 May North America friendly timezone, 12-14 May Europe/Asia friendly timezone
A three day virtual HPC User Forum meeting, free for members.
Random
Something that may be useful for your team or researchers you support who work with data in python - a pandas drop-in replacement that runs in parallel using Ray or Dask.
A bracing, curse-laden reminder that there’s no best programming language, and no one other than contributors should really care what language you use.
A free and open-source online whiteboard tool. Not as feature-packed as say miro but very easy to get started with.
Quickly show some mathematics on your screen (for say a video call) with markdown, latex, and muboard.
Characters that work well across platforms and terminals.
WebGL is one of multiple ways of having performant visualization tools in the browser. There's a new generation of WebGL, WebGL2, but most tutorials either haven’t kept up or have lazily updated WebGL1 resources. This is a nice looking tutorial of WebGL2 from scratch.
That’s it…
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
Jobs Leading Research Computing Teams
Highlights below; all 133 current jobs are listed, as always, on the job board.
Head of the Data Science Section - European Space Agency, Villanueva de la Cañada ES
The Head of the Data Science Section reports to the Head of the Data Science and Archives Division and is responsible for the definition and implementation of a data science strategy for the Science and Operations Department. The purpose of the Data Science Section is to develop innovative and disruptive data science activities with a view to increasing the science return of ESA’s Science missions through enhancing the science data exploitation, as well as by improving the efficiency of mission science operations.
You will manage a small team and associated industrial services and external contracts. You will interface with all teams within the Department and will look for synergies and opportunities for collaboration with other data science initiatives within ESA and beyond.
Senior Scientific Database Manager - National Health Service, London UK
The principal responsibility of the post will be to help deliver PHE’s surveillance systems, in particular the ongoing development and running of the HCAI Data Capture System (DCS), by contributing to the review and development of technical documentation for the new system. This system supports PHE deliverables for enhanced analyses of Methicillin-resistant (MRSA) and Methicillin-sensitive (MSSA) Staphylococcus aureus, Gram-negative bacteraemia, and Clostridium difficile infection (CDI).
Senior Scientific Manager - DNA Pipelines - Wellcome Sanger Institute, Hinxton UK
You will be responsible for managing one of the DNAP core operational teams, ensuring that the team provides high quality leading edge services to support the changing scientific and capacity requirements of the Wellcome Sanger Institute. You will lead and manage a diverse operational team to ensure that they provide a customer focused, timely, high quality service to our faculty groups, external stakeholders and third parties. You will develop & maintain close and effective working relationships with other DNAP teams, DNAP Technical Development and faculty groups to improve operations, and be a key driving force in ensuring continuous improvement and the smooth deployment of new pipelines into operations.
Applied Computer Science Deputy Group Leader (R&D Manager 3) - Los Alamos National Laboratory, Los Alamos NM USA
The Deputy Group Leader position is an entry‐level R&D management position. The Deputy Group Leader works closely with the Group Leader to provide scientific and programmatic leadership and to oversee the group’s technical quality and the operational and security envelope. The successful candidate will be an effective leader who has outstanding communication skills and a vision for assisting the Group Leader and the Laboratory in integrating capabilities across the institution to achieve mission goals. This position provides the opportunity to engage in many programs in both the open and secure scientific arenas. Group-level managers in CCS Division are expected to actively participate in technical projects and effectively balance leadership with technical contributions. The Deputy Group Leader position is funded at a 50-75% level and the candidate is expected to secure and dedicate the remaining effort towards technical projects.
Senior Research Software Engineer - UK Atomic Energy Authority, Culham UK
Upcoming projects include: work on software for plasma modelling, engineering design for future fusion reactors, and robotic control systems. We are particularly keen to hear from people with expertise in: C++ (including hardware interaction / control systems), experience of engineering design workflows and tools, or improving portability of existing research software through containerisation. Acting as the RSE lead on significant projects and other activities (including supervising the work of others). Forming strong partnerships with domain experts and project managers to make sure aims are understood and achieved and successful collaborative development is established.
Senior Research Software Developer - University College London, London UK
The UCL Research Software Development Group was the first of its kind and is one of the leading university-based research programming groups in the UK. We work with researchers across the university to ensure UCL retains the highest standards of excellence in computational research. While we always welcome any candidates matching the job description, we are on this occasion particularly seeking expertise in high-performance computing (e.g., MPI, OpenMP, SYCL, novel technologies), and/or web development with Django. The Senior Research Software Developer will take on a leadership role within the group, either technically or managerially, helping to guide the vision for this strategically important area for UCL. You may lead the technical design for complex projects, manage a portfolio of research.
Deputy Director of Computing - European Centre for Medium-Range Weather Forecasts, https://www.ecmwf.int/sites/default/files/vacancies/_VNVN21-14_en.pdf
At the core of the department’s mission are state-of-the-art high-performance computing (HPC) facilities, storage solutions to support the world’s largest meteorological data archive and digital technologies solutions for the organisation. The role is a senior management position reporting directly to and working closely with the Director of Computing on the strategy, development roadmap and 24/7 service provision of ECMWF’s mission-critical ICT infrastructure and services. More than 60 technical and engineering staff, located in Reading and Bologna, provide HPC and data management solutions and ICT services to users of the ECMWF Departments of Research, Forecasts, Copernicus and Administration, as well as ECMWF’s external users in Member States and Co-operating States. The Computing Department actively researches the digital technology landscape for the benefit of ECMWF’s future operations of ECMWF’s highly demanding 365/24/7 HPC facility and supporting data archive, ensuring resilient and mission-critical computing services. The successful candidate will lead the research of current and emerging technologies and will provide strategic insight into their potential for the strategic planning and development of ECMWF’s ICT architecture and technical designs, in particular in the fields of HPC, AI and data handling solutions, and will lead or contribute to relevant proposals for research and innovation actions
Computational Biology Program Developer - Lawrence Berkeley National Laboratory, Berkeley CA USA
In this role the successful candidate will be a co-member of the Biosciences Strategic Programs Development Group (SPDG) and the Computing Sciences Area Office. This position will help to establish new research programs at the intersection of Biosciences and Computing, and identify strategic opportunities where both Areas can collaborate to increase scientific impact. The successful candidate will primarily support program development activities - projects that combine some of the following elements: imaging, biomanufacturing, environmental biology, sensors and controls for automated experimentation, machine learning, data management, and HPC - and other related topics.
Assistant Director, Center for Artificial Intelligence Innovation - National Center for Supercomputing Applications, Urbana-Champaign IL USA
NCSA is seeking a talented, highly-motivated individual to provide leadership, oversight and management of the Center for Artificial Intelligence Innovation (CAII). The Assistant Director - fulfilling the role of Director of the CAII - will provide leadership and management of the activities and strategic initiatives of the CAII. The selected candidate will foster and actively participate in a vigorous research program with international impact, including the pursuit of external funding opportunities. The AD will oversee all staff and student employees within the CAII.
Computational Research Scientist - Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley CA USA
The Research Scientist will contribute to programmatic initiatives in the areas of bacterial and archaeal genomics with emphasis on single cell sequence data. The successful candidate will be expected to develop new scientific directions and lead cutting-edge research efforts in support of JGI’s Strategic Plan and Department of Energy (DOE) mission areas in biogeochemistry, carbon cycling, and bioenergy. The Research Scientist will also be expected to support a broad portfolio of JGI User Science within the microbial genomics portfolio and support an array of administrative production-related activities under the general supervision of the Microbial Genome Program Head.
Principal/Senior Data Scientist (Genomic Surveillance) - Wellcome Sanger Institute, Hinxton UK
The Malaria Genomic Epidemiology Network was founded in 2005 and has grown to become a data-sharing network with partners in over 40 countries. Our goal is to support and accelerate the use of genomic surveillance data by malaria control programs and policymakers to make effective informed decisions. As part of the national response, our team is also playing a pivotal role in the Genomic Surveillance of COVID-19, tracking and analysing the spatiotemporal spread of this disease to directly influence and guide policy makers to control the pandemic. You will contribute to supporting and influencing local and global health organisations in their surveillance activities to achieve well-informed, impactful and sustainable interventions. You will also lead the production of analysis that answer specific, relevant and crucial questions that build upon decades worth of research, as well as the production of open data resource publications to maintain MalariaGEN’s status as a world leader in malaria genomic surveillance.
HPC Systems Specialist - Senior DevOps Administrator - EPCC, University of Edinburgh, Edinburgh UK
EPCC is the UK’s leading centre for High Performance Computing (HPC) and Data Analytics with an international reputation for excellence in computational science. At EPCC we share a passion for innovation using state-of-the-art technologies, a love of solving complex challenges and a desire to impact academic, public and private sector partners in a meaningful way. We are the delivery engine for a large number of data science projects through the £1.1 billion Edinburgh City-Region Deal as well as being very active in supporting ground-breaking Covid-19 research. This role will provide a unique opportunity to lead on the development of the Edinburgh International Data Facility (EIDF), a multimillion pound "ground up" private cloud infrastructure.
Product Manager, Quantum Computing Software - NVIDIA, Santa Clara CA or Hillsboro OR USA
We are looking for a technical, user-focused Product Manager to build an exciting new product to enable researchers and framework builders in the areas of Quantum Computing. As a product manager for the product you will establish a vision, gather requirements, set product roadmaps and provide go-to-market strategies. You will work with software developers, hardware developers and research scientists to enable scientific breakthroughs in this exciting area. Define and prove – gather insight and build analysis to enter or change new markets through a clear value proposition for your product. Market intelligence – understand your developers and champion their needs through roadmaps and priority setting. Build and Deliver - work with engineering to set requirements and priorities. Sense and Respond - work closely with customers, create surveys, present at conferences, and understand product quality. Product launches – define the go-to-market strategy and contribute to the cross-functional implementation of the plan across Marketing, PR, Sales, etc.
Scientific Computing Project Manager - Leidos, Jefferson AR USA
The candidate should have experience successfully leading and managing task orders comprising cross-functional service delivery teams experienced in regulatory science, bioinformatics, advanced analytics, artificial intelligence/machine learning, modeling and simulation, data management and high performance computing. The project manager is responsible for the cost, schedule, and technical performance and will serve as the single point-of-contact for the customer on management and technical matters and contract-level issues. He/she will be the advocate/voice of the customer, be a strategic advisor to the customer, and ensure the team understands the customer’s needs, business operations and requirements. He/she will provide the planning, direction, coordination, and control necessary to successfully deliver program goals and objectives within the scope, time and budget constraints. He/she will interface with task and functional leaders, subcontractors, and support personnel.
Senior Product Manager - HPC - Amazon Web Services, Seattle WA USA
The High Performance Computing (HPC) team is looking for a talented PM to help us develop the HPC portfolio of products within AWS, which includes products like AWS Batch and AWS ParallelCluster, as well as network interfaces like Elastic Fabric Adapter (EFA). You will work closely with teams in EC2 as well as on new technology from Annapurna Labs. As a Sr PMT - External Services, you will own product development from strategy/product vision through feature definition, prioritization, positioning, naming, and GTM/adoption. We are looking for experienced product managers who are passionate about solving customer problems, and have demonstrated success working backwards from ambiguous customer needs, translating them into disruptive, successful products that delight customers. You will have the curiosity to dig into complex business problems, challenge the status quo, and bias towards using data to simplify the customer experience.