Research Computing Teams Link Roundup, 7 Aug 2020
Hi!
Last week’s “Ask Managers Anything” question was “How are you making sure that junior staff get access to mentoring when everyone is working from home?” I got several replies back; paraphrasing them for anonymity:
- I don’t have any junior staff as direct reports, so it hasn’t been an issue
- We’ve been leaning heavily on whiteboarding tools like Mural or Miro to have conversations about how things work, and asking our junior staff to either present back or write up the discussions as documentation to make sure they’re learning
- Our team has experimented with giving new junior staff small but increasing stretch assignments with lots of contact-hours for asking questions and getting feedback
- We’re struggling with this too, look forward to hearing what others say!
- We’ve standardized on VS Code so we use Live Share, along with Teams for chat/teleconference, to run pair programming sessions; it’s not perfect but it’s worked for us.
Thanks for your answers! I may try a couple of these myself.
This week’s top question is, “How do you start difficult conversations with team members?” My own answer (which echoes an article in the roundup list this week) is to try to reduce the need for them by having less-difficult (but still difficult) small conversations early. And when they are necessary, I try to think it through and then start the conversation, with a pretty firm view about what a desired big-picture outcome looks like but with an open and questioning mind about what the underlying issue is and how to address it. I … don’t always succeed, but I’ve gotten noticeably better/less bad at both the early and later conversations over time.
How about you - do you have any advice for our question asker or the community? How do you start difficult conversations with team members? Do you have a go-to phrase that helps or a process? Let me know (just respond to this email) and whether or not I can credit you by name and I’ll include it in the roundup. And feel free to ask (and vote on) questions for next week here.
In other news, some of you may also have noticed some different URLs to the job board amongst other things; I’m slowly moving the newsletter and related information on to its own website, www.researchcomputingteams.org, so it’s less about me. Input as always is welcome (feedback is a gift - PRs doubly so!). If all goes well, I’ll finish the migration next week.
Managing Teams
Handling difficult conversations - Rachel Hands, Managing Equitable, Effective, Teams
As above, difficult conversations don’t get easy, but they do get easier. And once you’re a manager, as Hands says,
It’s imperative that you, as a manager, initiate tough conversations when the need arises.
There’s no way through but through, though, so Hands recommends identifying what’s making you uncomfortable about having the conversation so as to defuse it a little bit, and then focusing on the (very specific, observable) issue at hand that you’d like to start addressing. Then during the conversation, listen a lot, be open to solutions (or even restatements of the problems), and don’t get sidetracked.
None of that makes it easy, but focussing on future outcomes and being aware of what’s making you uncomfortable helps.
Hands points to similarities with the feedback model she advocates, and I’ll just add as above that giving frequent, specific, early feedback can greatly reduce the number of Big Conversations like this that you need to have. Many short slightly difficult conversations >> fewer very difficult conversations.
Create space for others - Will Larson
One of the hardest things about a transition to leadership, either on the people-manager or technical-leadership track, is stepping further and further back from directly making contributions and spending more time making room for others, nurturing their contributions, and gathering their input. In this article, Larson describes how that works at the Staff+ Engineer level at large tech companies.
How to Turn around a Disengaged or Underperforming Employee - Lighthouse
Tactical Challenges In Hiring Junior Engineers - Cindy Sridharan
These two articles benefit from being read together. The topics are quite different but they both speak to the need for managers to invest time in new and/or struggling team members.
In research computing we tend to both not reassign or remove employees who aren’t good matches and not invest enough in employees who are struggling. It’s a bad combination: it hurts team morale, it hurts the struggling team members, and it hurts the research we’re trying to support.
The first article talks about what’s necessary in coaching an underperforming team member. It’s a lot of work, for both you and the team member. I think the blog post lays it out well, and if the topic interests you, you should read it. The only things I’d add are:
- You can often - not always, but often - avoid going through this process by providing small frequent feedback earlier, rather than waiting for things to become A Big Problem.
- This process will not always be successful. I really like Roy Rapoport’s article on five conditions that have to be met for improvement to occur.
The second article is very upfront about what’s involved in hiring junior team members:
I strongly believe that if a team isn’t willing to invest at least 1–2 years, they shouldn’t be hiring junior engineers.
This is especially true in research computing. Our junior staff tends to be straight out of undergrad or coming from a couple of years in industry; our senior staff tends to be out of Ph.D. programs/postdocs. Not only is there a gap in experience, but there’s a huge cultural gap. The process and mindset of doing research will be completely new to them. It’s good that they’re bringing in new mindsets and approaches! We don’t want to quash that. But the cultural difference will have to be bridged to make sure communication works and expectations are clear.
Managing Your Own Career
How to make your data science team faster (and speed up progress) - Gregg Detre, Making Data Mistakes
I’ve got this under manage your own career because the article is really about having tough conversations with your own manager about your team’s (perceived) progress. When talking with your manager, like any other stakeholder, the reported symptom (your team isn’t going fast enough) may be pretty different from the actual underlying problem, so the key is to not get defensive and to not jump to conclusions about what you interpret the problem to be but to dig in and get more information about what the specific cause for concern is so you can address it better. Feedback is data, but data is generally noisy and incomplete; sometimes you need to collect more data before you know what the right next step is.
Product Management and Working with Research Communities
My Screencasting Workflow - Laurie Barth
Screencasts can be very useful for training materials. Very experienced screencaster Laurie Barth offers her workflow here. Barth records the screencast without audio, then adds annotations, and only after that does she record an audio track narrating it. That extra step seems like it would take extra time, but I’ve tried the whole narrating-while-typing thing before and it’s pretty hard and took either extra takes or living with lots of “umms”. I’ll try this next time.
Building digital workforce capacity and skills for data-intensive science - OECD Science, Technology and Industry Policy Papers
This white paper takes a careful look at what the workforce needs will be to enable data-intensive science in both the public and private sectors. They take a close look at 13 case studies, and it’s worth reading if you’re interested - it’s only about 40 pages. Maybe most crucially, a key take-away of the report is:
There is a need for both digitally skilled researchers [..] and a variety of professional research support staff, including data stewards and research software engineers.
This isn’t surprising, but as just the latest of the calls for professional recognition of the sort of work research computing staff perform, it’s good to see. Also called for are career ladders for such staff within universities and other research institutions.
Cool Research Computing Projects
Prevalence of multiple forest disturbances and impact on vegetation regrowth from interannual Landsat time series (1985–2015) - Hermosilla, Wulder, White, and Coops
A nice example of what people will do as soon as datasets are made available. Canada’s Landsat data over 20 years is open access, and this team cleaned and integrated all of the data (from 4 satellites) into a massive number of time series, using change-detection techniques to analyze disturbances. While wildfires were the greatest source of disturbance, areas exposed to anthropogenic disturbances were much more likely to see a distinct second disturbance later on in the time series.
Research Software Development
Julia 1.5 Highlights - Jeff Bezanson & Stefan Karpinski
I’m still a little wary of adopting Julia after having my heart (and code) broken quite a few times in the lead-up to 1.0, but Julia’s power in creating really cool and performant DSLs for particular problem areas - look at JuliaStats, DifferentialEquations.jl, JuliaFEM, and others - is hard to deny.
1.5 has some cool features, including sending time-travelling reproducible bug reports using the rr tool, which has made an appearance on the newsletter before. It also has more of the same things which have always worried me about the state of the product management - a relitigation of how scoping works (!) with a compromise about how it will work one way in the REPL and another way in compiled code (!!) with appropriate warnings.
Still, it’s a cool language and people are building neat things with it, especially in numerically-intensive areas of research computing.
Writing and publishing a Python module in Rust - William Woodruff
One of the things I’ve always liked about Python for research computing is that it lets you prototype things quickly and then, once things are working, swap out slow pieces of code for faster compiled modules. In this article, Woodruff describes the surprisingly simple journey of writing and publishing a Python module written almost completely in Rust. It sounds like there’s still a couple of rough edges around distribution, but it’s remarkably simple, even compared to C- (and especially to C++)-based code. This will be interesting to watch.
Getting Started with Dafny: A Guide - Microsoft Research
Dafny is a research programming language by Microsoft that allows you to add pre- and post-conditions to routines (or even blocks) which are then verified statically upon compilation - essentially making just building your code equivalent to running a large suite of unit tests. The Getting Started Guide is an interactive introductory tutorial for the language.
This sort of incremental “hardening” of code could be extremely useful for research computing (especially for development of code that relies on subtle numerics or algorithms). Computers are way better at the kind of subtle book-keeping that ensures loop invariants are held or assumptions are valid than we are, and anything that pushes such bookkeeping to the computers is of interest to me.
(Relatedly - a theorem-proving language has just recently been used to show that a 30-year-old oft-quoted result for Bloom filters is wrong, and to prove the final result.)
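To make the pre-/post-condition idea concrete, here is a minimal Python sketch. It is not Dafny and it is only an approximation of the idea: the `integer_sqrt` routine is a made-up example, and these assertions are checked at runtime on each call, whereas Dafny would try to prove the equivalent `requires`, `ensures`, and `invariant` clauses statically for every possible input.

```python
def integer_sqrt(n: int) -> int:
    """Return the largest r with r*r <= n, checking its contract at runtime."""
    # Precondition (roughly Dafny's `requires n >= 0`).
    assert n >= 0, "precondition violated: n must be non-negative"

    r = 0
    while (r + 1) * (r + 1) <= n:
        # Loop invariant (roughly Dafny's `invariant r * r <= n`).
        assert r * r <= n
        r += 1

    # Postcondition (roughly Dafny's `ensures r * r <= n < (r + 1) * (r + 1)`).
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r


print(integer_sqrt(17))  # prints 4
```

The difference that matters is when the bookkeeping happens: here a violated assumption only shows up if a test or a production run happens to hit it, while a verifier would reject the code at build time.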
Research Computing Systems
Drawing good architecture diagrams - Toby W, (UK) National Cyber Security Centre
A nice overview of drawing architecture diagrams. The article makes the point that the diagram is about communicating, and if it doesn’t communicate the key points of the system to the readers, then it’s not succeeding. I like this advice:
Start with a basic high level concept diagram which provides a summary. Then create separate diagrams that use different lenses to zoom into the various parts of your system.
Having multiple diagrams of the same system viewed through different “lenses” seems more likely to be a success than trying to cram everything into one diagram.
An Introduction to ZFS: A Place to Start - Nick Fusco, STH
This is a nice simple introduction to ZFS for those who might be thinking of using it with Linux. It describes how ZFS divides storage into units (disks, VDEVs, and pools) and how ZFS’s approach to RAID works within a pool, the advantages (snapshots) and disadvantages (IOPS, fragmentation) of the Copy-on-Write approach and journalling, and why ZFS in particular benefits from SSDs. It also covers, a little bit, some key configuration parameters.
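For anyone who hasn’t met Copy-on-Write before, here is a toy Python sketch of why it makes snapshots cheap. This is purely illustrative: the `CowStore` class and its methods are hypothetical names of my own, and real ZFS uses a tree of checksummed block pointers rather than a flat table.

```python
class CowStore:
    """Toy copy-on-write block store (hypothetical; not ZFS's real layout)."""

    def __init__(self):
        self.blocks = []     # append-only "disk": blocks are never overwritten
        self.live = {}       # file name -> index of its current block
        self.snapshots = {}  # snapshot name -> frozen copy of the pointer table

    def write(self, name, data):
        # Copy-on-write: new data goes to a fresh block; the old block stays intact.
        self.blocks.append(data)
        self.live[name] = len(self.blocks) - 1

    def snapshot(self, snap_name):
        # A snapshot copies only the small pointer table, not the data itself.
        self.snapshots[snap_name] = dict(self.live)

    def read(self, name, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]


store = CowStore()
store.write("results.csv", "v1")
store.snapshot("monday")
store.write("results.csv", "v2")
print(store.read("results.csv"))            # prints v2
print(store.read("results.csv", "monday"))  # prints v1
```

Note that the rewrite of results.csv landed in a new location rather than in place, which gives a feel for where the fragmentation cost comes from.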
Emerging Data & Infrastructure Tools
Modernizing the HPC System Software Stack - Allen, Ezell, Peltz, Jacobsen, Roman, Lueninghoener, and Wofford
This paper - by authors at the DOE NNSA facilities that know a thing or two about running large “big iron” HPC systems - advocates for drastically updating a traditional HPC stack:
the HPC community has allowed system software designs to stagnate, relying on incremental changes to tried-and-true designs to move between generations of systems.
(preach!)
They argue for more service nodes using modern, horizontally scalable (and thus resilient/available) cluster services, managing extremely bare compute nodes that support something like containers for jobs, more configuration management, better state management (and thus security) and orchestration, pointing out specific places where the HPC community can learn from what is going on in the broader big-computing world.
I think most or all of these system stack updates would be welcome, but a bigger (and possibly prerequisite) shift will have to be in operations culture; focusing on customer-facing service levels and metrics, which can then lead to “chaos monkey-lite” approaches like Slack’s Disasterpiece Theatre; this sort of focus will, I believe, lead inevitably to more modern system setups in HPC, just as it has elsewhere.
Calls for Proposals
2020 Workshop on Languages and Compilers for Parallel Computing - Submission Deadline 13 Aug, Virtual Conference 14-16 Oct
A long-running (1988!) conference on parallel programming systems.
5th IEEE International Conference on Fog and Edge Computing 2021 - Papers due 3 Jan 2021, event 10-13 May 2021, Melbourne Australia
Readers will know that I’m cautiously optimistic about Fog and Edge computing for data gathering and processing in place for research computing applications. This is one of the big events in the field.
Events: Conferences, Training
Containers in Production - Online, 10-11 Aug, Free
OpenDev is having a 2 day workshop covering topics like containers and OpenStack, Security and Isolation, Telco and Network Functions, Bare metal and containers, and Acceleration and optimization. Full schedule is here.
High Performance Computing Autumn Academy - Online, 7-18 Sept, Application due 14 Aug, £350-990
Cambridge’s Centre for Scientific Computing is holding its autumn HPC course online this year. “The overall aim of this course is to provide course attendees with a strong background in programming techniques suitable for general scientific programming.”
Random
Scientists rename human genes to stop Microsoft Excel from misreading them as dates - this was such a pervasive and recurring problem that scientific nomenclature was changed to avoid the problem.
Using Travis, with Travis secrets and encryption, to deploy via ssh.
A fork of make with tracing and a debugger.
Migration stories are always interesting. Here’s the story of how LinkedIn rewrote its messaging infrastructure while keeping it up and running the whole time.
A nice story on the 60-year history of algorithms for multiplying very large numbers, and the recent proof of an n log n algorithm (but not a lower bound!) involving FFTs.
Termpdf.py, a PDF file viewer for the terminal.
A couple neat open source tools from Dropbox - a non-dumb password strength estimator and Broccoli, a blocked compression algorithm optimized for file syncing.
Shournal - ever used script to record a terminal session, or history to reconstruct what you did? This records not only shell history across sessions, but also (using fanotify) what files were touched (even if they don’t appear explicitly in the command line) for improving reproducibility of processes.
Herbie, a really neat web service for rewriting floating-point expressions to improve accuracy.
Parabol, with free plans for up to two teams, looks like a useful tool for retrospectives.
That’s it…
And that’s it for another week. Respond with any input you have - answers to the “difficult conversations” questions, or any feedback for me at all.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
Jobs Leading Research Computing Teams
Highlights below, all jobs at the jobboard.
Data Science and AI Junior Project Manager - AstraZeneca, Various - US, Sweden, UK
We are recruiting for a talented Data Science & AI Junior Project Manager who can lead small projects or support large projects/programmes. You will have a working knowledge of project management methodologies, tools and templates and contribute to the development and maintenance of work products or change programmes. You will also ensure that business requirements are effectively captured, and will be responsible for effective tracking and reporting of project management information and highlighting and supporting resolution of areas of risk in project delivery.
Senior Data Scientist, AI@Unity - Unity, Vancouver BC CA
We are looking for a Senior Data Scientist to join our centralized data science team that supports multiple products and teams at Unity. As a Senior member of our team, you will be responsible for not only providing solutions, but identifying new opportunities within our data and solving them with machine learning techniques. Our team develops and implements machine learning models to production and intuitively makes decisions around what approaches to take in solving hard problems.
Research Programming Manager - UCLA Health, Los Angeles CA USA
Reporting to the Director of the Center for Integrative Connectomics (CIC), the Research Programming Manager has primary responsibility for network infrastructure management and systems administration for a large research group with internal applications that support teaching, scientific research, administration and operational activities. In this role, you will provide technical support for a diverse set of end users consisting of faculty, postdocs, researchers, students and administrative staff. You will also meet regularly with the Informatics team to develop and implement technical solutions, discuss long range planning and recommend enhancements/upgrades that leverage computing resources, all while maintaining general oversight of internal proprietary software, databases and web sites developed for the CIC.
Scientific Computing Product Manager - GSK, Collegeville PA USA
The HPC and Scientific Computing Product Manager functions as an intermediary who ensures clear and effective communication across Scientific Computing, R&D wide scientists/teams and stakeholders who depend on scientific computing to support their objectives, and with executive management to ensure effective capturing of ROI impact and alignment with broader R&D strategies. The individual will build strategic partnerships with both internal and external groups and institutions to identify opportunities where GSK’s Scientific Computing resources and expertise can be leveraged to accelerate the timeline and improve the success rate of drug discovery using state-of-the-art computing techniques and/or emerging technologies and platforms. They will be responsible for communicating platform (e.g. High-Performance Computing [HPC]) processes to users, including environment changes, implementation timelines and progress as well as training and documentation.
Senior Technical HPC Manager, NASA - ASRC Federal, Moffett Field CA USA
MAKE A DIFFERENCE on a team that supports a 11,000+ node, 7.24 petaflop supercomputer system with over 50 petabytes of temporary storage and an exabyte capable archive system connected in one of the world’s largest InfiniBand fabrics. You will be an active member of the ASRC Federal account leadership team reporting directly to the Site Lead and working with our NASA customer to oversee the HPC and non-HPC systems groups. Keeping stakeholders informed, you will be attending regularly scheduled customer meetings to report status on all on-going activities within your domain as well as answer customer inquiries concerning all aspects of the various projects.
Program Lead, Data Repository - Canadian Institute for Health Information, Toronto or Ottawa ON CA
The Program Lead for the ODT (Organ Donation and Transplantation) Data Repository works closely with the manager to successfully lead the development of a new pan-Canadian data repository for ODT. They are responsible for the overall development and operational management of the data repository, and associated information products and services. In development phase, they will work closely with internal stakeholders such as Information Technology and Services (ITS) and external partners such as Canada Health Infoway (Infoway) and data providers to successfully capture and integrate high quality and timely ODT data from providers into the data repository. The lead will supervise an interdisciplinary team to meet key project deliverables.
Group Manager – Advanced Computing Operations - National Renewable Energy Laboratory (NREL), Golden CO USA
Advanced Computing Systems and Operations specifies, procures, and operates HPC, cloud, and other mission computing systems. HPC capabilities include a supercomputing user facility service supporting all renewable energy and energy efficiency programs within DOE. This leader will serve as a Group Manager reporting to the CSC Center Director and lead Advanced Computing Systems and Operations with a focus on excellence in organizational management and in developing strategic capabilities and programs in support of the center and lab’s vision, strategy, and mission execution. This will require partnering with peers across the laboratory to leverage advanced computational capabilities on behalf of the mission work of NREL.
Lead Software Developer - Dunlop Institute, University of Toronto, Toronto ON CA
The incumbent is responsible for acting as the lead software developer for a major scientific nationwide project funded by the Canada Foundation for Innovation (CFI) to convert the enormous raw data streams from next-generation radio telescopes into sophisticated digital databases that astronomers can use to make new discoveries. Key responsibilities include: coordinating and overseeing software development activities between six Canadian universities and other partners; developing uniform codebase through a central repository; and ensuring that all components are interoperable and meet scientific and technical specifications. The incumbent also coordinates, prioritises and oversees the project’s software milestones, feature requests and bug reports; participates in the project’s management committee; and facilitates multi-site collaboration.
Technical Director of REN-ISAC - Indiana University, Bloomington IN USA
The TD participates in and supervises engineers and analysts engaged in developing and providing the technical infrastructure and services for the Research and Education Networking Information Sharing and Analysis Center (REN-ISAC) members. The TD manages the development and integration of systems supporting the REN-ISAC’s role in serving as the Computer Security Incident Response Team (CSIRT) for research and education networks in the U.S. The TD coordinates with other organizations at IU including the Global Network Operations Center, security staff, network engineers, and research laboratories to meet the goals of the REN-ISAC and to provide the REN-ISAC members services. The TD coordinates the development and maintenance of appropriate procedures, checklists, training materials, and maintenance manuals for the REN-ISAC systems, assuring the technical and operational readiness of the REN-ISAC staff. The TD develops, implements, and maintains complex secure, trusted communications mechanisms for members and staff. The TD works with REN-ISAC members to define requirements for technical services.
Applied Research Manager, Acoustical Modeling - Facebook, Menlo Park CA USA
The ideal candidate is an experienced research scientist and manager with a proven track record of technical accomplishment, project management, and communication skills, and a passion for speech technology and voice interfaces. Manage a team of speech scientists and software engineers to design, build, and ship speech technology to fulfill current and future product requirements. Set technical direction and establish a research agenda leading to improved performance on Facebook use cases. Be a subject matter expert able to hold your own in technical discussions and influence product strategy.
Lead Biostatistician - Premiere Research, Winnipeg MB CA
Our Biostatisticians apply knowledge of statistics to independently provide statistical consulting, assist with study design and protocol development, and perform statistical analysis of clinical trials. They will also review project related documents, prepare statistical analysis plans (SAPs) and statistical reports. This individual will develop analysis data specifications, create analysis datasets, Tables, Listings and Graphs (TLG) of clinical trial data using SAS. He or she will also perform quality control of TLGs and derived data sets created by others; Develop and validate SAS programs, macros, and utility tools; Apply advanced programming knowledge to support programming efficiencies.
Head Of Research - European Bioinformatics Institute | EMBL-EBI, Hinxton UK
As Head of Research you will report directly to EMBL-EBI’s Co-Directors and play a key role setting vision, strategy and scientific direction for the Institute. The Head of Research also provides oversight of EMBL-EBI’s research portfolio, similar to a Head of Department.
The Head of Research will chair the Group Leader appointment panel for dedicated research group leaders at EMBL-EBI and provide decisions on EMBL-EBI research resources. As part of the leadership group you would represent the Institute within EMBL, to the EMBL Council, and externally to the scientific community within Europe and worldwide.