Research Computing Teams Link Roundup, 27 Mar 2020
Hi!
It’s been quite a week. Research computing teams across the world are contributing to the effort against COVID-19 by prioritizing support, computing access, data analysis, or software development. A number of vendors of cloud infrastructure and proprietary software have waived their fees, and teams are rushing to make those resources available to their researchers.
If you’re working somewhere where you can help out in those ways, it can be great to feel like you’re contributing to those very timely efforts - that you’re doing something to help.
On the other hand, if you’re not in that position, it’s easy to feel like you’re on the sidelines while important, urgent work is going on somewhere else. But that’s not the case.
The COVID-19 danger is enormous and immediate, but we face many other challenges once this one passes. Whether the research you support is in areas of climate change, environmental science and ecology, or new-generation power systems like hydrogen cells, renewable power, batteries, or the grid systems to connect them; other areas of health research like cancer or heart disease; food science for a growing world; finance and the economy — or basic research with no obvious immediate application but which could lead who-knows-where! — the world needs more, better, and faster research to tackle the many issues ahead of us and to take advantage of new opportunities we don’t even know exist yet. And our teams help power that research.
The world is discovering that it can make huge, wrenching changes in the space of days if it wants to. This newfound willingness to act - informed by the results of research from groups around the globe - puts us in a much better position to take on the challenges that we will face after the current crisis has passed.
Whether you’re working overtime now on the needs of the current crisis or putting in work that will help us deal with — or avoid! — a future issue, thank you for the work you and your team do and the research you help drive forward.
So let’s look at this week’s link roundup!
Managing Teams
Feeling Recognized at Work May Reduce the Risk of Burnout - Lab Manager
The headline says it all; this covers a study that finds exactly the titled result. There’s a lot going on right now, and your team members are feeling pressure from many directions - make sure to honestly recognize their work and their accomplishments. And right now, working at anything approaching normal productivity is an accomplishment.
Moderating Discussions over Video - Beth Andres-Beck
Working remotely and communicating online doesn’t really introduce new problems so much as it greatly amplifies existing problems that can otherwise be papered over with in-person interactions.
Some meetings are pretty straightforward and translate well to online - standups, or team status updates. But if you want to have a brainstorming meeting, or a meeting to come up with a new solution to a problem - or even to choose which problem to solve - rather than just a round table of updates, doing it virtually takes some extra thought. Moderating those discussions well takes some doing even in person - our team this week had an opportunity to see some online meetings where this was and wasn’t successfully achieved. (Taking notes on what doesn’t go well is a good starting point to running your own meetings better…)
In this article, Andres-Beck takes some lessons from quite a different environment - her small liberal arts school professors, who did a very good job of moderating classroom discussions in the humanities (which are typically much more engaging than lectures in STEM fields).
The whole article is a bullet-point list of things you can straightforwardly do, so I can’t really summarize it; do read it. Some suggestions that stood out:
- Have the other videos in gallery view, so you can keep an eye on whether people are engaged
- For up to 10 people I don’t bother with raised hands; for more than that, some video tools have a built-in mechanism, and for the rest you can use chat
- Have one person, ideally not a participant, take collaborative notes so everyone else can pay attention
- Establish your facilitation plan up-front and communicate it
- Cover the goal of the specific discussion as well, and frame any expectations
- As much as possible, prompt for specific kinds of comments, rather than using open-ended questions
- People aren’t getting the usual signals to stop talking, so don’t be shy about interrupting
How to Create the Perfect Meeting Agenda - Steven G. Rogelberg, HBR
The title oversells the blog post here, but Rogelberg suggests one useful and relatively easy way to improve agendas, and it ties into one of the bullet points above about covering the goal of the discussion clearly.
The article suggests that instead of vague agenda items like “Revisiting performance of data ingest module” or “New system uptime”, agendas should pose clear, focussed questions: “What changes could drop data ingest times from 1 hr to 10 min, as Prof X needs?” or “What are acceptable, feasible uptime requirements for the incoming data-analysis cluster?” This takes more thought, but it starts the conversation off in the right direction — and has the advantage that it’s clear when the agenda item is finished (when you have answers to the questions).
Three signs of a poor hiring process—and four ways to fix it — Cate Huston
Right now our HR departments are all tied up with issues larger than our upcoming positions, but as things stabilize into our new normal, we in research will still be able to hire — which isn’t necessarily true of our colleagues in other sectors. It may be easier to hire (because of job losses and fewer new jobs elsewhere) or harder (because everyone will want stability and not want to switch jobs if they have any choice at all), but it’s still important we do it well — our team members are the people who do the work that powers research.
This article talks about debugging a hiring process that isn’t going well. Their steps are:
- Acknowledge your bias so as to go beyond it
- Reset your expectations
- Articulate definite skills and behaviors, and why they are important
- Standardize rubrics
By and large we hire infrequently, so it’s hard to see patterns and to distinguish between “we’re not hiring well” and “the last hiring round was tough”. That makes it more important, not less, to make sure we’re doing a good job each time. Their third step - articulating skills and behaviours and why they matter - is, I think, really important.
In technical fields we tend to overemphasize “2 years of experience with [technology X]”, and I don’t think that’s helpful; if you have someone who has tackled similar problems before and has a history of learning new stuff, the lack of that experience isn’t important - and if they haven’t tackled similar problems before and have no history of learning new stuff, even 15 years of experience as an expert X-er is unlikely to be what you need.
We’ve been trying to focus on behaviours - not attitudes, not knowledge, but behaviours - that we need in the team. We come up with them by thinking about team members who have been successful, and the specific things they have done that have helped the project and the team; or specific things that we don’t have people doing that we need. So we’ve been moving away from technology shopping lists to include some responsibilities like:
- Respectfully reviewing other team mates’ code, and incorporating respectful code reviews into your own work
- Eventually represent [project] at collaboration meetings, advocating for success of the collaboration as a whole in a way that is aligned with [project]’s goals
- Design and implement solutions with minimal supervision, frequently validating the approach with teammates and collaborators
and requirements like:
- Demonstrated ability to work both independently and in a team
- Demonstrated willingness to initiate collaboration with external partners, especially international colleagues
- Demonstrated ability to learn what is needed for the task at hand
- Demonstrated tendency to improve the team’s tools and processes
Research Software Development
Legacy Code Online Conference - Wed Apr 1, 12:00 - 17:15 UTC
This is a quickly-assembled online conference on handling legacy code, “The Software Craft and Testing Conference”. It will consist of five live talks with questions via chat; the schedule should be up shortly.
Why your software should use UUIDs in 2020 - Ivan Borshchov, DevForth
UUIDs still aren’t widely used in research computing; this article gives a brief overview of what they are, the differences between formats and representations, and why they should be used unless you have some strong external source of unique IDs for your data objects.
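As a quick illustration - a minimal sketch, with the dataset URL and record names below made up for the example - Python’s standard library covers the two variants you’re most likely to want:

```python
import uuid

# uuid4: 122 random bits. Collisions are vanishingly unlikely, so separate
# processes or instruments can mint ids independently, with no coordination.
sample_id = uuid.uuid4()
print(str(sample_id))   # canonical hyphenated text form (36 characters)
print(sample_id.hex)    # compact 32-character hex form
print(sample_id.bytes)  # the raw 16 bytes, for a binary database column

# uuid5: deterministic, derived from a namespace plus a name, so re-running
# an ingest pipeline assigns the same id to the same record every time.
ns = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/my-dataset")
record_id = uuid.uuid5(ns, "plate-42/well-A3")
```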
Product Management and Working with Research Communities
Scientific Software Projects and Their Communities - Rene Gassmoeller
During my work, I have frequently noticed that some software project members have valuable scientific ideas and follow best practices for software design yet their software projects never attract a sufficient user base to establish themselves.
Research software, or research infrastructure, is of no use at all unless there is an engaged community that uses and cares about it. This article summarizes some of the work of Gassmoeller’s Better Scientific Software Communities project, which looks at a number of scientific software communities and tries to determine what the successful ones have in common.
The article - and, for that matter, the materials being assembled at Gassmoeller’s project page - are very much worth reading. Some key points:
- A friendly and welcoming atmosphere isn’t enough to attract contributors; much more important is response time:
Contributors who received feedback within 48 hours “have an exceptionally high rate of returning,” whereas “Contributors who wait longer than 7 days for code review on their first bug have virtually zero percent likelihood of returning.”
- Highly visible credit to contributors matters a great deal - probably more so in research software than other OSS efforts, since credit is the coin of the realm in our world;
- Policies and governance matter a great deal to manage the inevitable conflicts that will occur if more than a handful of people are involved;
- Leadership skills, not just technical skills, matter.
This is really important work if we want our research software and infrastructure to have impact, and I look forward to hearing more out of this project.
Organizing a Conference Online: A Quick Guide - Geoffrey Rockwell, Oliver Rossier, Chelsea Miya & Casey Germain
Two weeks ago I included another resource for putting together an online conference; this one does more to explore the range of different outcomes you might want a conference to have — what would make you consider the conference successful? — and how you could arrange a virtual conference to achieve that. What’s more, it goes into a couple of possibilities for organizing a virtual conference in ways you couldn’t or wouldn’t for an in-person one.
We’ll be attending online conferences a lot in the next few months, and that will make people more willing to take part in virtual conferences in the years ahead. If we learn to do these well, they could be very useful for conferences for small or dispersed communities, or for initial conferences to bring a community together.
Ten simple rules for providing effective bioinformatics research support - Judit Kumuthini et al
This is part of PLOS Computational Biology’s useful “Ten simple rules” series. While it’s written in terms of bioinformatics support, it could be easily used for any of a number of data analysis projects, and (with a few more tweaks) even for other kinds of research computing projects such as simulations, running systems, or (more tweaks) software development.
I think this is a useful paper to keep a link to handy - not because you or your team need to read it; you already know how to run such a project. Their ten rules follow, and won’t surprise you:
- Collaboratively design experiment
- Manage scope and expectations
- Define and ensure data management
- Manage the traceability of data
- Determine how and what metadata are reported
- Coordinate data and internet security
- Control data quality throughout the project lifecycle
- Identify suitable computational tools for data analysis
- Track, record, and confirm workflow changes
- Repurpose the data
But this document, or something based on the ideas in it, could be very useful for setting expectations with researchers or communities who haven’t worked with your team before: “This is how we do research computing projects, and here’s how they normally work.”
Cool Research Computing Projects
NSF Announces New Expeditions in Computing Awards
The most recent round of NSF Expeditions in Computing awards (next round due in June!), a program for creating multidisciplinary centres that push forward computation in science and engineering, has as usual gone to extremely interesting projects.
Of the three, I’m particularly drawn to the Global Pervasive Computational Epidemiology project - an extremely timely project, awarded before the current situation, which brings together not just large-scale modelling over networks for forecasting and inference, but also policy, economics, and sociology, with significant knowledge-transfer components.
The other projects are the Understanding the World Through Code project, combining machine learning and program synthesis with applications in small-molecule design, RNA splicing, and the cognitive sciences; and the Coherent Ising Machines project, looking at quantum annealing for optimization problems and novel applications.
Emerging Data & Infrastructure Tools
How Big Technical Changes Happen at Slack - Keith Adams & Johnny Rodgers, Slack Engineering
It turns out they happen bit by bit.
Slack wants to make sure they catch revolutions at the right time, while limiting the energy they spend chasing fads.
Slack gives their teams lots of flexibility to play with new technologies for prototypes or even small services in production, so that they’re constantly testing new stuff out:
Instead we strive to actively invest in exploring new things, knowing that most of these investments will return nothing. To bias our investment towards useful new things and away from fads, we are ruthless in killing experiments early that do not prove valuable.
And that’s the key; they are very disciplined about stopping experiments with technologies that do not have significant advantages. This generally isn’t a top-down thing, it’s more peer-based; if others don’t see the benefit and join in, the experiment just burns out.
Our teams are generally too small to allow the same volume of experimentation, which is a shame - it slows down the testing and adoption of useful new technologies. One of the reasons for this newsletter is to let teams know what other teams are playing with, to speed this up a little.
Modern Data Lakes Overview - Developer.sh
Data lakes meet a need that is more typical of enterprise than of research computing. We typically have (or have had) the luxury of collecting our data how we want it and then storing it in some nice uniform format. But as larger pools of data become available, sometimes we want to access them as they are, without taking the time to funnel everything through some uniform format even though it doesn’t all look the same.
Data lake software is a way of providing a single query interface (typically SQL or a variant) to a large number of possibly different data sources, often with a data-source discovery function as well. This article gives a quick overview of the approach; it focuses on just two newer tools (Delta Lake, built on top of Spark, and the incubating Apache Iceberg), comparing those with the venerable Apache Hive. There are a lot more such tools out there, and some of them look promising; of the ones covered in this article, Delta Lake would be an excellent choice for a shop that has already invested heavily in Spark, but otherwise I’m not sure these are the most compelling tools for us.
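To make the “single query interface” idea concrete, here’s a minimal PySpark sketch of the Delta Lake side of this - the path, view, and column names are invented for illustration, and it assumes the Delta Lake package (io.delta:delta-core) is already on your Spark classpath:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lake-demo").getOrCreate()

# Read a Delta table straight from storage; the format handles the file
# layout, partitioning, and transaction log underneath.
events = spark.read.format("delta").load("/data/lake/events")
events.createOrReplaceTempView("events")

# ...and then it's just SQL, the same as for any other registered source.
spark.sql("""
    SELECT run_date, instrument, COUNT(*) AS n_events
    FROM events
    GROUP BY run_date, instrument
""").show()
```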
Automating MySQL schema migrations with GitHub Actions and more - Shlomi Noach, Github blog
Amazon Redshift CI/CD — How we did it and why you should do it too - Doron Vainrub
CI/CD and other forms of automation are slowly gaining traction in research computing for much of our software, but much less so on the parts that connect directly to the data. And understandably so - these changes, if done incorrectly, can lead to data corruption or loss, which is one of the worst possible outcomes in research computing.
But being overly hesitant to make changes to the file formats, databases, or other stores which hold that data has downsides, too. The more barriers there are to making those changes, the less practiced and tested the change procedures become, which makes them more error-prone, which in turn leads to more hesitance to make changes…
CI testing, with schemas and data models as code that can be tried on a test branch and rolled back before being rolled into production, is a way of making these data changes testable and of building confidence in changing the data back end. These two articles talk about different aspects of making these changes and running tests in an automated way.
The first article describes GitHub’s experience with database migrations in particular, going from a very labour-intensive and largely manual process to one with many more automated steps. The article goes very deeply into the process, including a number of open-source tools and the use of GitHub Actions both to automate the steps leading up to these migrations and to implement them once approved. If you have a project that does even occasional schema migrations, there’s something to be learned from this article.
The second article takes a broader view and talks about all database-related code — tables and views, but also all the user-defined functions, stored procedures, and ETL routines they use for AWS Redshift, a hosted data-warehouse solution based on a now-old version of Postgres — and moving all of it into CI/CD. Their approach is:
- Database code is version controlled
- Database code is validated
- There’s a CI/CD pipeline…
- …and automatic deployments
They’ve developed a tool called RedCI (not yet available, though it sounds like it will be) which handles this workflow for them. I’d like to see more research computing applications adopt databases more heavily (and move code into the databases), so it’s nice to see tooling coming along to help support routine testing of such flows.
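As a toy sketch of the underlying idea (not either team’s actual tooling): keep numbered migration scripts in the repository and have CI apply them all, in order, to a scratch database on every change, so a broken migration fails loudly long before it gets near real data. Here it is with Python’s built-in SQLite; the migrations/ layout is an assumption:

```python
import sqlite3
from pathlib import Path

def apply_migrations(db_path, migrations_dir="migrations"):
    """Apply every migration script, in filename order, to one database."""
    conn = sqlite3.connect(db_path)
    try:
        # Scripts named like 0001_create_runs.sql, 0002_add_run_index.sql, ...
        for script in sorted(Path(migrations_dir).glob("*.sql")):
            conn.executescript(script.read_text())
            conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    # In CI this targets a throwaway file, never the production database;
    # any failing script raises, aborting the run and failing the build.
    apply_migrations("scratch.db")
```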
Random
Lazydocker - a terminal UI for monitoring and performing bulk actions on Docker containers.
Do you work on a Mac and have a favourite iTerm pane setup for certain kinds of tasks? Automate their creation with itomate.
An early-ish (2013) and influential book on working remotely in the modern tech era - Scott Berkun’s “The Year Without Pants”, about WordPress - is available in digital formats for free.
I finished the ebooks and email sequence for my one-on-ones quickstart; I have to say, for sending out automated emails in a sequence, ConvertKit is super slick. I wonder if it could be used for email-driven courses in research computing topics.
An awful lot of us are discovering the limitations of our workplaces’ VPNs. When things return to normal and we can make changes, Tailscale’s peer-to-peer architecture (with a hub-and-spoke control plane) looks really interesting.
Apparently csh is punk rock.
That’s it…
And that’s it for another week.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
Jobs Leading Research Computing Teams
Sr. Software Engineer Manager HPC/Parallel/Distributed Computing - KLA, Milpitas CA USA
This role requires solid software-architect skills and strong management experience: someone who is well versed in software development, organized, detail-oriented, and able to deliver results on time and within budget.
Director, Data Strategy - Putnam Investments, Boston MA USA
The Director, Data Strategy is in charge of developing the overall Data and Analytics strategy and delivering on the enterprise data management roadmap by working closely with business partners across Retail, Institutional, Investments, and Operations. This role ensures that Data and Analytics goals are aligned with the business’s mission, strategy, and objectives.
Operations Lead - High Performance Linux Computing Environment - Merck, Austin TX USA
Will lead the operations activities of the High Performance Computing (HPC) team by working closely with HPC Architects, DevOps Engineers, Solution/Workflow Engineers and Research Scientists to implement/script/operate both on-premise and cloud based infrastructure used in supporting scientific workflows. This technical role requires deep knowledge and experience with large computing environments.
Advanced Computing, Mathematics & Data Division Director - Pacific Northwest National Laboratory, Richland WA USA
The Division Director will work closely with the Director of the University of Washington-PNNL Northwest Institute for Advanced Computing where University of Washington faculty, staff and students and PNNL staff collaborate to advance the use of computing in discovery and a broad range of application areas.