Thanks for your comments about earlier hiring and feedback posts. I’m taking those responses and getting some help pulling those together, incorporating the input, adding more material, and putting into some kinds of coherent wholes that can be made more widely available (like the getting started with one-on-ones material). As always, I appreciate your comments, questions, suggestions, and feedback - please always feel free to hit reply or email me at firstname.lastname@example.org at any time.
For now, the roundup!
It’s not personal, but once the pandemic is over and some kind of normalcy returns, a lot of people are going to leave their jobs. The past year has been stressful and exhausting, and even if their job wasn’t a big part of that, a lot of team members are just tired with everything and are going to be looking for a change.
So while it’s always good to be prepared for any given team member to leave (make sure everything’s always documented, use one-on-ones to have a good understanding of what everyone’s working on, use techniques like pair programming/PR reviews or talks and demos to disseminate knowledge), we should be doubly prepared in the coming months.
The post on NOBL makes the following suggestions:
It may seem odd to celebrate the team member leaving, but basically none of your team members are going to keep their current job with you until retirement. They will all move on, and new team members will join in their place, that’s good and healthy, and we want to mark these moments like all life transitions - celebratory, even if a little melancholy.
To the list they provided, I’d add: make sure you have the documents - job description, evaluation framework, interview questions - in place to be able to post a job ad for a replacement for anyone on a moment’s notice. (If you don’t - review the hiring discussion we had in #59, #60, #61) .
These two articles point out the power of fairly simple (but tough to do) behaviours to move projects or conversations forward.
In the first, Kaplan-Moss tells us about the power of just sitting down and grinding through a lot of manual work to make a big impact. The work example hs cites is simply spending the hours of time needed to go through a huge issue backlog, categorizing and labelling and prioritizing. With that done, the team was now unblocked on closing a number of the tickets as duplicates, seeing the patterns in what needed to be done, focussing on the highest value issues, and plowing through the backlog.
I’ve noticed this too in my own career - sometimes just bearing down and doing a bunch of unglamorous manual work can shift project momentum in ways it needs to go, or bring you to the table for conversations because you now have produced something of value that wasn’t achievable another way.
In the second article, the author points out the power of just staying silent when facilitating conversations. This is especially important to us as managers with some authority. Being silent for what feels to you an incredibly long time will eventually prompt others to contribute.
Dirty Escalations: Making Frenemies and Pissing Off People - Chase Seibert
So Manager-Tools would tell us, correctly, that escalation is a very broad term that means any kind of communication of increased urgency/importance - bringing a due-date of a deliverable earlier, going from an email to a phone call or a quick video chat, etc. That is all true, and “escalation” is also widely used to mean specifically raising an issue up the organizational ladder, and until we have a term specifically for that kind of activity, people will continue to use escalation to mostly mean that.
Seibert talks about the “to the manager” kind of escalation, what the problems are, and how it can go bad. The suggestion is to have the two team members jointly escalate to their manager(s) with a co-authored document which then maintains the context as it goes up the chain. Not only is that a pretty good approach, it even works when the team members share a manager and that manager’s you. You want to avoid being the tie breaker of first resort, and encouraging your team members to author something and send it to you is a good step towards having them get it mostly sorted themselves.
Ramadan Mubarak! - Ramadan Tips for Non-Muslim Friends of Muslims - Fahmida Kamali
Ramadan is here, and Kamali offers some tips for those of us who are non-Muslims to be supportive to and non-weird around their Muslim colleagues during this period of fasting. As Kamali points out in a twitter thread, working from home does not make it easier to fast (do you find yourself eating less these days?) and we could all use a little extra support right now. In this slide deck, she offers some background and some tips:
Expasy, the Swiss Bioinformatics Resource Portal, as designed by its users - Séverine Duvaud, Chiara Gabella, Frédérique Lisacek, Heinz Stockinger, Vassilios Ioannidis, and Christine Durinx, Nucleic Acids Research
It’s fair to suggest that for a lot of research computing product development, feedback from users comes towards the end of the process. With the re-design of Expasy, the Swiss Institute for BIoinformatics’ bioinformatics resource portal, the developers used a more tech-industry style of user input, doing a usability assessment study of their existing portal (one of the first 150 websites in the world!), looked at existing user flows, and looked at competitors. Then they proposed some broad approaches to key stakeholders, and upon feedback from there, started iterating with wireframes with real users.
They implemented the new site without breaking old links, making sure that about 5000 URLs which they knew were being at least occasionally used from their analytics were redirected.
The article is a full overview of their process if you’re interested in user-centred design and wondering what it might look like for your products.
Four projects receive research computing support through institute’s seed grants - Penn State Institute for Computational and Data Sciences
Funding for research computing and data support is kind of a mess. Most PIs don’t need multiple FTE-years worth of one specific kind of technical support for their projects, hiring them is hard anyway even if you do have the expertise to evaluate their skill sets, and that’s if you could get the funding. More common is that the need is for a smattering of different skill sets for a few months at a time, but that’s even harder to fund, and to procure. So what typically happens is that some poor grad student gets the technical work dumped on their desk even though it doesn’t fall into their area of expertise.
The only real alternative I see is an agency model, where a PI (especially at the start of a project) can get a short-term team of expertise for consulting and starting a project, and engage with them as the project progresses as needed.
Penn State is doing this with a seed grant program for working with experts at ICDS. Crucially, it’s not just for software development:
Faculty members can apply for support hours from one or more RISE team members for assistance with databases, visualization, code optimization, application development, science gateways, cloud computing, cyberinfrastructure development or other skillsets needed for research in computational and data sciences.
Ideally (I’ll write about this more) the same grant could also involve computing/data hardware access, and would be transferrable between multiple such “agencies” rather than just in-house. But for the majority of capacity needs for research computing and data support, this is the most viable path as far as I can see.
Singularity Registry (HPC) - Vanessa S.
Making a second appearance in two consecutive weeks, this time Vanessa has put together a Singularity repository client that works with lmod to provide singularity instances as module files for HPC systems. The singularity images that are available can be seen here; they include a number of hard-to-build bioinformatics tools, and images that support GPUs.
A sysadmin can make sure that the module system calls
module use /opt/lmod/shpc
and then, e.g.,
module load tensorflow/tensorflow/2.2.2
would load the tensorflow singularity module, and running “tensorflow-tensorflow-shell” would transparently run the singularity image as if it were a local executable file.
Alternatively and maybe more excitingly, a user can setup shpc and have access to those images.
NVIDIA Enters the ARMs Race with Homegrown “Grace” CPUs - Timothy Prickett Morgan, Next Platform
Optimizing Data Movement in GPU Applications with the NVIDIA Magnum IO Developer Environment - NVIDIA developer blog
CUDA Python - Developer Preview - NVIDIA developer blog
The biggest research computing news this week came from a number of announcements made at GTC ’21.
The splashiest news, and one that got a lot of people talking, was NVIDIA’s announcement of their upcoming ARM server CPU, Grace. It is aimed squarely at the AI/HPC market, and will be part of a more integrated fabric with memory and GPUs, with wild proposed numbers for memory-to-CPU and memory-to-GPU bandwidth (that maybe don’t seem so wild after Apple’s M1 breakout). The Grace CPUs will be available in 2023, and will be featured in the Swiss National Supercomputing Centre’s next big system as well as a system at Los Alamos National Labs.
Morgan puts the Grace CPU in useful perspective, connecting it back to NVIDIA’s “Project Denver” started a decade ago, and suggests some likely architectural considerations.
While the CPU news is exciting - and with Graviton, M1, and now Grace it’s hard to view ARM as an upstart for research computing anymore - what strikes me as more meaningful is the increasingly integrated hardware and software stack NVIDIA is proposing. As an example, the GRACE CPU, it is said, will work seamlessly with NVIDIA’s HPC compiler and libraries.
The integration continues with the announcement of the Magnum IO developer environment, which takes existing high-speed communication libraries (NCCL for between-GPU communications, NVSHMEM for shared memory, UCX as a backplane for between-node communications, GDS for GPU-to-storage data transfer, and the cuFile APIs) ties them into common profiling tools, and suggests that they will become increasingly tightly integrated (taking work already done with NVLink and of course Mellanox).
Finally, NVIDIA clearly prefers Python for its non-compiled glue language connecting components; there’s now a preview of a CUDA Python (described more in this NVIDIA blog post) which includes a lot existing packages and adds a common, official, Python API to CUDA functionality which will (eventually) improve interoperability between a number of pieces of the CUDA python ecosystem.
It’s an ambitious programme of effort, and it will be really intriguing to see how successful they are over the next couple of years. Those who can afford it can get a glimpse at this future more integrated stack by applying for access to an NVIDIA ARM Developer Kit which honestly seems like it would be a lot of fun to play with.
Docker without Docker - Thomas Ptacek, Fly.io
Fly.io, which showed up in the Random section of #62, has an increasingly powerful compute-at-the-edge framework - you can give Fly.io docker containers and it will run them out in their CDN, close to your users (or data collection sensors, say).
But they don’t run the services as Docker containers - they run inside lightweight Firecracker VMs (e.g., #59). In this post, Ptacek shows how Fly.io allows users to push a docker image to a secure docker repository, and the steps that are gone through to unpack the docker image (showing how the docker repository API and docker layered file stems work), how those are are bundled up into a loopback filesystem for the Firecracker VM, and how that’s started up. Using Docker as a format for the image lets them make use of Docker’s nice and widely-used tooling for both the users to create them image and for them to interface to (e.g. they run a containerd instance to cache docker images), even if they don’t actually execute the images that way.
The article links to a number of other Fly.io blogposts that might be of interest, such as that for their SSH infrastructure using wireguard.
2021 US-RSE Virtual Workshop - Abstracts due, May 3; workshop 24 and 27 May
Calls are out for 15 minute talks or breakout discussions on topics such as:
ELIXIR Webinar: Towards professionalising data stewardship - 18 May, 15:00 CEST, Free
There has been an effort to professionalize research software development careers; in my view, this effort has mistakely siloed software development from other areas of research computing and data. Areas like research data management, curation, stewardship, and governance also need such recognition and professionalization of career paths. The private sector, or even academic health research is well ahead of us here; the job board routinely has a dozen or so managerial jobs for such jobs in the private sector, particularly in regulated industries like health and finance.
In this workshop, authors of a recent Netherlands report on data stewardship competences, training and education will discuss their reports and steps needed in the Netherlands and abroad.
EDUCAUSE Cybersecurity and Privacy Professionals Conference - 8-10 June, Online - early registration $247 members, $495 non-members
Tracks for the conference include:
Nice post walking through atomics, fences, and memory ordering.
A post extolling VSCode’s new-ish DevContainers for not only deploying but writing code in a container.
JupyterLab 3.0 is out.
A collection of useful GitHub Actions snippets.
A deep dive into CORS, how it works, and best practices.
Meet taskell, a small command-line and terminal based kanban board that lets you import trello boards.
Cloudflare pages is now generally available. I just migrated a static website with a GitHub-to-Travis-to-S3-to-Cloudfront-with-AWS-Certificates deploy cycle to GitHub-with-Cloudflare-Pages and so far everything is way easier. Worth checking out, especially if you already have a GitHub pages site that you’d just like to make faster.
Reminder that PHP is still a thing and it’s maybe better than you remember.
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
This week’s new-listing highlights are below; the full listing of 130 jobs is, as ever, available on the job board.
Director, Research & High Performance Computing Support (RHPCS) - McMaster University, Hamilton ON CA
Reporting to the Vice President, Research, the Director will be engaged in the support and direction of research computing services at McMaster in their capacity as Director of RHPCS . RHPCS partners with McMaster’s AVP and CTO office, University Technology Services (UTS), Computer Services Unit (CSU) in the Faculty of Health Science, the Privacy Office and all other Faculty and campus IT personnel to provide services for all researchers at McMaster. The Director works with government, industry, and partner organizations such as Compute Canada, the New Digital Research Infrastructure Organization (NDRIO), ComputeOntario, SOSCIP and others, to deliver services that enable McMaster researchers to have access to state-of-the-art digital research infrastructure required for the computational and data intensive research they undertake. The Director leads a team of information, technology and other professionals that provide support across multiple domains and inform the research community of relevant policies, such as emerging data standards, and privacy and security compliance requirements. The Director co-chairs the IT Governance Research Technology Committee and ensures that research technology initiatives adhere to the IT Governance processes.
Project Manager – Scientific Software Development - Atomic Weapons Establishment, Reading UK
We are looking to recruit a scientist with experience of project management and software development in a scientific context to work on the Nuclear Test Monitoring programme, supporting AWE’s national nuclear security role. You will work as part of a team advising the UK Government on the technical verification of the Comprehensive Nuclear-Test-Ban Treaty (CTBT). You should have knowledge of software development and computer programming, including languages such as Python, C/C++, Java and scripting languages, along with knowledge of database querying and management e.g., SQL and experience of working in a UNIX/Linux environment. You should also have experience of project management/delivery and organising technical work. You should be strongly self-motivated, able to work collaboratively with both the UK NDC scientists and the UK NDC operations team and have good communication and presentation skills. Ideally you should have experience of working with geophysical data.
Senior Data Science Manager - Marine Stewardship Council, London UK
In what is a new role for the MSC, the Senior Data Science Manager will be responsible for the enhancement and growth of the Science & Standards data science function. The team will be aligned to the mission of the MSC, and will encompass research insights to support the MSC’s standards and assurance system, develop efficient collaborative coding workflows, and will provide best practice advice on quantitative research techniques and open reproducible science. You will also be creating links with relevant external research organisations for data intensive projects.
Senior Data Capability Manager - Cancer Research UK, Stratford UK
This role works within the Data Capability and Operations team to lead the implementation of data capabilities to support CRUK’s Data as an Asset strategy. This exciting role will be responsible for data policies, data standards and data quality for the data management across CRUK’s applications. This is a highly collaborative role and you will be working with colleagues across technology and the organisation to drive impact. You will take a proactive approach to get oversight on current and future data-related initiatives and make sure they are aligned to the Data as an Asset strategy, as well as taking a strategic view on the gaps in our data capability and identify ways of addressing those within the short, medium and long term.
High Performance Computing Lab Task Lead - AVTC Group, Houston TX USA
Responsibilities will include identifying lab deficiencies, planning future purchases, investigating system problems, proactively monitoring system health, guiding a team of system administrators, providing inputs to management for team members’ performance, and helping lab users to effectively utilize the system.
High Performance Computing Project Manager - Princeton, Princeton NJ USA
The HPC-PM position reports directly to the Cooperative Institute for Modeling the Earth System (CIMES) director and is responsible for the management of the new computational resources and activities at CIMES, and develops innovative strategies for computational resources and activities to achieve their objectives. The resource management will include developing, implementing and tracking project prioritization, assessing and enhancing system utilization, planning for future HPC and storage resource acquisitions, ensuring that resource utilization aligns with the strategic goals of CIMES. These computational resources include a multi-petaflop computational platform, petabyte-scale storage resources, and a software environment for the configuration, running, and analysis of some of the world’s leading weather and climate models developed at NOAA’s Geophysical Fluid Dynamics Laboratory (GFDL).
Data Science Manager - Microsoft, Vancouver BC CA
We are looking for an experienced Data Science Manager to manage a team of 5-6 applied data-scientists.You will be responsible for impacting the backlog and strategy of200 plus engineers and working with a broad cross-discipline team. We are looking for a manager that establishes an inclusive culture and growth mindset. You will need to partner effectively with teams across Microsoft to design holistic solutions. This will require technical depth to ensure we make acceptable tradeoffs among multiple technology options.
Manager, Data Privacy Architect, Data Management, Analytics & Cognitive - Deloitte, London UK
As a Data Privacy Architect within Deloitte, you will be working within the Data Management team of the Data & Analytics Modernisation practice alongside some of the top experts in the country and on some of the largest and most complex client engagements across a variety of industry sectors. You will be given the opportunity to grow and take on responsibility from day one in a challenging but rewarding and meritocratic environment. Career growth for the right candidates is only limited by their ambitions.