I hope you’re having a good week. Below is the continuation of our discussion on hiring, stemming in part from the more formalized pipeline that we’re working on; you can also skip to the roundup.
Last week I started with the basic premise - you have a hypothesis that you’ve found a good candidate (and they have a hypothesis that your team would be a good match for them). Then, as scientists, the job is to try to disprove the hypothesis.
If you accept that the hiring process is about both sides being able to detect a mismatch as early on as possible, a lot of next steps fall into place. The single most important thing we can do to ensure a good hire is to have a really clear, unambiguous description of what the job requires and what we’re looking for - so unambiguous that you could hand it off to someone else and they’d end up with basically the same post-interview short list of candidates that you would.
A clear description of what you need will help people who would be good for the job find it and self-select by applying, build agreement with your current team members around what you are collectively looking for in a new hire, and help you separate out those who would be good additions to the team from those who wouldn’t as early on in the process as possible. And sitting down and hashing out with your team the job description and requirements is a great way to have an open conversation about work in the team - both as one-on-one conversations and then collectively with the group.
Like so many things in managing, putting together an internal job description and list of requirements isn’t rocket surgery, it just requires thinking things through carefully. The most common issue I see with job descriptions in research computing jobs - and I look through a lot of research computing job ads - is that they are way too specific about technical requirements and not nearly specific enough about anything else. When we’re hiring, we’re hoping to choose a team member who will contribute in a number of ways to the work of the team over at least a few years, and our job requirements should cover what it will take to be successful over that period of time.
To counteract this tendency to focus on technical details, it’s useful to imagine a successful candidate three to six months into the job who is working out really well, and work backwards from there to what you’re looking for. This will help you and your team focus on what the person will be like to work with, and on what kinds of gaps you currently have.
It will also usefully tend to downplay the requirement that they “must have three years of [tool X] experience”. No one is going to be fully productive in a new role in the first few months - even tool X experts will take some time to learn your particular code base/architecture/system - and this gives someone who has demonstrated related skills a couple of months to learn tool X passably well. Do you really see yourself hiring someone, in research computing, who isn’t broadly capable enough to pick up enough of a language/system/process to start contributing in three months if they’ve done related things with different technologies in the past? If not, why make tool X a hard requirement?
Manager Tools has a podcast episode on writing simple job descriptions which is very useful. They suggest starting with five questions (tweaked here for context):
In our line of work, since our tendency is to focus on the technical skills, I find it helpful to make the team skills explicit in the job description; in our context that might look like:
These last questions, especially the first and second, are about cultural fit. When cultural fit is just used to vaguely mean “like us”, it can be a huge source of bias in interviewing. But when it’s clearly and explicitly defined, it is useful for both sides as a way to clarify expectations about how people work together on the team. (We’ve gone so far as to start an evolving slide deck on how the team works together - it needs work in sections, but the discussion around the slides has been very useful!)
The needs of your jobs and team are going to vary, but in research-adjacent environments we often have cultural expectations around collegiality:
and about independence:
These expectations are neither objectively good nor bad, low nor high, to have of team members, but they are common in our line of work. People who work very differently - who expect very well scoped tickets to work on in disconnected chunks, or to toil away on their own in a corner without interacting with others - are unlikely to enjoy or succeed in environments with these expectations, and vice versa. It’s best to have your team’s working expectations extremely clear at the outset, both for your clarity in evaluating candidates and for transparency to the candidate about what the job entails.
Once you have an understanding of the requirements for the role, you can start prioritizing them into “must-haves” and “nice-to-haves”. It’s important to be ruthless about pulling items out of the “must-haves” list! An overlong must-have list narrows your applicant pool unnecessarily, and divides your focus in too many directions when deciding on applications. Is your team really unable to support a new team member with related skills as they learn about tool X and platform Y?
A complication when prioritizing requirements in research computing is that it’s pretty common to be open to hiring someone with any of two or three different skillsets. In data science groups you might be interested in growing your team’s skills into NLP or computer vision; a systems team might be open to a security expert, someone with deep openstack experience, or someone who has deployed a monitoring and alerting system before. That’s ok; when distilling this down into a single job description, you break out the common requirements and activities, list them first, and then frame the rest as “the candidate must fit one of the three following profiles”, listing them separately. Ideally we’d prioritize one and list only it, or run three separate job ads, but that is sometimes out of our control and we have to work with the situation as it exists.
You can now distill the activities and requirements into a job description. This document is now a starting point for discussion with the whole team, and with other stakeholders who would be working with the new hire. Do they see requirements you’ve missed? Do they have different priorities for those requirements than you initially thought? Are there areas of disagreement that must be understood and resolved?
The next step after having agreed-upon requirements is to think about how to evaluate them; that will come next week.
Speaking of non-technical skills being underrepresented in technical job descriptions… Communicating well is an absolutely essential part of a job in any interdisciplinary endeavour like research computing, and written communication is becoming vital as teams go remote. That doesn’t necessarily mean particularly good grammar or vocabulary - we’re an international community, many in our community are ESL, and those are things that can be cleaned up with tooling support afterwards. But being able to logically make a point, express an argument, or describe a process is essential.
In this twitter thread, Orosz lists a number of resources attempting to convince the reader of this point, and other resources that he feels can be used to help improve written communication skills.
Talent is largely a myth - Avishai Ish-Shalom
In research we’re pretty good at understanding that people grow in capabilities over time, and we typically avoid the tech company trap of talking about “Hiring the Best Talent”. But when we focus our job searches on people who can solve our immediate technical problems the moment they walk in the door - which is easy to do if we’re not careful - we can backslide into this mentality.
Ish-Shalom reminds us that:
and if we’re trying to hire “the best” candidate during our job search we don’t pay enough attention to our team’s abilities to help the candidate grow and for their strengths to develop.
Maximize your mentorship: search and secure - Neha Batra
I don’t think it’s controversial to suggest that as research computing managers we are given precious little guidance, or useful advice. If we want those things, we have to seek them out ourselves.
Like with putting together a solid list of job requirements, the steps for finding and recruiting mentors to give us some advice aren’t surprising or challenging - there’s no “One Weird Trick for Getting Mentorship”. You just have to figure out what you’re looking for, who you’d like to talk to, and approach them seeking some advice.
People, even busy people, are generally pretty open to having occasional short conversations with and giving advice to people who are earlier in their career path and have questions. And in other contexts, we know this - those of us trained in academia generally wouldn’t think twice about contacting a more senior author on a paper we were interested in, or a colloquium speaker, to ask some questions about how they did the science. But we’re so weirdly conditioned around management not being real valid work in academia that we’re pretty reticent to approach people seeking advice on those topics.
Batra goes through the steps of figuring out where you want mentorship, prioritizing potential mentors, an initial ask for a discussion, and asking for another conversation in a couple months.
If you build it, promote it, and they trust you, then they will come: Diffusion strategies for science gateways and cyberinfrastructure adoption to harness big data in the science, technology, engineering, and mathematics (STEM) community - Kerk F. Kee, Bethanie Le, Kulsawasd Jitkajornwanich
Software packages, like ideas, don’t in fact speak for themselves. Getting any sizeable number of people to adopt a new idea, new practice, or new tool requires an enormous amount of coordinated communication effort. In this paper, Kee, Le, and Jitkajornwanich describe what they found to be key practices for increasing the adoption of research computing tools - in this case science gateways and cyberinfrastructure. And why would we build tools if not to have them adopted?
> Based on an analysis of 83 interviews with 66 administrators, developers, scientists/users, and outreach educators of SG/CI, we identified seven external communication practices—raising awareness, personalizing demonstrations, providing online and offline training, networking with the community, building relationships with trust, stimulating word‐of‐mouth persuasion, and keeping reliable documentation.
Relatedly, I’ve recently discovered the Open Source Guides which have brief but good overviews of what you should be thinking about to get users for your open source software, building communities, best practices for maintainers, and developing formal governance when it’s time.
What does a scientific community manager do? Check out the CSCCE Skills Wheel and accompanying guidebook! - Centre for Scientific Collaboration and Community Engagement
Community Engagement Planning Canvas - Tamarack Institute
The skills needed to manage a scientific community are of immediate interest to us as we try to engage with a research user community for software, systems, curated data, or anything else.
In the CSCCE’s estimation, scientific community management is so important because:
> … science is inherently a community-based endeavor. The generation, validation, and dissemination of knowledge requires a network of diverse roles and a range of community configurations to meet specific needs -whether those needs bridge across disciplines, career stages, institutes or other boundaries.
The skills wheel workbook is a short 19 pages covering skills needed by an individual or a team engaging a community in the following areas, all of which are needed:
At a more tactical level, the Tamarack Institute has a community engagement planning canvas for planning and designing particular community engagement activities.
How I use Pandoc to create programming eBooks - Sébastien Castiel A good overview of a workflow for generating nice longform technical documentation in a variety of formats with markdown + pandoc.
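To give a flavour of the workflow, here’s a minimal sketch of the kind of invocation involved - the filenames and title are placeholders, not taken from Castiel’s post:

```shell
# Markdown in, ebook out; --toc builds a table of contents.
pandoc book.md -o book.epub --toc --metadata title="My Book"

# The same source can target PDF via a LaTeX engine:
pandoc book.md -o book.pdf --pdf-engine=xelatex --toc
```

The appeal is exactly that single-source aspect: one markdown tree, many output formats.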
Fulfilling the promise of CI/CD - Charity Majors, on the Stack Overflow Blog
Majors, who makes regular appearances on the newsletter, has a very clear view on the value of CI/CD - and, in particular, CD:
> The point of CI is to clear the path and set the stage for continuous delivery, because CD is what will actually save your ass.
> Until that interval [LJD - from writing new code to testing and at least some users working with the new code] is short enough to be a functional feedback loop, all you will be doing is managing the symptoms of dysfunction.
She points out that having good CI testing and CD is a matter of priorities, not skill sets:
> The teams who have achieved CI/CD have not done so because they are better engineers than the rest of us. I promise you. They are teams that pay more attention to process than the rest of us. Great teams build great engineers, not vice versa.
Effective Property-Based Testing - Russell Mull, Auxon
Generating Web API Tests From an OpenAPI Specification - Henrik Strömblad, Nordic APIs
We’ve talked about property-based testing with particular packages before on the newsletter - most recently in #56. These articles distill the use of property testing to some high-level considerations - in the first case quite generally, in the second for RESTful API testing in particular.
Mull’s article gives good advice for how to approach property-based testing in general. It’s a good, deep article - here are some pieces of advice that stuck out to me:
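Libraries like Hypothesis (Python) do the input generation and failure-shrinking for you, but the core idea Mull describes can be sketched by hand in a few lines of stdlib Python. The run-length encoder below is a made-up example, not from the article: you state a property (“decoding an encoding returns the input”) and check it against many generated inputs rather than a handful of hand-picked examples.

```python
import random

def rle_encode(s):
    """Run-length encode a string into (char, count) pairs."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def rle_decode(pairs):
    """Invert rle_encode."""
    return "".join(ch * n for ch, n in pairs)

def check_roundtrip(trials=200, seed=0):
    """The round-trip property: decode(encode(s)) == s on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        # A small alphabet makes runs (the interesting case) more likely.
        s = "".join(rng.choice("abc ") for _ in range(rng.randrange(30)))
        assert rle_decode(rle_encode(s)) == s, repr(s)
    return trials

print(check_roundtrip())  # prints 200: all trials pass for a correct encoder
```

A real library adds the two things this sketch lacks: smarter input generation, and automatic shrinking of a failing input to a minimal counterexample.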
Strömblad’s article looks at REST APIs defined in OpenAPI. From my point of view, API specification languages like OpenAPI are a crucial first step for defining interfaces: they allow clear tests and expectations, and can reduce the need for boilerplate code in services or clients. Strömblad describes the use of a new package, Humlix, to start developing simple property-based (rather than example-based) testing for APIs.
Results for the first Stanford Software Survey - The Stanford Research Computing Center
Results of a Stanford-wide research software survey on use and development of research software. Some headline takeaways:
Kobalos – A complex Linux threat to high performance computing infrastructure - Marc-Etienne M.Léveillé and Ignacio Sanmillan, ESET
Kobalos — Indicators of Compromise - ESET
A really sophisticated malware targeting HPC clusters has been found by the security firm ESET, who have named it Kobalos. It targets multiple operating systems - Linux, FreeBSD, and Solaris, and perhaps even AIX and Windows - contacts a command-and-control centre, and tries to infect other systems. It may or may not be related to the rash of HPC centre compromises last year; Kobalos may in fact predate those. Once a system is infected, its ssh client is backdoored.
There’s a twitter thread tl;dr if you like, a white paper, and the github link has a variety of hashes that can be searched for. The good news is that it seems to be relatively easy to scan a system or network for Kobalos.
Unrelatedly, I assume you’ve already done this, but if you haven’t, update your sudo (again) - this most recent vulnerability is a really bad one.
Last week we mentioned how fast modern drives are in the context of floating point deserialization. Here, Põder points out that I/O throughput can now be limited by CPU and memory rather than the disk, in a quest to get the highest IOPS and throughput he could on a workstation. In doing so he gives a very nice and thorough overview of the entire I/O path (except the filesystem, which he bypasses here - imagine a database server doing raw block access).
With just one of the Samsung 980 Pro PCIe 4.0 SSDs he was able to get 1.149M IOPS or 6.811GiB/s throughput, and that was only keeping CPU 1% busy. To keep pushing, he profiles kernel activities, switches to direct I/O, adds a PCIe 4.0 Quad-SSD adapter and tunes it to avoid a bottleneck at the root complex, giving brief introductions to psn, 0x.tools, lspci, dstat, fio, and hdparm along the way.
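For reference, a hypothetical fio invocation in the spirit of these tests - the device path, block size, and queue depths are placeholder values, not Põder’s actual job files:

```shell
# Random 4k reads against a raw NVMe device, bypassing the filesystem.
# Needs root; double-check the device path before running anything like this.
sudo fio --name=randread --filename=/dev/nvme0n1 \
    --ioengine=io_uring --direct=1 \
    --rw=randread --bs=4k --iodepth=64 --numjobs=8 \
    --time_based --runtime=30 --group_reporting
```

The `--direct=1` flag (O_DIRECT) skips the page cache, which is part of how the CPU rather than the device becomes the bottleneck.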
Speed up pip downloads in Docker with BuildKit’s new caching - Itamar Turner-Trauring
BuildKit will now cache directories during builds the way that, say, Travis-CI or other CI/CD systems do; that can greatly speed up builds that make small changes to dependencies. The example given here is for Python applications (that’s the topic of Turner-Trauring’s blog, after all) but is widely relevant.
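The pattern looks roughly like this - a sketch assuming a requirements.txt-based Python image, with the base image and paths as placeholders:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.9-slim
COPY requirements.txt .
# The cache mount persists on the build host between builds, so only
# new or changed dependencies get downloaded; build with DOCKER_BUILDKIT=1.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
```

Unlike an ordinary layer cache, the mount survives changes to requirements.txt, so pip’s download cache is still warm even when the layer itself has to rebuild.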
Annual Modelling and Simulation Conference 2021 - 19-22 July, Hybrid Fairfax VA USA and online, Papers due 1 March
Tracks of particular relevance to us include
JuliaCon 2021 - 28-30 July, Virtual, Free; proposals due 23 March
The CFP is looking for talks, lightning talks, mini symposia, workshops, posters, and BoF sessions particularly on Julia applications or approaches to:
Research Squirrel Engineers - An independent squirrel network for RSEs in DH and archaeology - SORSE talk, 11 Feb 16:00 UTC
This short SORSE talk describes the nascent Research Squirrel Engineers community forming in DH and digital archaeology.
European Molecular Biology Organization Lab Management Training - Various dates through 2021
EMBO is one of the few organizations out there doing leadership training for scientists; the sessions aren’t cheap but are very well regarded. Online sessions are coming up for:
Future proof - various dates in Feb and March, Virtual
Of possible interest for subsurface scientist trainees you work with - Agile* has software development classes for subsurface science coming up:
A modern, literate-programming take on Forth.
A software development veteran shares opinions that she has changed (and some she hasn’t) over a decade+ in the business.
A columnar, Rust based distributed computing package focussed on ETL jobs - ballista. Think dask but more or less just for ETL (for now).
krunvm creates and manages lightweight VMs from OCI-compliant container images.
A twitter thread describing getting cron’s man page and code back into sync 30 years later.
The Titus Brown lab has an example of moving a Python research computing project away from setup.py and towards pyproject.toml/setup.cfg.
More examples of cloud providers actively going after HPC customers - Google Cloud has a machine image specifically for HPC jobs.
Degrees of Success: The Expert Panel on the Labour Market Transition of PhD Graduates, by the Canadian Council of Academies, is an interesting and in-depth look at the labour market outcomes for Ph.D. students in Canada; different countries will have different (but probably not wildly different) results. Very interesting for trainees and those working with trainees.
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Highlights below; full listing available on the job board. Some might find the NREL job interesting - 70% research, 30% managing…
Group Manager – Data Analysis and Visualization - National Renewable Energy Laboratory, Golden CO USA
The National Renewable Energy Laboratory (NREL) is seeking an accomplished leader for the role of Data, Analysis, and Visualization (DAV) group manager within the Computational Science Center (CSC). NREL’s Computational Science Center conducts research and provides cross-cutting capabilities and solutions including systems operation, advanced computer science, visualization, data science, applied math, and computational science to advance NREL’s mission. The core mission of the DAV group is to provide research and operational expertise necessary to convert data into knowledge, and to help researchers make observational data actionable. Key activities include scientific visualization, data analysis, data-focused predictive modeling using statistics, learning algorithms, digital twins and AI and data management for scientific data.
Head of Research Computing Services - Imperial College London, London UK
Craft a well-planned IT strategy and technology roadmap for research computing which will evolve, professionalise and harmonise research computing services and deliver an outstanding customer experience
Provide strategic leadership to the research computing service teams. Mould a high-performing offering improving scalability and making research computing simple to engage with. Drive clarity of responsibility, collaboration and efficiency of process plus act as a beacon of best practice through the adoption of contemporary IT industry disciplines
Manager, Data Analytics - LandSure, Vancouver BC CA
Our client, LandSure Systems Ltd. (LandSure), is a technology-driven organization providing innovation, project management, communication, and technology services to the Land Title and Survey Authority of British Columbia (LTSA). A wholly owned subsidiary of the LTSA, LandSure operates as part of a unique business model to support the continued growth of the LTSA and its services.
The Manager, Data Analytics reports to the Director, R&D and is responsible for developing and implementing data analyses, data collection systems and other strategies that deliver business intelligence analytics and enable the organization to make informed decisions through data insights.
Research Project Manager, Computational Memory Lab - University of Pennsylvania, Philadelphia PA USA
The Computational Memory Lab at the University of Pennsylvania seeks to recruit a talented scientific project manager to help lead a multi-center team in carrying out research on the use of brain stimulation to study brain networks underlying cognitive functions and to help create neuromodulation therapies that will someday help restore memory function in patients who have suffered brain injuries
This position requires a unique combination of leadership / teaming skills and technical expertise in the fields of neuroscience and/or bioengineering. The project manager will supervise a staff of full-time research specialists and software developers, and assist them in carrying out this federally-funded BRAIN initiative project. The successful applicant will enjoy leading a dedicated and hardworking team whose hub is at the University of Pennsylvania but that includes researchers at leading institutions around the world.
Sr Systems Administration Engineer - GE Corporate Global Research, Niskayuna NY or Columbus OH USA
As a member of the Enterprise High Performance Computing as a Service (HPCaaS) team, this role will focus on providing global operational support to stakeholders who use High Performance Computing to deliver Engineering outcomes in their business. You will operate, secure and support high-end computing and data management infrastructure as well as provide excellent customer application support in a collaborative environment. Core to the role will be the implementation of robust process improvements and lean solutions for infrastructure deployment and configuration management of both development and production infrastructure while ensuring SLAs are achieved and customer needs are met. You will drive service quality while identifying opportunities for consistency in HPCaaS software, applications and tools in partnership with our global organization.
FDA Research and Scientific Computing Program Manager - KBR, Columbia MD USA
KBR is seeking a FDA Research and Scientific Computing Program Manager (PM) for a future opportunity. The PM will lead our team in supporting complex research / scientific computing and IT projects for an upcoming Food and Drug Administration (FDA) opportunity. KBR will support FDA in some of its most pressing programs affecting the Nation’s Public Health.
The PM will be responsible for overall program execution (technical, financial, staffing, and reporting) while providing strategic oversight and subject matter expertise to promote a culture of collaboration and continuous improvement across multiple task orders, stakeholders, and their associated teams. 10% travel.
Senior Research Scientist, Application Engineering - Oak Ridge National Lab, Oak Ridge TN USA
Contribute to and lead original research including software development, scientific papers, reports and other artifacts. Lead planning and major development efforts on scientific software projects. Lead or collaborate on proposals related to the application of computing in scientific domains. Coordinate, lead, and act as a representative of the Laboratory in national and international collaborations related to scientific software. Work closely with stakeholders to meet their software requirements and achieve their scientific goals. Act as a mentor for project members, junior staff, post-graduates, and students to help them grow. Participate in developing the strategic direction of research software engineering at ORNL.
Software Engineering Manager (Remote) - JMIR Publications, remote CA
Lead your team building modern architectures in critical applications operating at scale. Your leadership enables, inspires, and motivates a talented group of developers. With consistent coaching and 1-on-1s, you will help level up your team’s members and further their careers as the first among equals. Bridge communications between multiple areas across products, technologies, development, QA, support, and infrastructure.
Lead Data Scientist - Government of Ontario, Toronto ON CA
• Lead, mentor, and directly support a team of data scientists, data analysts, and research analysts in delivering data science products and services across a wide array of applications and projects
• Manage foundational, cross-cutting data science projects, including projects focused on enablement, operationalization, and capacity-building
• Provide strategic advice, guidance, and supports to ministry leadership to build and maintain data science functions
• Serve as a ministry-wide expert to directly support your colleagues in their data science work and in growing data science knowledge, capacities, and culture
Senior Program Manager - Microsoft Health Next, Redmond WA or Boston MA USA
We are looking for an experienced Senior Program Manager with strong product thinking and technical depth. You will work across engineering teams and customers to help discover, define, and validate product capabilities and help bring them to life in our emerging bio-medical informatics platform.
Senior Software Engineer - Microsoft Health Next, Redmond WA or Boston MA USA
Terra (terra.bio) is a scalable, secure, and open-source platform for biomedical research, designed to help researchers and data scientists focus on their science as they access data, run analysis tools, and collaborate. Microsoft is collaborating with the Broad Institute of Harvard and MIT and Verily Life Sciences to expand Terra’s open, modular, and interoperable research platform, with the addition of Microsoft Azure cloud, data and AI technologies, global capabilities, and health & life science customers and partners.
We’re looking for an innovative, collaborative, highly motivated Senior Software Engineer to join our Genomics effort and lead the design and implementation of our cross-cloud identity, authentication, and billing services and APIs.
Deputy Director of Research- National Quantum Computing Centre - UK Research & Innovation, Oxford UK
The Deputy Director for Research will have primary responsibility for overseeing the research performed by, or in partnership with, the Centre. With input from the NQCC Director, the Deputy Director for Research will define the Centre’s technical goals within the context of a continually updated technical roadmap. Additionally, the role will have primary responsibility for establishing rigorous and fair processes for assessing the performance of delivered quantum systems and software (versus other systems, whether of UK origin or international).
Manager, Research Cloud Development & Operations - University of Melbourne, Melbourne VIC AU
The Nectar Research Cloud is powered by OpenStack and provides computing resources to researchers across Australia. We are looking for an exceptional person to lead the operations, maintenance, expansion and continual improvement of the Nectar Research Cloud services and manage a diverse team of expert cloud DevOps engineers. The successful candidate should be passionate about the ongoing DevOps practice and running the large distributed system that is the Nectar Research Cloud, enabling the next generation of research capabilities across many research disciplines. The position will liaise with the national Research Cloud node partner operators at remote sites and coordinate their operations.
Technical research manager - Barcelona Supercomputing Centre, Barcelona ES
The INB seeks a technical research manager who will join the team to provide support in the coordination of the IMPaCT Data Science programme, liaising with all the involved stakeholders within and outside the BSC, and contributing to the coordination of the technical tasks. They will plan the use of resources and monitor the progress of research projects involving several partners, supporting the PI in keeping track of both technical and administrative tasks; organize and participate in project meetings, teleconferences, reviews and other events; and maintain communication with project stakeholders, participating in regular meetings and discussions.