In #139, we talked about some general purpose first steps when taking on a new responsibility. Let’s talk about the other side of that, now — how to make sure someone we’re bringing on to tackle some new challenges can be as productive as possible.
Back in #135, we talked about beginning with the end in mind: having clear goals for the person taking on a new role, and a picture of what it should look like day to day. When scoping out a new responsibility, it’s important to define what success looks like after they’re fully onboarded. That onboarding timeframe may vary: say after 3-4 months for an individual contributor, maybe after 6 months or even longer for someone with significant leadership duties. That doesn’t mean they won’t continue to grow after that! But by that point the goal is that they’re functioning competently in the role. Our job, then, is to help them get there as quickly as possible, for the sake of the team and so that they can start feeling settled in and successful. That means designing an onboarding process.
There are four kinds of information to gather to design the onboarding process:
Building these lists will be iterative: items on one will suggest items that need to be added to another. Both you and the people you’ve had help you scope and define the responsibility can add to them and discuss them. For a software developer, the lists might look like:
With those in mind, and collaborating with the people who are helping define the responsibility, you now have the raw materials to put together an onboarding plan. Stage the resources such that they are always building towards application in some meaningful contribution. Introduce the new person to people in the (increasingly) broader community so that they have a growing connection to the inputs and impact of the work of your team. At earlier steps in the onboarding plan, feel free to be quite prescriptive (e.g. preschedule a pair-programming session with one of the seniors on pieces of the code base); at later steps, the training wheels start coming off, and you can just list the people and resources and let the new person make the connections and learn the material themselves. (You can always become more involved where needed.)
As always, none of this is rocket surgery - it simply requires recognizing that onboarding is important, putting together a plan with the resources you have available, and then committing to seeing it through.
What’s the best onboarding plan you’ve seen (or experienced)? What’s the worst? Hit reply or email me at email@example.com and let me know. And as always, members of our community can feel free to schedule a quick call with me if they have thoughts or questions.
And with that, on to the newsletter!
Should I Create a Performance Improvement Plan for My Direct Report - Lara Hogan
We, myself included, don’t talk often enough about underperforming team members. It’s uncomfortable! But letting a team member struggle indefinitely is bad for them, and bad for the rest of the team.
In that context of reticence about poor performance, Performance Improvement Plans (PIPs) frequently have a pretty bad reputation as a perfunctory, performative step taken before firing someone. The following situation isn’t uncommon: A manager repeatedly but only indirectly discusses with a team member some area of underperformance. The team member remains blissfully unaware, or thinks (perhaps reasonably, given the signals they’re getting) that it isn’t a big deal. The manager, increasingly frustrated by nothing changing, eventually blows up, goes to HR, and gets a template for a performance improvement plan. With HR’s guidance the manager fills it out, setting a timeline in which change is unlikely to be able to happen effectively, and presents it to the team member as a fait accompli. Eventually the team member sees the writing on the wall and, to the ineffective manager’s relief, resigns, or the team member is fired.
The above is a pretty crummy scenario, and is absolutely the fault of the manager. It shouldn’t be and doesn’t have to be this way. Getting HR involved absolutely should be a final step, but in the scenario above, many earlier steps were skipped.
It’s vitally important to give clear, actionable feedback when something isn’t going well. It’s unfair to the team member not to. There are several methods (#73) including Hogan’s own that focus on behaviour and impact, typically bookended by questions. And waiting until things are too late to change (#06) isn’t fair to you or them.
Hogan describes the pros and cons of PIPs, and while documenting your feedback and working with them on a plan to improve in the needed area can be very useful, getting HR involved in the process is a pretty hard-to-retract step.
If you find yourself in the position of seriously considering one, Hogan has a nice decision framework here:
There are also useful sections on what to do if the team member doesn’t see the importance of meeting the expectations, or if you don’t have a shared understanding of the expectations. This reminds me of a nice article on the five necessary conditions for improvement mentioned in #36.
By the way, the discussion above assumes that the thing they’re being given feedback on to improve is the performance of their own personal tasks. Adapting to new work is hard, and a fair amount of patience is justified in giving people a chance to get up to speed. In that case I’m fully on board with Hogan’s clear, measured approach.
But there’s another domain in which people might be getting corrective feedback - about how they’re behaving to other team members or the broader community. If they’re being toxic to others, do not feel any such need for patience. Correct them once or twice, each time explaining the expectations and consequences, and document the behaviour. The next time it happens, begin the firing process, whatever that means in your institution. Do not allow toxic behaviour to damage the team or your research community.
Nonprofits May Need to Spend a Third of Their Budget on Overhead to Thrive — Contradicting a Donor Rule of Thumb - Hala Altamimi and Qiaozhen Liu
Even the nonprofit world is slowly coming to terms with the fact that spending money to support team members in being more effective in their work (on tools, support personnel, or externally provided services) is not “wasting money on overhead” but is instead “getting the job done effectively and efficiently”. And yet too often I hear horrified cries like “but that could pay for a new compute node/a summer student” when the topic of paying money for things that team members need comes up.
And that’s how we end up with centres that claim “our people are our greatest resource” with high turnover because they don’t have the tools they need to get their job done well.
Funders have a lot to answer for here, but the cultures of our teams sustain this attitude too.
How To Drive Your Own Career Growth In These 6 Easy Steps - Lighthouse Blog
This blog post lays out a great plan for making sure your career growth is a priority - it’s worth considering for ourselves, but also keeping in mind for our team members. Like so much in management, there’s no silver bullet here; it’s just having a clear plan in mind and constantly advancing it bit by bit.
By the way, the article starts by pointing to charts from PwC and Deloitte on what’s important for employees, highlighting career progression and growth as being top priorities. It’s worth comparing these graphs to those from our own community, Understanding Factors that Influence Research Computing and Data Careers (#133). They’re indistinguishable. Team members in our profession want the same things everyone wants - growth, recognition, flexibility, purpose, pride in their work and their team, manageable workloads, the tools to do their job well, and enough compensation that they don’t feel undervalued. We’re constrained somewhat on pay, but all of those other pieces are at least partially under our control.
Strategic Planning Should Be a Strategic Exercise - Graham Kenny
How Nonprofits Can Keep Strategy Front and Center - Alan Cantor
As readers will know, I think strategy is incredibly important for our teams. And because of that, I’m constantly vexed by two frequent problems with strategic planning in our profession.
The first is that when it’s done at all, far too often the output is feel-good pablum that reveals no insights, results in no changes, and is forgotten about as soon as the final version is printed. It’s a Potemkin compliance exercise for funding, not a way of collectively identifying problems and prioritizing solutions. The word “excellence” appearing more than, say, three times in the document is a terrific indicator of this particular failure mode. So is a described “strategy” of continuing to do some of everything. And if the resulting document isn’t routinely being cited internally during discussions in meetings about a decision being made? Well, it’s because the document has already been forgotten and no one took it seriously anyway.
The second problem is easier to fix. It happens when a good and valuable strategy came together, and was acted on, but the process was treated as a one-off. The temporary infrastructure put together for creating the strategy document (lines of communication to the community, meetings, documents) is torn down or slowly starts to decay in place. And then after some arbitrary period of time (three years, five years, or a new leader coming in and asking for one), the whole process has to be recreated from scratch.
The first issue with the “one-off” approach is that those lines of communications with community, the community ownership of the plan, the trackers for problems and proposals — those are all in and of themselves strategic assets that need to be maintained and nurtured.
And that matters because of the second issue. We are well beyond the point where research computing and data strategy development can be a twice-every-decade, fire-and-forget sort of exercise. Plans need constant monitoring. Priorities need periodic updating.
These two HBR articles this week touch on different aspects of this. Kenny reminds us not to think of strategic plans as fixed, and to aim for insight as part of the process. Cantor urges us to revisit strategy at every board meeting (for nonprofits; think advisory committee in our context), with a reminder of the current mission and real discussion of a strategic issue every meeting.
Note that this requires an active and engaged advisory committee! In some future set of issues we’ll talk about creating such engagement. We can learn a lot from how the best nonprofits develop their boards.
Coding For Economists - Arthur Turrell
A nice resource on Python aimed at the economics community, covering econometrics, time series data, text analysis, databases, geospatial data, and more. Also covered is writing computational results up in blog posts, technical reports, and papers.
Digital humanities needs equality between humanists and technicians - Urszula Pawlicka-Deger
Give technicians their due - Catrin Harris, Research Professional News
One of the best reasons to not silo ourselves into “research computing”, “research software”, “research data management” camps is that we’re all wrestling with similar problems. And it’s not just us.
Pawlicka-Deger’s article drives home the need for greater respect for technical career paths in digital humanities, and Harris’ for technicians quite broadly in research.
Ten simple rules and a template for creating workflows-as-applications - Roach et al.
Increasingly, scientific computations involve the orchestration of workflows containing many tools. This is especially true in data analysis, but increasingly in simulation as well.
This paper gives ten rules for creating these workflow applications, paraphrased somewhat below:
This is a very valuable distillation of best practices, with concrete suggestions and sample code. It actually reminds me a bit of a workflow-as-application counterpart to recommendations for a twelve-factor app for web applications (which held up remarkably well during a decade of rapid change): configuration stored in the environment, admin processes/backing services run by their own workflows, logs as recording of events…
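To make the "configuration stored in the environment" idea concrete, here is a minimal Python sketch of a workflow reading its settings from environment variables with fallbacks. It is not taken from the paper or its template: the `WorkflowConfig` class, the `load_config` helper, and the `WORKFLOW_*` variable names are all invented for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowConfig:
    """Settings for a hypothetical workflow run."""
    input_dir: str
    output_dir: str
    n_workers: int


def load_config(env=None):
    """Build a WorkflowConfig from environment variables.

    The WORKFLOW_* names and the defaults are assumptions made for
    this example, not a convention from the paper.
    """
    env = os.environ if env is None else env
    return WorkflowConfig(
        input_dir=env.get("WORKFLOW_INPUT_DIR", "data/raw"),
        output_dir=env.get("WORKFLOW_OUTPUT_DIR", "results"),
        # Environment values are always strings, so convert explicitly.
        n_workers=int(env.get("WORKFLOW_N_WORKERS", "1")),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dictionary as `env` keeps the function easy to test, and the same workflow code can then run unchanged on a laptop, a cluster, or in CI with only the environment differing.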
New data reveals the hidden impact of open source in science - Chan Zuckerberg Initiative
A large dataset of software mentions in the biomedical literature - Istrate et al, arXiv:2209.00693
Staff at CZI have put together and organized an amazing dataset of 67 million mentions of software, from something like 16-20 million biomedical papers. The dataset and code are available on Dryad and GitHub. The methods are fun to read (for instance, clustering to deal with the various ways the name of a piece of software is given in papers). I expect that the real fun will be to see analyses of this dataset start to come out.
6 Best Practices to Manage Pull Request Creation and Feedback - Jenna Kiyaso, Doordash
Some best practices from Doordash about handling PRs; it’s interesting to see how many companies and communities have converged on the same good practices:
Modern Data Stack in a Box with DuckDB - Jacob Matson
Delta Lake 2.0: An Innovative Open Storage Format - Matthew Powers
A couple of data lake articles this week:
Matson writes a fun post describing how to implement a current state-of-the-art data lake environment on a laptop or single node using Meltano for ELT pipelines, dbt for data transformations, Superset for data exploration and visualization, and DuckDB as an embedded OLAP columnar database. This could be used as a way of playing with some new tools in a realistic-ish broader context, as a prototype for a data lake solution for a research group, or even as the first steps of building a real production stack.
For data lakes where a single columnar embedded database might not be enough, Powers gives the summary of what’s new in Delta Lake 2.0, a storage format based on Parquet files. I’ve always thought of this as being very much part of the Spark ecosystem, because that’s how it started, but there are connectors to Presto and other data engines - and the ability to access or revert to previous versions is pretty valuable for our kinds of use cases.
The Rise of Fully Homomorphic Encryption - Mache Creeger
Explained from scratch: private information retrieval using homomorphic encryption - Spiral Privacy
Constellation: The First Confidential Kubernetes Distribution - Felix Schuster, The New Stack
In research computing and data we’ve always relied heavily on multi-tenant systems, whether our own or externally provided. And now we’re increasingly asked to support sensitive data.
These first two articles give a good quick overview of Fully Homomorphic Encryption (FHE). FHE is a possible solution when you want to provide querying or even simple analyses of sensitive data on untrusted systems. Data is stored encrypted, using a scheme that preserves certain mathematical operations, and those operations are performed on the encrypted data directly. The server never sees the unencrypted data (or even queries).
There’s substantial overhead for using FHE, not least of which is that algorithms have to be rewritten to only use operations which are preserved under the encryption, but it’s surprising what’s possible under this scheme.
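To give a feel for "operations performed on the encrypted data directly", here is a toy Python sketch. It swaps in the Paillier scheme, which is only additively homomorphic (a much simpler relative of FHE, and not the scheme described in the linked articles). The primes are tiny and chosen for illustration, so this is in no way secure; the function names are my own.

```python
import math
import secrets


def keygen(p, q):
    """Return (public key n, private key) for toy Paillier from primes p, q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because we use g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer m with 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiply ciphertexts; the result decrypts to the sum of plaintexts."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen(1009, 1013)  # toy primes, far too small for real use
    c1, c2 = encrypt(n, 1200), encrypt(n, 345)
    total = add_encrypted(n, c1, c2)  # computed without ever decrypting
    print(decrypt(n, priv, total))
```

The "server" here only ever calls `add_encrypted`, which needs the public key but not the private one. A fully homomorphic scheme extends this to both addition and multiplication, which is what makes more general computation possible, and also what makes it so much more expensive.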
Confidential computing is another approach, relying on hardware support for secure enclaves or trusted execution environments within the server, which can’t be accessed by the OS or other tenants (even root), in which data is decrypted and acted upon. The downside here is it requires the hardened enclave to be hack-proof, and software has to be rewritten to support that; the upside is that arbitrary operations can be run and the enclave-supporting hardware is becoming more common.
Rewriting code to support the enclave isn’t trivial, but there are libraries, databases, and increasingly entire stacks being put together to make the confidential computing aspects more easily adopted - in the third article Schuster describes Constellation, an entire confidential computing k8s distribution.
Six European computing centres will be hosting quantum computing sites.
A record-breakingly powerful gamma-ray burst 2.7 billion light years away measurably disturbed Earth’s ionosphere.
ar5iv, from arXiv labs: read (v1 of most) papers in responsive HTML5 instead of PDF by changing the ‘x’ in arxiv in the link to ‘5’.
SQLite famously is “open source but not open contributions”. libSQL is a fork of SQLite which aims for wider community contribution.
Hello world in Python, going down through the Python VM to stack traces to Windows console and system calls through vector font rasterization and window management
A fascinating three part series on the steam engine and why it wasn’t invented earlier when several key pieces were known as far back as the first century.
Working with hexagonal grids.
TensorStore, a Google-developed “one stop shop” for efficient reading and writing of multidimensional arrays from Python.
Interesting blog post summarizing a paper about how software developers use Copilot (and presumably similar tools). The paper distinguishes between acceleration of what the developer was already going to do, and exploration of what to do next.
Rendering galaxy clusters, black holes, and Saturn in Minecraft.
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.
This week’s new-listing highlights are below; the full listing of 193 jobs is, as ever, available on the job board.
Sr. Manager Software Development, Advanced Technology Group - AMD, Markham ON CA
AMD Advanced Technology Group is an entrepreneurial research and development team to build AMD’s future advanced platforms & products. Our teams work closely with outside customers and internal teams to develop hardware, software, and systems solutions into next generation computing platforms. As part of AMD Advanced Technology Group, you will have the opportunity to build a winning team that will collaborate with internal teams & customers, explore new platform technologies, and lead the development of best-in-class hardware, software, and systems technologies that our customers will use for real-world problems.
Head of Data Coordination Platform, Human Cell Atlas - Broad Institute, Boston MA USA
Over the first five years of the project we, along with an international network of collaborators, have developed the Data Coordination Platform infrastructure for the Human Cell Atlas: a program funded by Chan-Zuckerberg Initiative that serves to create a comprehensive reference atlas of cells in the human body. The Head of the Data Coordination Platform will be one part product leader and one part program leader, responsible for ensuring our software meets the needs of this global community.
Research and Development Manager, Biomedical Visualization - Harvard University, Boston MA USA
Centre Manager, Cambridge Centre for AI in Medicine - University of Cambridge, Cambridge UK
Duties will include overall project management for the £7m grant, liaising with key personnel at the participating organisations, providing support in implementing and providing input for strategic planning, overseeing the operation and ensuring deliverables are achieved. The role holder will support the Centre lead across the range of their responsibilities including advising on policies and procedures; supporting post-grant awards and Centre finances; arranging Centre-related events; facilitating student recruitment and supporting other Centre members, and working closely with colleagues from across the University administration to facilitate the aims of the Centre.
Head of Software Engineering - Leidos, McLean VA USA
Leidos is seeking a Head of Software Engineering to join our team! The dynamic leader will be a key part of the Leidos team supporting Connected Automated Vehicle (CAV) research across the US through our work at the U.S. Department of Transportation’s (USDOT) Saxton Laboratory (STOL), located at the Turner Fairbank Highway Research Center in McLean, VA. Our team helps to develop emerging technologies to improve transportation safety, mobility, and environmental impacts. The STOL provides a variety of services to support the advancement and deployment of vehicle-to-infrastructure technologies.
Manager – Research Computing Infrastructure - Northwestern University, Evanston IL USA
The Research Computing Infrastructure team is responsible for Northwestern University’s High-Performance Computing infrastructure. This consists of a fleet of physical servers, back-end storage, operating system, integrated networking, parallel filesystem, scheduler, cloud-based infrastructure, and other associated infrastructure and applications. This could also include consulting with researchers on the best solutions for their workload, which could consist of deploying and managing non-HPC infrastructure. As Manager, you will lead a group of system engineers responsible for managing the HPC environments. Such duties include developing staff through mentorship and training, prioritizing and delegating tasks and projects, and tightly coordinating with Research Computing Services to deliver stable, consistent solutions for the Northwestern research community. In this role, fostering and maintaining a positive and inclusive work environment is essential. You will have the opportunity to work with university leaders and peer institutions, and develop and maintain essential relationships across Northwestern University schools and departments.
Principal Data Scientist (Lead Data Manager) - Premier Research, Remote CA
Premier Research helps highly innovative biotech and specialty pharma companies transform life-changing ideas into reality. We have positioned ourselves right in the middle of the action, targeting unmet needs in Rare Disease and Pediatrics, Analgesia, Neuroscience, Oncology, Dermatology, and Medical Devices. We’re looking for a talented and energetic Principal Data Scientist (Lead Data Manager) to join our Clinical Data Sciences team! Working at Premier Research means being an individual - you will be recognized for what you do and you will truly have an impact in an aspiring, empowering and caring culture where people truly work as one team. You can grow and contribute your expertise with colleagues who are genuinely supportive regardless of location or seniority. Premier Research is on an exciting journey - there is a true buzz throughout the company, so come and be part of it!
Head of R&D Data Foundations - Sanofi, Toronto ON CA or Cambridge MA USA
Sanofi has recently embarked on a vast and ambitious digital transformation program. A cornerstone of this roadmap is the acceleration of its data transformation and of the adoption of artificial intelligence (AI) and machine learning (ML) solutions. This has enabled us to accelerate R&D, improve manufacturing and commercial performance, and bring novel drugs and vaccines to patients faster, all in order to improve health and save lives. You are an experienced software engineer who is interested in designing and developing comprehensive solutions to support and facilitate business operations. You have a strong understanding of back-end and front-end technologies and have experience implementing highly functional solutions that can scale.
Sr. Product Manager for Hybrid/Cloud HPC/AI Solutions - Penguin Computing, Remote USA
Penguin is looking for a Senior Product Manager, Hybrid and Cloud HPC/AI to work within our Cloud and Services Business team. The Senior Product Manager, Hybrid and Cloud HPC/AI is a high-impact position where you will have the opportunity to significantly influence the product strategy for a new Penguin product offering. You will shape the product vision, prioritize features, and refine solution definitions within a portfolio of offerings by understanding the nuances of your customers’ journeys. You will get to work within a rapidly growing business and have a large impact every day.
Group Lead, Computational Biology (Genetic Parts Curation) - Ginkgo Bioworks, Boston MA USA
The Genetic Parts Curation (GPC) team supports Ginkgo’s mission by making genetic design more predictable and robust, via curation of highly characterized toolkits of genetic elements (“parts”) to deploy in Ginkgo’s portfolio of engineered bacteria, yeasts, filamentous fungi, and mammalian cells. As Group Lead of the newly minted GPC, you will be responsible for defining Ginkgo’s strategic vision for genetic part curation, and leading an interdisciplinary group of scientists and engineers to realize that vision. Your group’s activities will include building strategic alignment on priorities for part curation across Ginkgo’s technical teams; collaborating with organism experts throughout Ginkgo to establish best practices for experimental part characterization; defining analytics and performing computational analyses to monitor and improve robustness of curated parts and genetic designs that include them; and collaborating with our Software, Product, and Data teams to define digital infrastructure required to curate genetic parts and facilitate their re-use.
Head of Data Management - NatCen Social Research, London UK
The Head of Data Management provides leadership to NatCen’s Data Management function. Working across NatCen they will direct our data management function, working with Research and Operations leads to ensure that data management work is achieved within the agreed cost, time and quality constraints. The Head of Data Management will lead on developing systems and processes for efficient data management within NatCen and will oversee data archiving
Lead Bioinformatics Programmer, Children’s Microbiome Center - Baylor College of Medicine, Houston TX USA
The Lead Bioinformatics Programmer will assist investigators in the Texas Children’s Microbiome Center with data analysis, database construction, and data presentations in areas of microbiology, microbiome science, metagenomics, and bacterial genetics. This programmer will apply existing custom bioinformatics software as well as prepare detailed specifications and algorithms from which programs will be written. Will design, code, test, debug, and maintain programs for bioinformatics applications as well as research, assess, import, configure, and customize third-party bioinformatics software.
Manager, Research Computing - Boston College, Chestnut Hill MA USA
This position is responsible for management of the University’s high-performance research computing functions. This position provides infrastructure support for the high performance computing cluster and develops structure to manage resource demand growth. Additionally, this position will provide consultation and support to Boston College faculty and principal investigators’ research computing project needs. Provides faculty and their research groups with consulting, user support, and training in relevant applications and tools, sharing expertise and support for research design, analysis, and grants. Supports users of research facilities and assists with managing, analyzing and producing projects.
Manager Informatics and Data - Prince of Wales Hospital - Prince of Wales Hospital, Sydney AU
Working closely with the Prince of Wales Hospital (POWH) Cancer Services Executive through the Cancer Services Operation Manager, this position will provide oversight and strategic development of all information needs for the POWH Comprehensive Cancer Services located at the Nelune Comprehensive Cancer Centre, including development and monitoring of quality processes, and alignment of information services to the strategic directions and priorities of the Cancer Services.
Director, Research Infrastructure Product Line Technical Lead - Merck, Rahway NJ or Boston MA or West Point PA USA or Schachen CH or London UK or Prague CZ
We are seeking an energetic and collaborative person to join the Research and Development Sciences value team responsible for enabling data and technology products and services that accelerate our scientists’ ability to discover and develop innovative medicines that change the course of human health. As the Technical Leader of the Research Infrastructure Product Line (RIPL) you will develop and evolve a vision for the suite of products that enable the generation and ingestion of research raw data from internal and external sources and the storage, management and lifecycle of these raw data in their repository of record.
High Performance Computing Lead - Boston Children’s Hospital, Boston MA USA
Directing/Managing/Supervising the design, development, implementation, integration, and maintenance of research technology infrastructure and systems capabilities to support the organization’s business objectives. Leading the design, implementation, and management of security systems and redundant data backups. Ensuring the development of cost-effective systems and operations to meet current and future research requirements. Overseeing the analysis of research problems and leading the evaluation, development, and recommendation of specific technology products and platforms to provide cost-effective solutions that meet business and technology requirements.
Program Manager - Invenia Labs, Winnipeg MB CA
Our researchers and developers work on projects with real-world impact, using their interdisciplinary expertise in domains such as machine learning, optimisation, power systems, data science, software engineering, and systems architecture. As the key partner for the Chief Science Officer, you will be responsible for three key areas: strategy, prioritisation, and delivery. Work closely with other senior leaders to develop research visions and strategies, product plans and roadmaps to meet our ambitious goals. Be involved in generating ideas for enhancements to Invenia’s systems, including technical validation, business case development and planning.
Principal Lead Software Engineer, eScience Institute - University of Washington, Seattle WA USA
This position will be part of a new endeavor to create a collegial, creative team, collaborating with University researchers to improve efficiency and reproducibility of research outcomes. Responsibilities will include application design and implementation, research design and collaboration, problem resolution, projects, and providing status reports to SSEC directors. This position will provide a major role in creating and staffing project teams, and will lead, mentor, and coach SSEC teammates. The position reports to the SSEC Head of Engineering (UW title: Sr. Principal Research Scientist/Engineer).