Research Computing Teams #140, 8 Oct 2022
This week I got a question about time-and-materials billing. So I want to talk a little bit about how consultants and other expert services firms increasingly see fees-for-service, and how we should see them to be of the most value to our community. I suspect this topic is going to be even more controversial than specialization is good and we should do more of it (#114), or making the case that we are vendors too (#123), or how too much of the framing around strategy discussions is self-important puffery (#130). But charging fees is an important topic if we want to sustain our teams and the research they support.
(By the way, the question came from a call from a member in our community, so a reminder that I’m happy to jump on a 30 minute advice or coaching call with anyone in our community who wants one - just pick a time).
Read on, or go straight to the roundup!
First, I want to emphasize that charging fees for offerings is not only ok, it’s good. I’d like to see many if not most of our teams have fees as part of their revenue mix; how much of a part will depend on how they’re funded. We and our institutions talk about sustainability models quite a bit, but too often charging money for goods and services seems to be the sustainability model that daren’t speak its name. Which is odd! People are quite happy to exchange money for goods and services they value. There are core facilities all over your institution which get much of their revenue from fee-for-service quite successfully.
While transactions aren’t the only model by which we should provide services, they lend an undeniable clarity to our efforts. We know our work is valuable to researchers if researchers are willing to pay for it. And if researchers wouldn’t be willing to pay fees at a sustainable level for work that’s currently being done… well, that raises the very useful question of whether the thing being done is worth doing at all in the current manner.
Second, I’d like to point out that of all the fee models available to us, time-and-materials is widely understood to be the one to avoid whenever possible (and it’s possible way more often than you’d think). And yet too often it’s the default.
Time-and-materials downgrades collaborative relationships, and its incentives are inherently pathological. Because it distorts collaborative relationships into one of staff augmentation, it also traps teams who rely exclusively on it into being more of a temp agency than a centre of excellence (#133). That temp agency model is a useful way to address some needs in an institution, but it oughtn’t be all we aspire to.
Let’s focus on the second point today, time-and-materials.
The thing is, everyone hates time-and-materials billing. The client hates the uncertainty of not knowing how much something is going to cost; technical staff hate tracking their time at the needed level of granularity; finance and administrative staff hate chasing down the hourly numbers from staff and creating bespoke bills for the engagement.
Everyone hating it would maybe be ok if it generated amazing results, but it doesn’t. It erodes professional relationships, and has well-understood pathological incentives which include trapping a team in a particular way of working.
Time and materials billing erodes a trusting working relationship. Instead of focussing on the work, now the client feels that they have to oversee the project, keeping an eye on the line items (“Did it really take three hours to do this? Why didn’t you just..?”). And of course they feel that way. That’s how the engagement is structured. That’s the role they’ve been placed in by how this service is being offered.
And the pathological incentives of billing for time are pretty widely understood. I can’t even count how many times I’ve seen variations on the following scenario: a new tool (piece of equipment, software, external service provider) would make it possible to deliver a service more efficiently, making things better for the staff (less repetitive work), researchers (faster results), and institution (more staff time free to develop other services and support research in other ways). Unfortunately, the initial investment for the tool isn’t an eligible expense under the grant. For a team charging fixed fees, buying the tool anyway is a no-brainer; a clear business case can be made, everyone’s better off. For a team charging hourly, not buying the tool is a no-brainer. There’d be up-front costs they’d have to cover and their chargeable hours would drop - their budget would take two hits. Can’t be done.
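To make the arithmetic concrete, here’s a toy back-of-the-envelope sketch; the tool cost, hourly rate, and hours saved are invented numbers, purely for illustration:

```python
# Entirely hypothetical numbers, just to illustrate the incentive gap
tool_cost = 10_000            # up-front purchase, not grant-eligible
hours_saved_per_year = 200    # staff time the tool frees up each year
hourly_rate = 100             # what an hour of staff time earns (or is worth) to the team

# Fixed-fee team: revenue is unchanged, and the freed-up hours can go to new work
fixed_fee_year_one = hours_saved_per_year * hourly_rate - tool_cost

# Hourly-billing team: billable hours drop AND the tool still has to be paid for
hourly_year_one = -(hours_saved_per_year * hourly_rate) - tool_cost

print(f"Fixed-fee team, year one:      {fixed_fee_year_one:+,}")   # +10,000
print(f"Hourly-billing team, year one: {hourly_year_one:+,}")      # -30,000
```

Same tool, same work, opposite business cases - that’s the “two hits” in action.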
Billing for fixed short periods of time, like hourly or even daily billing, forces a team’s hand against making efficiency gains. It actively penalizes getting more efficient at doing something. It makes implementation details of how the work gets done as important as the actual service provided and the impact of the work. It’s the perfect analogue to the completely backwards “utilization is the most important metric” approach to running systems, but for running a team: measuring how “utilized” someone is with billable work instead of their impact on institutional research or R&D.
Back in #128 I talked about bundling up the expertise in our teams and exposing it to researchers as products, productized services, or open-ended custom engagements, as in the diagram below. Having a spectrum of offerings not only makes it easier for incoming staff to have career paths where they can grow, it makes it easier to turn initial rare novel engagements into increasingly efficiently delivered services.
On one end of the involvement spectrum is using the team’s expertise to execute on efficient procedural work. That work lends itself to products: turnkey, cookie-cutter operations that (as far as the client is concerned) could be automated. Upload some data, get a result in a week; get a fixed amount of storage; spin up a database; deliver one of a fixed set of trainings. Something the team has done a bunch of times before and has down to a science.
Somewhere in the middle is the productized service — think of a pre-written statement of work with some fill-in-the-blanks for a particular engagement. Something the team has developed an SOP for over time. An initial consult for a data analysis project or experimental design. Architectural review. Grant-writing support. Customizing a data collection web app, within fixed parameters. Each one is a little different in the details, but there’s a reproducible process with a clear deliverable.
For both products and productized services, some kind of fixed price is natural and beneficial for everyone. I like to distinguish between fixed prices the client can just look up (“catalog price”) and those given after some discovery exercise to understand the problem (“upfront price”). In general, a catalog price is better all around if you’ve done this enough times to be confident in its reproducibility. For productized services that are still new, or for a client that wants a big change to the usual scope, though, doing the fixed price case-by-case can be useful.
Note that these fixed-price approaches put the risk of cost overrun on your team rather than on the researcher. It’s uncomfortable to hear this, but that’s good for science. That’s where the risk should be. Your team is the one that’s done this kind of work before; it’s the one that should best understand what’s involved. This encourages careful scoping and clear expectations around the product or service, which is also good for everyone. It also encourages doing a bit of stage-gating for risk management (e.g. a mandatory initial data quality step before any data analysis engagement, or a mandatory scope discovery engagement before the custom web app engagement). Again, good all around. Note that the fee for the product or productized service will need to incorporate the amortized cost of the risk you’re taking off the researcher’s shoulders.
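As a rough sketch of what amortizing that risk into a catalog price might look like - the hours, rates, and overrun numbers below are all made up for illustration:

```python
# Hypothetical figures for a productized data-analysis engagement
hourly_cost = 90           # fully-loaded internal cost per staff hour
typical_hours = 40         # what the engagement usually takes
overrun_fraction = 0.2     # how often an engagement blows past the typical effort
overrun_extra_hours = 30   # additional effort when it does

# Expected cost per engagement, averaged over many engagements
expected_cost = hourly_cost * (typical_hours + overrun_fraction * overrun_extra_hours)

# Catalog fee: expected cost plus a margin for reinvestment, rounded for publishing
catalog_fee = round(expected_cost * 1.15, -2)

print(f"Expected cost per engagement: ${expected_cost:,.0f}")   # $4,140
print(f"Published catalog fee:        ${catalog_fee:,.0f}")     # $4,800
```

Individual engagements will come in over or under, but averaged over many, the team - not the researcher - absorbs the variance, which is exactly the point.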
On the far end of the involvement scale are engagements for which there’s no SOP yet - more open-ended engagements. These can be really exciting - embedded team members working with novel approaches. Depending on the kind of work, some kind of upfront price may be possible, but often not. Another approach, especially in some sort of advisory role (overseeing some trainees doing some technical work), is some kind of (non-exclusive) monthly retainer fee for being available for meetings and to offer advice and help as needed, subject to some parameters.
It’s possible to implement something a bit like staff augmentation in this category - rent a data scientist/software developer/etc. for some project, have them embedded in the team, and charge by the hour. This can be the most useful service you can offer in some situations, particularly when there’s completely new work being done! But it’s a slippery slope. Once you start routinely offering work in this mode, especially billing for time, it is way too easy for a team to get sucked into doing this more and more until it eats the entire practice, leaving room for nothing else. It can be manageable, but it takes real effort to maintain other forms of engagement in parallel with this model.
Why should that matter? Remember that the purpose of all of our teams is to have as big an impact on research in our institutions and communities as possible. Unintentionally ending up doing mostly staff augmentation work actively hinders us in making a large impact, just as falling into the trap of being a utility does! Once it becomes a common offering, staff aug work generally stops being “cutting-edge expertise”, because by definition the vast majority of researcher needs for our staff aren’t cutting edge. It ends up being routine, commodity, agency work, particularly when we’re billing for hours and materials. If we’re not actively trying to build other kinds of engagements, our real experts we were so happy to hire start getting bored and looking elsewhere.
Even if it was fairly cutting edge, the impact we make this way doesn’t scale - we aren’t building reusable things, we aren’t developing new knowledge, we’re just building one-offs. We want to have as much impact on research as possible, so this should worry us.
In addition, if we’re mostly doing staff augmentation, there’s no obvious career growth ladder for our team other than as project managers (which is fine, but it’s only one of a number of options that should be available to them). All we can offer them is whatever the next need is, which is probably routine and probably pretty similar to what they were doing.
And this is a bit of a perfect storm when combined with the issues of hourly billing. Hourly billing doesn’t ask much of us - it’s easy, and it pushes the risk to the researchers - but it doesn’t leave us any room to build or buy things that would make the team more efficient, or grow their capabilities.
There’s also a combination of effects on researcher relationships. Once researchers come to see a team as temporary hired hands for routine work, they’re going to have a hard time seeing more than that. The team starts to lose its ability to develop into a centre of excellence, where they come for advice or to train their own staff, or where their trainees could start to think seriously of going after their studies.
There’s nothing wrong with the staff augmentation model, when carefully and intentionally managed. But I’ve seen it spiral out of control too often.
We want to build a centre of unique expertise, and have broad impact on research by building tools, solutions, and procedures that can be efficiently and scalably applied. That means productization, it means preferring knowledge transfer and advising/overseeing to hands-on routine delivery work, it means focussing on impact over utilization of inputs, it means focusing on short engagements that produce new knowledge or reusable results, and it means being wary of the traps of time-and-materials and staff augmentation/embedding models.
Resources I really like for this are:
- Managing the Professional Services Firm - David H. Maister; the second half of the book, on professional service firm governance, doesn’t really apply, and you’ll want to read “profit” as “sustainability” throughout, but it’s a great overview of how professional services firms think about their offerings.
- I like the diagram in this HBR article as a heavily beefed-up version of the chart above
- Here’s a review paper on productizing services.
And now, on to the roundup!
Managing Teams
Every Achievement Has a Denominator - Charity Majors
After an amazing run at Mars, India says its orbiter has no more fuel - Eric Berger, Ars Technica
Our teams tend to read articles and compare ourselves unfavourably with others in tech, or with those at huge institutions. That’s wildly unfair. Our teams typically punch well above their weight. As Majors points out, every achievement has a denominator. What matters is the impact we have for the investment our institutions or communities have put into us.
Relatedly, Berger’s article. From one set of teams that’s expected to have outsized science and tech impacts on a shoestring budget to another, our collective hats off to the Indian Space Research Organization as their Mars Orbiter ends its mission after eight years in Mars orbit, on a total budget of just 25 million dollars (!!)
Research: Men Are Worse Allies Than They Think - Smith, Johnson, Lee, and Thebeau, HBR
A gentle reminder that we all pretty much always see ourselves as doing a pretty good job, regardless of whether or not that’s true. Presented here are the results of a large study focussing on the support of women in the workplace.
We men consistently rated men as being better allies or public advocates for gender equality than women did, and men reported seeing typical discriminatory behaviour against women (speaking over them, not giving them credit for their contributions, being asked to do clerical work, questioning their emotional state, dismissiveness….) at startlingly lower levels than women saw them.
What Makes a Great Manager? - Abi Noda
What Makes a Great Manager of Software Engineers? - Kalliamvakou et al, IEEE Transactions on Software Engineering
Noda writes up a blog post about a mixed-methods study of a set of developers and managers at Microsoft by Kalliamvakou et al. There’s a lot going on here, and it’s a bit difficult to summarize. The results end up being consistent with the Google Project Oxygen work I cite frequently, but they came at things quite differently.
However, the research found that while a “sufficient level of technical knowledge is necessary”, technical proficiency ranked close to the bottom in terms of importance compared to other attributes.
I think the key here is that they need enough technical knowledge to perform their facilitation and support work, and depending on the team (and whether or not there’s a strong team lead) that may not have to be very much.
I also like how they distilled the important attributes down to this diagram, which is consistent with themes you’ve read here over the past years:
Technical Leadership
Use standups to spark collaboration (instead of wasting your time) - Harriet Wilmott
I’ve seen this same dynamic play out on a number of teams with standups, but also with one-on-ones. Using precious synchronous-meeting time to go through rote status updates is a scandalous use of resources. Status updates can be given asynchronously. If people are getting together for meetings, it should be for some more important purpose.
Wilmott suggests that good outcomes for a regular standup could be something like these:
- Someone is working on a problem someone else can help with, and that conversation is started
- Something’s taking longer than expected, and the team talks about how to adjust the plan
- Better shared awareness on the team about what’s happening
- Regular team interactions which make people more willing to talk to each other and ask for help
And if that’s the case, the meeting should be designed to support those goals. Routine status updates can be given in a tool, and then the standup can be used to have higher-level discussions that take those as inputs. There are a bunch of useful tips (including troubleshooting ideas) in the article, and it’s worth reading.
Does Stress Impact Technical Interview Performance? - Chris Parnin
In #33 we saw the paper by Behroozi et al, with authors including Parnin, which demonstrated that the canonical whiteboard technical interview tested for how stressful the candidate found the situation rather than for technical acumen, and that there was a strong disparate impact on women and people of color.
In this follow-on post which I missed at the time, there was one simple way to greatly improve matters - just let the candidate solve the problem in private in a closed room.
Managing Your Own Career
Why Some Feedback Hurts (and What To Do About It) - Ed Batista
We talked about getting better at getting feedback recently, but I didn’t really cover the emotional component of it - getting feedback can hurt in the moment. And not only is that no fun, if you’re not careful you can react poorly in the moment, which can damage the likelihood of getting more feedback in the future.
Batista talks about this at some length here; some kinds of feedback reliably trigger emotional reactions. There are a few models for what causes this: one is that the feedback calls into question some bit of our identity, or the relationship with the person giving it, or it feels untrue so we feel indignant. Another model is that it attacks something about our status, certainty, autonomy, relatedness to the other person, or sense of fairness (SCARF).
Being aware of this, and ready for it, and paying attention to your feelings can help. As can postponing any immediate reaction to the content of the feedback, and accepting that the feedback is data and you don’t necessarily need to react to every data point. Batista gives some more suggestions in this article.
Build Your Career on Dirty Work - Stay SaaSy
In #70, we looked at Kaplan-Moss’ article, Embrace the Grind, where just doing the dirty work that no one else wanted to do could suddenly unlock a lot of possibilities. This article points out some of the other advantages of doing the dirty work - because no one else is willing to step up, there’s probably lots of low-hanging fruit to make improvements, and it can have huge impact.
If there’s something holding back your team, or your community, rolling up your sleeves and just grinding through the dirty work and improving things along the way can be good for the team, the community, and your own career.
Research Software Development
A Flexible Framework for Effective Pair Programming - Raymond Chung, Shopify Engineering
Shopify has thought a lot about pair programming; it’s a key part of mentoring juniors there, and they use it elsewhere in the company as well. This article describes how they think about pair programming. The different styles:
- Driver and Navigator
- Tour Guide
- Unstructured
And the different kinds of activities
- Code reviews
- Technical design and documentation
- Writing test cases
- Onboarding
- Bug-hunting
In any combination of cases, setting an agenda and some rules of engagement, and communicating well (including open-ended questions and positive statements) really helps.
Research Data Management and Analysis
Sharing GWAS summary statistics results in more citations: evidence from the GWAS catalog - Guillermo Reales, Chris Wallace
Another piece of evidence that sharing data is good for the data sharers:
We found that sharers get on average ~75% more citations, independently of journal of publication and impact factor, and that this effect is sustained over time.
The struggles of building a Feed Reader - Jack Evans
A recurring theme here is that software, data management, and systems are inextricably linked, and can’t be considered independently. Evans describes the challenges of implementing a mature technology - an RSS/Atom feed reader. The biggest challenge? Messy data in responses.
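To give a flavour of what that messiness looks like in practice, here’s a minimal defensive-parsing sketch using the Python feedparser library (the feed URL is a placeholder); real feeds routinely omit or garble dates and titles, so almost every field needs a fallback:

```python
import calendar
from datetime import datetime, timezone

import feedparser  # third-party library: pip install feedparser

def entry_timestamp(entry):
    """Feeds often omit or mangle dates; try the usual fields, then give up gracefully."""
    for key in ("published_parsed", "updated_parsed"):
        parsed = entry.get(key)
        if parsed:
            return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)
    return None  # caller decides how to order undated entries

feed = feedparser.parse("https://example.org/feed.xml")  # placeholder URL
for entry in feed.entries:
    title = entry.get("title", "(untitled)")
    print(entry_timestamp(entry), title, entry.get("link", ""))
```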
Emerging Technologies and Practices
Discovering novel algorithms with AlphaTensor - DeepMind Blog
This is a pretty cool result - by turning the search for efficient calculation schemes into a game, the folks at Alphabet’s DeepMind used AlphaZero to find a sub-block matrix-multiplication algorithm that beats Strassen’s matrix multiplication algorithm (which held the record for 50 years), and found versions for larger matrix blocks. Even cooler, it could create variants tuned for particular hardware.
This isn’t (despite Alphabet’s inevitable hype) the first time deep learning methods have been turned to linear algebra, nor is the result likely to be super useful in itself (no one implements Strassen’s algorithm in practice, after all). But beating the Strassen algorithm is a genuine applied math milestone, and it does suggest the possibility of having deep-learning powered compilation tools help make significant changes to how some computations are performed.
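For context on what AlphaTensor is searching over: Strassen’s trick replaces the eight block multiplications of the naive 2x2 block scheme with seven, at the cost of extra additions, and AlphaTensor hunts for analogous decompositions. Here’s a one-level NumPy sketch of Strassen’s scheme (illustrative only - nobody’s production BLAS looks like this):

```python
import numpy as np

def strassen_one_level(A, B):
    """One level of Strassen's algorithm: 7 block multiplications instead of 8."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]

    # Strassen's seven block products (the naive scheme needs eight)
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    # Reassemble the four blocks of the product
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4,           M1 - M2 + M3 + M6]])

A, B = np.random.rand(8, 8), np.random.rand(8, 8)
assert np.allclose(strassen_one_level(A, B), A @ B)
```

AlphaTensor’s contribution is finding decompositions like this with fewer multiplications for larger block sizes and particular arithmetics, which is also where the hardware-specific variants come in.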
Bath University adopts cloud HPC - Scientific Computing World
Nimbus technical specifications - University of Bath HPC Support Team
We’ll likely see more of this, even for institutions which have significant on-prem resources; Bath has adopted Azure for most HPC, with everyone getting sizable amounts of storage, CycleCloud being used to provide a standard cluster-like experience for those who want it, and compute allocations available through departments, through institutional resource allocation competitions, or purchased through grants.
I’ll be really interested in following this to see how it actually works and what the pain points are.
Random
Adding hyperlinks to images of handwriting.
Interesting article describing the challenges game development has with using Git - huge asset sizes, nontechnical stakeholders as key contributors and reviewers, compliance testing, and proprietary data formats all make git or other “standard” software version control systems a poor fit.
A terminal-based emoji picker: smoji.
A nice illustrated tutorial on how Stable Diffusion works.
Course materials for a course on full-stack deep learning, which gives all the messy parts around the fancy model - data management, deployment, project management, development infrastructure, and troubleshooting - their due.
Making the point that Software Bills of Materials is a good start, but given the dynamic nature of software, non-deterministic build systems, and the lack of clear attestation mechanisms, it is only the beginning.
Intel is moving towards large-scale quantum chip production, which explains why Sapphire Rapids is stuck in a superposition of simultaneously shipping and not shipping.
Building 32 bits of magnetic core memory.
That’s it…
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
About This Newsletter
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.
Jobs Leading Research Computing Teams
This week’s new-listing highlights are below; the full listing of 190 jobs is, as ever, available on the job board.
Data Deidentification Data Analysis Manager - PwC, Remote USA
As a Manager, you’ll work as part of a team of problem solvers, helping to solve complex business issues from strategy to execution. Act to resolve issues which prevent effective team working, even during times of change and uncertainty. Coach others and encourage them to take ownership of their development. In this role, the Data Analyst will be working with Practitioners and functional teams to onboard data into the appropriate analytic data models; managing the technical elements of the data de-identification process; validating that deidentification and aggregation techniques have properly anonymized content; and ensuring that the business views of the data meet the business needs of analytic end users.
Product Manager - HPC Storage System - Lustre - HPE, San Jose CA or Longmont CO or Minneapolis MN or Spring TX USA
Develop and own the product strategy for HPC Storage at HPE; lead project execution and initiate product improvements to meet business goals. Work closely with engineering, executive management and other stakeholders across the organization to cultivate and drive the product strategy and roadmap for your product. Drive Executive approval of plans for the HPC Storage product lifecycle from concept to launch through management of people and projects.
Manager Research Computing Support Services - Northwestern University, Evanston IL USA
As Manager, Research Computing Support Services, you will manage and lead a team of computational consultants/facilitators in delivering computational support and services to researchers across Northwestern. These services include providing documentation, training, consultations, support to researchers, and partnership on research projects requiring the use of high-performance computing, high-memory analytics environments, and cloud resources. Through active partnership with the research community and staff, you will capture researcher needs, ensure alignment of services with Northwestern’s research goals, and oversee service improvement projects alongside other IT units. You will be responsible for training efforts to advance researchers’ skills in using computational services. You will manage and lead a team of computational consultants/facilitators in delivering computational support and services to researchers across Northwestern, while guiding professional development and delivering regular feedback. You will partner with colleagues and oversee service improvement projects. Additionally, you will identify and measure performance indicators of computational research support services
Director - Data Science - Chan Zuckerberg Biohub, San Francisco CA USA
We are seeking an outstanding Director, Data Science to join our Platform team and lead the Data Science team, reporting to the President of CZ Biohub. The Director of Data Science is responsible for directing the efforts of a dynamic team of approximately seventeen data scientists, in coordination with the Biohub’s internal research groups and technology platforms, including quantitative cell science, infectious disease, genomics, mass spectrometry, bioengineering, and computational microscopy.
Software Engineering Team Lead - RStudio, Remote USA
We are seeking a Software Engineer leader to join the RStudio Connect product team. Specifically, the team that ensures Connect integrates well with the diverse and cloud-centric IT environments our customers operate in. Help us make the product setup and configuration easier, integrate with different IAM protocols, increase observability, and add features to better leverage cloud infrastructure primitives.
Bioinformatics Manager - Moderna, Cambridge MA USA
Moderna is seeking a highly motivated bioinformatician with project management experience to work as part of a highly collaborative, multi-disciplinary, and fast-paced team. The successful applicant will design mRNA sequences computationally using our state-of-the-art sequence engineering pipeline and oversee a team to provide mRNA bioinformatics support to our platform and therapeutic programs.
Director of Engineering, Healthcare Research - Flatiron, New York City NY or San Francisco CA or Remote USA
In this role, you will lead an initiative, made up of several engineering teams, to equip physicians with the information and tools to deliver high quality care to America’s oncology patients. Reporting to the VP of HC Engineering, you will provide technical management and guidance while working cross functionally to set the direction of the business line.
Project Manager – Bioinformatics, Aparicio Laboratory - University of British Columbia, Vancouver BC CA
The post holder will support and work closely with Dr. Samuel Aparicio, leader of Molecular Oncology, UBC Nan and Lorraine Robertson Chair of Breast Cancer, in managing breast cancer research projects, coordinating international multi-PI research studies and interacting with funders and advocates in reporting the activities of the biology and drug development program. They will also support the bioinformatics activities of Dr. Aparicio’s research program. Preference will be given to candidates with experience in bioinformatics and cancer research. Ability to articulate science clearly in a written form and a high degree of independence and self organization are core skills for this post. More information on Dr. Aparicio’s research group can be found here: https://aparicio.molonc.ca.
Research Software Engineering Manager - University of Leeds, Leeds UK
As the Research Software Engineering Manager, you will own and be accountable for IT’s strategic partnership with researchers across campus. You will build, develop and lead an established team of RSEs to meet the strategic challenges of our research partners; engaging with stakeholders from across the University including researchers, educators and other teams in IT. The existing portfolio includes a wide range of research domains including Artificial Intelligence, Bioinformatics, Computational Fluid Dynamics, and Data Science, and this is expected to expand in the coming years into a wide range of emerging research areas.
Software Engineering Senior Manager - Dassault Systemes, Brisbane AU
As a Software Engineering Senior Manager, you will work as part of a global software team; collaborating with geologists, mining engineers and application owners to design and implement software solutions for the mining industry.
Research Computing Manager - Swarthmore College, Swarthmore PA USA
The Research Computing Manager serves as a technical expert on the College’s research and high-performance computing infrastructure and is responsible for maintaining existing capabilities, developing new functionality, and supporting faculty and student work. The Research Computing Manager will maintain core systems and software along with supporting the specialized needs of campus researchers and instructors via individual consultations and classroom presentations. This position will collaborate closely with faculty, staff, students, and partners at other institutions to provide broad technology support for operational issues, research, and teaching.
Director Advanced Data and Storage Management (HPC) - Princeton University, Princeton NJ
The Director of Advanced Data and Storage Management reports to the Associate CIO for Research Computing and manages the group responsible for the vision, design and support of data storage and management for advancing innovative research at Princeton University. This role will provide leadership to the implementation and support of the TigerData service, a comprehensive set of data storage and management tools and services that provides storage capacity, reliability, functionality, and performance to campus. To successfully implement TigerData, this role will closely partner with the Director of Research Data and Open Scholarship in the Princeton University Library.