Research Computing Teams Link Roundup, 18 Dec 2020
Congratulations, everyone; we did it.
If you’re at a University, 2020 is now or is soon to be officially at a close. For the rest of us, while there is some work remaining to be done, things are winding down. We made it to the end of 20-frickin’-20.
I started this newsletter together with you in January, which seems like a decade ago. It’s been a hard year for our teams, and a hard year for managers. We’ve had to keep things together and moving with the world falling apart; help team members through incredibly tough times and keep the research and researchers who depend on us going.
We’ve done incredible work, and because of what we’ve done as managers our teams are going to come out in late 2021 stronger than when this started. The trust we’ve built with our teams by seeing them through the tough times will make the team work even better together. Our improving and making more intentional our team communications, necessary for the abrupt move to all-distributed, will be of benefit in the years to come. Our upping our management and prioritization skills will help our team through whatever future challenges come our way.
I’ve learned a lot writing this, and I hope it’s been helpful to you to read this over the past year. Over 100 of you have subscribed and stayed subscribed through some mis-steps and mis-fires, and I appreciate it and your feedback. From your feedback I’ve learned:
- Almost all of us were flung into management without any sort of preparation
- Research computing teams are awash in technical information but almost nothing about any of the other aspects of running our teams, providing service to our researchers, or advancing our own careers
- There’s disappointingly little information out there about topics like funding in the very specific context of research computing
- Organizations are still trying to figure out how to organize research computing teams in their institution - centralized, embedded, distributed, or some combination thereof, and business models for same are even more confused
- Hearing each other’s questions during Ask Managers Anything was really popular, and we’ll revisit that next year
I think we all more than deserve a couple of weeks’ worth of rest. I’ll be back January 8th, recharged and ready, and I hope you will be too.
And now, without further ado, the last link roundup of 2020!
Managing Teams
Say “No” to Triangulated Feedback - Esther Derby
This one hits a little close to home this week.
Derby's article talks about the perils of "triangulated" feedback - team member A tells you something about team member B and you bring it to team member B. A team is a group of people who are accountable to each other in working to a common goal. By being a cut-out in these accountability conversations we short-circuit the conversations the team needs to have, erode trust, and give ourselves a bunch of worse-than-meaningless busywork to boot.
The reason this hits home is that last week I actively made a situation between two team members significantly worse by inserting myself too soon and too forcefully. I felt like I was doing the right thing - my intense personal preference is to avoid these situations, so taking decisive and immediate action "seemed" right to me - but it absolutely was not. In this particular situation, monitoring and checking in after a day or so would have worked much better. To be clear, being completely uninvolved is not a solution either, but we can't be a first- or even second-choice option for dealing with between-team-member conflicts except in extraordinary circumstances.
Why Capable People Are Reluctant to Lead - Chen Zhang, Jennifer D. Nahrgang, Susan J. Ashford, and D. Scott DeRue, HBR
In a study of 400 MBA students, three risks held people back from stepping up to lead, in projects or in taking decisive action:
- Interpersonal - risking friendship/collegial relationships
- Image - "I don't want to seem like a know-it-all"
- Accountability - "I'll be blamed for bad results"
As we try to make sure our team members have growth opportunities, and increase their scope by giving them more responsibility and projects to manage, these are the key concerns they are likely to have (or, frankly, we are likely to have as we grow in our careers). To mitigate this, the authors suggest:
- Go the extra mile to support risk-sensitive team members
- Manage conflict, and treat respectful conflict when it happens not as a catastrophe but a normal part of humans working together on things they care about
- Give (or take!) increasing responsibility in small manageable increments
Managing Your Own Career
Writing a Performance Self Review for Software Engineers - With an Example - Gergely Orosz
Orosz writes this in the context of getting ready for a performance review by your boss. Many of us don't have such reviews in any meaningful way. But this is a terrific process to go through every quarter or so, for yourself and for your team as a whole. It'll help you communicate your value to your boss, yes, but also your team's value to administration, potential new collaborators, and external stakeholders; it'll help you focus on the positive accomplishments and identify areas for improvement; and, more or less for free, it'll help you keep your resume up to date for that next opportunity.
Orosz suggests focussing on:
- Goals/expectations for the period ahead, and comparing to those from the period behind
- List your accomplishments (if you do this in succinct bullet points they also become items you can put on your evolving resume and LinkedIn page)
- Talk about the "how" - how you work with people, examples of helping people/teams, etc.
- Reflect on competencies
Tech Lead Management roles are a trap. - Will Larson
When I was asked at my SORSE talk if it was possible to be both lead developer and manager, I replied that anything was possible but it is really, really hard. The most stressed I've been in the last couple of years was when I've had both significant technical and managerial responsibilities - they are completely different skillsets requiring your mind to be in different kinds of places. Bouncing between the two is definitely playing the game in hard mode.
Larson agrees, especially for new managers:
The reality is that when you're trying to learn something brand new, like team management in this case, you're almost always going to be better off getting to focus on that area.
Product Management and Working with Research Communities
Collaborating with Someone You Don’t Really Know - Rebecca Zucker, HBR
Starting projects with new collaborators is a pretty integral part of our line of work. Zucker suggests clarifying things right from the start - these conversations can be a bit awkward but go way easier at the start of a working relationship. The good news is that I think our experience as managers makes it simpler:
- What are our goals and process for this project?
- Who will do what, and by when?
- What are our individual preferred working styles and strengths?
- When and how will we give each other feedback on our working relationship?
- What do we need from each other to do our best work?
I honestly think I wouldn't have been able to answer my part of the last three questions clearly before having worked as a manager for a while. It's only been since I've been working with colleagues with very different working styles, and making it my job to understand and accommodate those working styles, that I can actually recognize my own.
Introducing Cloudflare Pages: the best way to build JAMstack websites - Rita Kozlov, Cloudflare
We all need webpages for our organizations, teams, and projects. For a lot of us in the technical world, JAMstack solutions like Jekyll make the most sense - they rely on the tools we use every day. But if you don't want the pages to be absurdly slow, you can't just rely on GitHub Pages. Our team uses a slightly Rube Goldberg contraption involving GitHub, Travis-CI, AWS S3, AWS CloudFront, AWS Route 53, and AWS Certificate Management. It works really well, and is super cheap, but honestly was a pain to set up.
Cloudflare has a beta now for Cloudflare Pages which cuts out the middleman and builds and pushes your JAMstack page directly to their CDN. Future versions will even support dynamic webpages using Cloudflare Workers (which I've been meaning to play with). Worth taking a look at if you don't already have a solution you like.
COVID-19 Teaching Experience: 17-313 Foundations of Software Engineering - Christopher Meiklejohn
A review of online learning in 2020 - Tony Bates
In the past year, a lot of you have learned a lot about online teaching and training, and those hard-won lessons are things that we'll all be able to build on in coming years. These two articles reflect on some lessons learned.
Meiklejohn gives some very hands-on reflections on teaching a software development course in a hybrid mode over the past year. The main takeaways are that it can be done, and done successfully, although rejigging in-person materials to work well with online tools is hard.
Bates' article outlines 10 higher level lessons learned, each of which has its own short article:
- Online and blended learning will increase substantially post-COVID-19
- Support for instructors is essential for quality online learning
- We know how to do quality online and blended learning, but we can also learn from emergency online learning
- COVID-19 showed the need for more flexible assessment methods
- COVID-19 resulted in innovative teaching, but will it stick?
- We are beginning to see the advantages of media and open educational resources for teaching and learning
- More attention needs to be paid to online access and equity
- We need more flexible learning spaces
- Lessons learned for administrators
- We need more (and better) data
The structure and interpretation of scientific models - Konrad Hinsen
You don't hear it as much now but it was everywhere in the early 2000s - a new pillar of science, not just theory and experiment but computation! And then came data! And machine learning came around too, and eventually there were pillars everywhere and the whole thing seemed kind of silly.
Hinsen makes a better distinction, between observations and models, and those models can be empirical, or - the real purpose of science - explanatory.
Research Software Development
DRY is a Trade-Off - Moshe Zadka
A reminder that everything in our work is tradeoffs, and even really good "rules" are bad when applied too dogmatically. Having some partially-repeated code in your project is way better than forcing yourself into the wrong abstraction too early.
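As a rough illustration of that trade-off (this is not an example from Zadka's article; the function names and report formats below are made up for the sketch), here are two small formatters that repeat most of their lines, next to a merged version that removes the repetition at the cost of extra parameters:

```python
# Hypothetical example: two report formatters that share most of their code.
# The duplication is visible, but each function is trivial to read and to
# change on its own.
def format_usage_report(rows):
    lines = ["user,cpu_hours"]
    for user, hours in rows:
        lines.append(f"{user},{hours:.1f}")
    return "\n".join(lines)


def format_storage_report(rows):
    lines = ["user,terabytes"]
    for user, terabytes in rows:
        lines.append(f"{user},{terabytes:.2f}")
    return "\n".join(lines)


# The "DRY" version: no repeated lines, but every difference between the
# two reports is now a parameter that each caller has to supply correctly.
def format_report(rows, header, precision):
    lines = [f"user,{header}"]
    for user, value in rows:
        lines.append(f"{user},{value:.{precision}f}")
    return "\n".join(lines)
```

Neither version is wrong. If the two reports later diverge (say one gains a totals row and the other a units column), the merged function tends to accumulate flags, which is the "wrong abstraction too early" risk; if they stay in lockstep, the merged version is probably the better choice.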
What I Wish Someone Had Told Me About Tensor Computation Libraries - George Ho
This is a really high level overview and categorization of array computation libraries (I'm sorry, as someone who just barely passed a general relativity course I just can't call them Tensor computation libraries) like PyTorch, TensorFlow, Theano, JAX, and others. He categorizes them by how they perform the three key classes of functionality of such libraries:
- Defining a computational graph
- Performing operations on the graph
- Optimizing the execution of the graph
Ho wrote a similar categorization and explanation of probabilistic programming languages like Stan and PyMC4 which also looks good.
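To make the three classes of functionality listed above a little more concrete, here is a toy sketch in plain Python. It is not how PyTorch, TensorFlow, Theano, or JAX are implemented, and every class and function name in it is invented for illustration. It assumes "performing operations on the graph" means things like evaluation and differentiation, and uses constant folding as a stand-in for optimization.

```python
from dataclasses import dataclass
from typing import Union

# 1. Defining a computational graph: nodes are small immutable records.
@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: "Node"
    right: "Node"

@dataclass(frozen=True)
class Mul:
    left: "Node"
    right: "Node"

Node = Union[Const, Var, Add, Mul]

# 2. Performing operations on the graph: evaluation and differentiation.
def evaluate(node, env):
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    left, right = evaluate(node.left, env), evaluate(node.right, env)
    return left + right if isinstance(node, Add) else left * right

def grad(node, wrt):
    """Build a new graph for d(node)/d(wrt) using the sum and product rules."""
    if isinstance(node, Const):
        return Const(0.0)
    if isinstance(node, Var):
        return Const(1.0 if node.name == wrt else 0.0)
    if isinstance(node, Add):
        return Add(grad(node.left, wrt), grad(node.right, wrt))
    return Add(Mul(grad(node.left, wrt), node.right),
               Mul(node.left, grad(node.right, wrt)))

# 3. Optimizing execution: fold subgraphs made only of constants.
def fold_constants(node):
    if isinstance(node, (Const, Var)):
        return node
    left, right = fold_constants(node.left), fold_constants(node.right)
    if isinstance(left, Const) and isinstance(right, Const):
        return Const(evaluate(type(node)(left, right), {}))
    return type(node)(left, right)

# Usage: the graph for 2*3 + x
graph = Add(Mul(Const(2.0), Const(3.0)), Var("x"))
print(evaluate(graph, {"x": 1.0}))       # 7.0
print(fold_constants(graph))             # the 2*3 subgraph becomes Const(6.0)
print(fold_constants(grad(graph, "x")))  # Const(value=1.0)
```

Real libraries differ mainly in when the graph gets built (ahead of time or traced while the code runs) and in how aggressive the optimization step is, which is roughly the axis Ho's categorization follows.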
d’Oh My Zsh - Robby Russell
This isn't a new article, but it crossed my browser the other day, and it's a really nice overview of how a software project became a huge success despite incredibly modest beginnings - initially not much more than a list of dot files - by being useful right from the beginning and by very judiciously choosing additional functionality to implement.
In defense of blub studies - Ben Kuhn
A pretty central tenet of this newsletter is that there's no magic to running research computing teams, it's all just about being deliberate and focussing on the basics. Kuhn makes a similar argument for developing software - know your tools deeply and well.
Research Computing Systems
Where do I go now that CentOS Linux is gone? Check our list - Jim Salter, Ars Technica
Ars Technica, of all places, has a good roundup of CentOS alternatives including some I hadn't heard of such as Springdale Linux and HPE ClearOS.
Emerging Data & Infrastructure Tools
GitOps Decisions - Ian Miell
Another central tenet of this newsletter is that a lot of tasks that seem like a lot of work are just making explicit a bunch of decisions that were previously implicit. And making things clearer and more explicit is a pretty central job of a manager.
Miell describes the process of moving to gitops with even a quite simple setup. There's a dizzying number of decisions which have to be made and implemented - how many repos, where does config live, how do you handle multiple environments, etc. But these aren't new decisions, they're decisions that were previously made implicitly, and by going through the work of making things explicit you have everything documented and automated, making development, deployment, and onboarding new team members far easier.
Events: Conferences, Training
2021 International Workshop on Software Engineering for Computational Science - June 16-18, 2021, papers due early February 2021
We seek contributions from members of both communities that describe perspectives, research outcomes, and lessons learned (positive or negative) from the development of computational science software.
Random
A safe, minimal, Bash script template.
A really nice self-guided advanced compiler course from Cornell.
Even the things we take most for granted in computing weren't obvious at the time and needed to be invented. It took almost a decade to invent the else clause in "if-then-else", and it wasn't for lack of trying.
Nomad 1.0 for those with applications that are outgrowing docker compose but for whom kubernetes would be overkill (which is most people).
Starting August 2021, GitHub will require authorization tokens (rather than username/password) for all git operations. From an operations point of view it's interesting that they're choosing to schedule two brownout dates in June and July - a "shot across the bow" to make sure people are ready for the hard August end date.
A lot of already delicate lock-free code is going to break when it moves to ARM.
Ever since the inception of the term there’s been a lot of talk of STEM fields. With the importance of social sciences and humanities increasingly clear in the era of social media and now with the pandemic, expect to start hearing about SHAPE fields - Social sciences, Humanities, and the Arts, for People and the Economy.
That’s it…
That’s it for the year.
Take care of yourself and those close to you over the holidays, and congratulations to you and your research computing teams for all you’ve managed to accomplish in the face of all you’ve had to face to accomplish it this year. Be well, and see you in 2021.
Jonathan
Jobs Leading Research Computing Teams
Highlights below; full listing available on the job board.
Quantum Software – Scientific Manager - Cambridge Quantum Computing, Cambridge UK
You will come from a scientific background, but importantly will enjoy working on broader business matters including developing our teams, leading collaborations with our partners and helping to manage the many exciting scientific and operational projects we are working on.
This role will have a lot of variety and be key to the effective functioning of the division. You will be working very closely with the Head of Quantum Software in managing the division, giving your input into the technical and commercial direction. You will also use your knowledge and experience to help coach and support our scientific and technical teams in their delivery.
You will also lead and run various projects and initiatives of your own, which may be scientific, commercial or operational in their nature. In doing so you will look to implement project management techniques to shape and drive projects through their full lifecycle, balancing resource and capability requirements. You will also become a first point of contact to all other business unit heads as they support our development.
Biostatistics & Data Management Director - GSK, Weybridge UK
As the Biostatistics & Data Management Director, you will deliver medical/scientific excellence and dynamic leadership of data management and statistical activities on preclinical and clinical studies, health outcomes and other clinical evidence gathering activities conducted by GSK Consumer Healthcare.
You will lead and manage a team of experts in data management, data programming, biostatistics by providing technical/scientific expertise and dynamic leadership on best operational and leadership practices. You will be interacting with internal teams such as Category, Category Medical, Regional Medical, Clinical Development and GSK Pharma teams, as well as external groups and individuals, such as Investigational Sites, Contract Research Organizations, Health Care Practitioners, Government agencies, Health Care Organizations to help achieve company objectives and to build GSK reputation.
Director, Data Systems - BlueDot, Toronto ON CA
Reporting to our Chief Technology Officer, you will lead a team of enthusiastic and talented data engineers, data scientists and data infrastructure professionals. You will further expand this team and help bring new talent to the organization.
The role requires a strong technical leadership in data governance, modern data ingestion techniques, scale and automation of our data pipelines and cloud native data infrastructures. You will drive technical innovation by utilizing artificial intelligence, natural language processing and related technologies.
You will be responsible for overall design and implementation of advanced data products with a focus on data quality, integrity and security. You will closely collaborate with multidisciplinary BlueDot teams and co-own business outcomes.
Research Computing Manager - Morgan Law (Recruiter), London UK
My University client is looking for a Research Computing Manager, or IT Manager with technical specialism in High Performance computing, working with the Infrastructure Manager to deliver highly available and effective services to their staff & students.
You will have experience of building, managing and maintaining highly available and performant HPC systems requiring in-depth knowledge of advanced IT Infrastructure, compute and storage technologies and research applications.
You will provide support and consultancy for school and departmental hosted services as well as providing third line technical support to the IT Service Desk and Service Delivery teams.
Manager, Scientific Software - Guardant, Redwood City CA USA
Guardant's Molecular Residual Disease (MRD) program is applying cutting-edge methods in next-generation sequencing (NGS) to develop blood-based, non-invasive approaches to detect early stage cancer recurrence using a variety of genomic and epigenomic markers. As a manager within the MRD team, you'll be responsible for leading a cross-functional team of bioinformaticians and software engineers dedicated to building the tools necessary to develop and operationalize clinical-grade assays. Consequently, the ideal candidate will have a strong foundation in bioinformatics principles and a background leading groups responsible for delivering documented, tested, clinical biotech production software.
Senior Infrastructure Engineer, High Performance Computing - Guardant, Redwood City CA USA
Guardant’s HPC team builds and operates the computational technology backbone of the company.
This includes scalable data storage that holds PBs of genomics data, high performance compute clusters running a custom bioinformatics pipeline in production and R&D environments, and the software infrastructure that hosts an ecosystem of services for internal data processing and external data integration. To facilitate Guardant Health’s fast growth in the next few years, the HPC team is looking for a strong technical engineer who can help maintain and help grow the HPC infrastructure during its aggressive expansion, while working with corporate IT, SQA and DevOps/SRE teams.
This role can be remotely worked part-time, but requires a very hands on, on-premise presence when on rotation, minimally.
Manager, Research Cloud Development and Operations - University of Melbourne, Melbourne VIC AU
The Nectar Research Cloud is powered by OpenStack and provides computing resources to researchers across Australia. We are looking for an exceptional person to lead the operations, maintenance, expansion and continual improvement of the Nectar Research Cloud services and manage a diverse team of expert cloud DevOps engineers. The successful candidate should be passionate about the ongoing DevOps practice and running the large distributed system that is the Nectar Research Cloud, enabling the next generation of research capabilities across many research disciplines. The position will liaise with the national Research Cloud node partner operators at remote sites and coordinate their operations.
Head of Research Software Engineering - Exeter University, Exeter UK
We are seeking an experienced leader to head up our new Research Software Engineering (RSE) Group. This is an exciting opportunity to support the further development of the RSE role in UK academia.
RSEs will make a vital contribution to the University’s research portfolio by developing and applying professionally usable software tools to address real-world data science, modelling, simulation, and other challenges across the spectrum of computational and social science. The group will support cutting-edge research, and its impact across the university.
As the Head of Research Software Engineering you will shape the future development of research software engineering skills at the University, and play a pivotal role in delivery of Exeter’s research e-infrastructure more widely.
Senior Research Computing Officer - Durham University, Durham UK
The post-holder is expected to develop and draw upon technical and domain context of the provision of research computing systems, including the Hamilton HPC system, to ensure ARC provides appropriate research computing platforms, through forming strong collegiate research relationships with Academics at all levels across the organisation. The post holder will provide specialist knowledge to colleagues within the research community, consultation and ensure effective knowledge transfer both across the institution and represent the University externally on matters of expertise. The post holder will also have the opportunity to provide and support services ARC provides within the regional/national context, such as the EPSRC tier-2 HPC system “Bede”, work on physical infrastructure deployments, working with teams in CIS, Estates and/or external providers as appropriate.
Product Manager, Health Equity Tracker - Morehouse School of Medicine, Atlanta GA USA
In this role, you will work closely with a small team of engineers to:
• Understand the process by which health equity policy is made
• Engage with and learn from a wide variety of relevant parties including epidemiologists, legal scholars, health care providers, lawmakers and more to continually improve the Health Equity Tracker product
• Devise, plan, execute and launch new features to better serve the needs of policy influencers identifying and solving health inequities
• Balance competing priorities and interests to maximize impact and advance the cause of health equity
• Develop a strategic vision for Health Equity Tracker for COVID-19 and beyond
• Ensure a successful project handoff from the Google.org team
Sr. R&D Staff Software Engineering - Oak Ridge National Lab, Oak Ridge TN USA
We are currently seeking qualified applicants for a Software Engineer Scientist in the Proliferation Detection and Deterrence Section within the Nuclear Nonproliferation Division. The Proliferation Detection and Deterrence Section focuses on the R&D for detection of potential proliferation activities associated with the nuclear fuel cycle; forensics and incident response following a nuclear detonation; engineering, testing, and evaluation of systems that support IC and counterproliferation activities; and implementation of assessments and programs seeking to understand, assess, and/or diminish the attractiveness of special nuclear material (SNM). Specifically, this position will reside in the Detonation Forensics and Response Group. This group focuses on nuclear detonation fallout prediction for support of improved capabilities in detonation characterization and forensics and provides enhanced tools to support incident response and consequence management.
Director, Biostatistics - CRISPR Therapeutics, Cambridge MA USA
The Director of Biostatistics is responsible for leading, developing and implementing statistical solutions to optimally support all phases of clinical trials and decision making. The successful candidate will function as the lead statistician across clinical development programs, and may supervise study level statistician in-house and at CROs and be accountable for all biostatistics study deliverables. The role will include key trial statistician responsibilities including statistical design of all phases of trials, authoring of SAPs, and conducting just-in-time analyses and data exploration. Industry experience is required; experience with regulatory filings strongly preferred.
Reporting to the Head of Biometrics, the successful candidate will have the opportunity to take a leadership role in the Biometrics function at CRISPR, and must possess the desire and ability to work with study teams up through and including leadership responsibilities.
Manager, Computational Infrastructure - CRISPR Therapeutics, Cambridge MA USA
Develop and maintain the underlying compute infrastructure for processing and analyzing large-scale genomic data. Lead the design, development, and maintenance of bioinformatics software and pipelines. Work with research scientists to convert research prototypes into robust scientific software optimized for performance, scalability, and best coding practices. Train and guide bioinformatics engineers in tools, database architecture, software architecture, analysis methods, and pipelines. Study pertinent literature to identify, evaluate, and incorporate new bioinformatics applications and computer algorithms to improve CRISPR Therapeutics bioinformatics pipelines. Ensure basic quality assurance of data and software generated by designing and implementing test methodologies
Data Science Manager, Intact Lab - Intact Insurance, Toronto ON or Montreal QC or Québec QC CA
Manage a multidisciplinary team in designing, developing and applying Machine Learning models (pricing, telematics program, fraud, process automation, etc.); Help build the industry’s best data science team by keeping our specialists motivated and engaged in an agile environment and helping them grow their careers; Collaborate with partners across the company to develop our AI vision, identify and prioritize AI use cases and implement solutions.
Sr. or Principal IT High Performance Computing Systems Technologist - Raytheon, Tucson AZ USA
Work across IT functions to standardize processes and procedures. Work with Cybersecurity professionals to maintain security compliance requirements. Provide documentation and training on advanced IT HPC processes and procedures. Be a mentor to junior team members. This candidate must be able to obtain a final Department of Defense (DoD) SECRET security clearance and have the ability to obtain required Department of Defense Directive (DoDD) 8140 / 8570 Certification requirements (CompTIA Security+ CE or equivalent certification and an Operating System Certification) within 6 months of hire date.