Research Computing Teams #110, 19 Feb 2022
Hi!
I won’t lie; when I started offboarding myself from my current job a few weeks ago I was a little anxious. I was pretty sure that team members would step up and learn quickly. It felt relatively unlikely that we’d discover gaps that could threaten near-term milestones. One worries nonetheless.
Instead, I’m really impressed with how well everything has gone. If anything the team has come together even closer, rallying to fill such holes as my departing creates. There’ll certainly need to be another hire - the team is losing a person’s worth of effort, after all. But the big picture work, both technical and stakeholder facing, is in extremely capable hands. I’ve felt increasingly superfluous this past week. That’s the best possible outcome.
But that also means that I’ve been surrounded all month with examples of things I could (should!) have been delegating and documenting earlier. Activities I thought only I could perform are in fact somehow getting performed. Things I could have usefully have documented ages ago, are only now finally getting written up. It’s not a disaster, nobody died of it. And yet there were real missed opportunities for team member growth and for me to have spent more time focussing on strategic goals.
That’s regrettable, but I’m not raking myself over the coals for it (much). Managing is like performing research, or building technical solutions. You never get to a point where you’ve learned enough to stop honing your skills. There’s only continued growth, or stagnation. I’m leaving this job, project, and team in better shape than I found it. I’m leaving the role a better manager than when I arrived, too.
With that note (this newsletter will stop being all about me next week, I promise!), on to the roundup!
Managing Teams
The Inverse Relationship of Quality and Quantity of Candidates in the Hiring Funnel - Nadyne Richmond
In an increasingly tough hiring market, we can’t just post a job ad somewhere and hope that we get a flood of applications from amazing candidates. That approach works in faculty and postdoc searches, where the jobs are incredibly scarce and applicants are plentiful. But research computing is different. The world’s awash in technical job openings. A “Fire and forget” approach to getting candidates for jobs isn’t enough.
Richmond points out that the quality of candidates for her job is much higher for the labour-intensive, lower-quantity methods of recruiting candidates - such as candidates like the manager contacted directly, come from internal referrals, or were contacted by trained recruiters.
As we progress in our career we can build a network of people we know would be good hires or who we trust to recommend good hires. But we can only know so many people! And we want to be able to hire from outside of our own network, if only to make sure we’re getting broader perspectives and expertise into the team.
What have you found works for you? Do any readers in academia have any experience with using recruiters to hire research computing and data team members? What other approaches have ben successful?
Managing people - Andreas Klinger
This is a bullet-point list aimed at new managers, and so there’s some basics here - but it’s always useful to review the basics, and we always have new members joining the list (hi!) including new managers.
I like this list because it expresses a lot of ideas that have come up on the list before, but in ways I haven’t read. I wouldn’t necessarily express all of those thoughts this way, but it’s useful to get another take on ideas. Some highlights:
- As a manager, everything is your fault
- You manage processes; you lead people
- Processes are expectations made explicit
- In every discussion/project/problem/situation, it needs to be clear who makes decisions
- Avoid back and forth
- Trust through transparency
- Don’t confuse autonomy and abandonment
I think the last one I chose, “don’t confuse autonomy and abandonment”, is particularly relevant in academic areas of research computing. We tend to expect and grant a large degree of autonomy - which is great! But that doesn’t mean not giving our team members what they need to succeed, including as much context and, yes, guidance and direction as they require. The trick is not to always be giving them more guidance and direction than they require.
8 Tactics to Help Manage Perfectionists at Work - Alexandria Hewko, Fellow Blog
Our community, at the intersection of research and tech, tilts towards perfectionism. Being committed to doing things well is great! And, like all good things, it can be taken to an extreme which becomes dysfunctional.
I think we all can recognize the signs of perfectionism; Hewko suggests an approach to leading a perfectionist on your team. First is (as with all team members) to acknowledge, to yourself and to them, their positive traits, and lean into those strengths where possible. Then, Hewko tells us, it’s necessary to set a bit of direction. Coach them to delegate (and help them avoid micromanaging the delegated task), and gently take control of their priority lists so they’re not rabbit holing down unnecessarily. Build and use lines of communication by making sure to build trust in your one-on-ones so they can share what’s going on, give constructive feedback, and let them know that coming up short is ok. Finally, make sure they’re in the right role and working on the right things for their skills and tendencies.
Product Management and Working with Research Communities
Toolkit for Cross-Disciplinary Workshops - Borhane Blili-Hamelin, Beth M. Duckles, Marie-Ève Monette
Research computing and data work is inherently cross-discipline. In our line of work success requires that at least two fields, the computational and the domain science, come together. In larger collaborations there might be many fields. In those cases it too often falls to the technical team, the ones charged with doing much of the the actual deliverable work, to be the glue connecting the communities. It’s hard work knitting together a team with diverse expertise, requiring skills that fall well out of the realm of the technical.
This work by Blili-Hamelin et al outlines their approach to hosting cross-disciplinary workshops. It could be very useful for large collaboration kickoff meetings too, or exploratory meetings looking for a point of collaboration.
A key point of the report is that the beginnings of cross-disciplinary discussions can feel slow (excruciatingly so in my experience!). But much of the actual output of the collaboration is exactly the development of a common language and shared mental model that allows translation between the groups. Groping towards that shared mental model is the slow initial part. Rushing that phase of discussions undermines the collaboration’s potential, while building and documenting it clearly and strongly will make everything else easier.
But doing that takes some preparation:
We found that cross-disciplinary workshop activities needed to be sequenced and built to make attendees feel comfortable learning new perspectives and empowered to experiment together with them.
They have a whole plan of action in the report, I own’t try to summarize here, but sending out pre-surveys and pre-reads, having ice breakers, lots of questions and curiosity, and brainstorming all play a role. They also have an interesting take on impromptu panel discussions; the “fish bowl” discussions.
Research Software Development
Diffrax - Patrick Kidder
We’ve talked more than once recently about JAX here (#98, #103) and it’s undeniably cool. But in research computing what’s valuable about a package isn’t the tool itself but what it enables.
We’ve seen PDE simulations taking advantage of JAX (#98). Here’s a differential equation solver - ODE, stochastic DE, or controlled DEs that’s built on JAX. It uses the capabilities of that library to expose a coherent API, and to allow fast, JITted ODE solves in Python on CPU, GPU, or Google’s TPUs.
Building for the 99% Developers - Jean Yang
In research computing, it’s easy to watch what happens in the tech industry, the cool tools being used and approaches taken, and feel like we’re in the backwater.
Yang reminds us that the stuff that gets a lot of attention as the cool new technology or process in tech isn’t necessarily super widely adopted even in tech. The approaches that the FAANG or other big companies take are to solve specific problems, in environments with very different tradeoffs than our own. As Yang says, “There is no gold standard development environment”. There are only problems to solve, and a variety of choices of tools with which to solve them. The point is to solve the problem, not to use something cool.
Seamless, Fearless, and Structured Concurrency - Evan Ovadia
This is an interesting article on threading which talks about “fearless” (race-free) and “seamless” (can access all data) concurrency, and connects OpenMP threading, which many of us are familiar with to approaches for concurrency in Rust and Pony, and uses all of those ideas to describe how the authori is implementing concurrency in their own language called Vale. It’s a quick but interesting walk through a few different approaches to threaded concurrency.
Pathways to Enable Open-Source Ecosystems (PEOSE) (22-572) - NSF
Its fantastic to see increasing funder interest in providing funding open source ecosystems. If only the call was for longer than 1 or 2 years and directly funded maintenance. Still, bit by bit..
The goal of the PEOSE program is to fund new OSE managing organizations, each responsible for the creation and maintenance of infrastructure needed for efficient and secure operation of an OSE based around a specific open-source product or class of products. The early and intentional formation of such managing organizations is expected to ensure more secure open-source products, increased coordination of developer contributions, and a more focused route to impactful technologies. … Importantly, the PEOSE program is not intended to fund the development of open-source research products, including tools and artifacts. The PEOSE program is also not intended to fund existing well-resourced open-source communities and ecosystems. Instead, the program aims to fund new managing organizations that catalyze community-driven development and growth of the subject OSEs. The expected outcomes of the PEOSE program are (1) to grow the community of researchers who develop and contribute to OSE efforts, and (2) to enable pathways for the development of collaborative OSEs that could lead to new technology products or services that have broad societal impacts. OSEs can stem from any areas of research supported by NSF.
Research Data Management and Analysis
PipelineDP Brings Google’s Differential-Privacy Library to Python - Patrick Zhang, InfoQ
Google and research institutions don’t have a lot of situations in common, but “having lots of data and wanting to be able to let researchers analyze it with some conditions” is one of them. Research groups who gather health or social sciences data may want to allow other users analysis over the data with some certainty that privacy violations will be rare. Differential Privacy grants that, under certain conditions.
IBM and Microsoft have already released differential privacy libraries; here Zhang tells us about Google andn OpenMined open sourcing their PipelineDP package, which plugs in nicely to Spark as well as being a standalone Python library.
Research Computing Systems
OS upgrade for KLONE - Nam Pho, U Washington Research Computing
Interesting - UWs new cluster, KLONE, was running CentOS 7, and with the CentOS 8 changes had to figure out what to do. They ended up going with Rocky 8 over some holiday downtime, and it’s been running for a month or so in GA. It sounds like there were a few re-compiling issues for users code but it went pretty smoothly. Love to see a migration story that’s boring from the user point of view. Do we have anyone from UWRC here? I’d be very curious to get the “behind the scenes” reporting from this migration.
Emerging Technologies and Practices
What Will AMD Do With Programmable Logic and other Xilinx IP? - Timothy Prickett Morgan, The Next Platform
So both Intel and AMD now have significant FPGA investment - Intel with the 2015 purchase of Altera, and now AMD’s purchase of Xilinx.
I won’t comment too much on this, except that it’s potentially of interest to the community and TPM’s take is always very informed by deep experience with the technical computing market.
I don’t think it’s controversial for me to say FPGAs haven’t really found a niche as standalone general purpose accelerators. But there’s been real success targeting particular niches like genomics. FPGAs have a long history in telecom and now smart network-cards suggesting they’d be valuable in datacenter scale networking. And TPM brings up Intel’s previous experience with “Frankensockets”. That particular experiment wasn’t super successful, but that doesn’t mean the idea of blending the programmability of a general purpose core with on-socket programmable logic, by either vendor, couldn’t be really interesting.
Intel Targets DAOS Object Storage at more than HPC - Daniel Robinson, The Next Platform
Garage: An Open-Source Distributed Storage Service you can Self-Host to Fulfill Many Needs - Garage Team
Two very different kinds of object storage for different kinds of research computing - HPC and data resiliency - needs both came up this week.
I’m surprised to search the archives and see that DAOS hasn’t come up before. DAOS is the open-source Distributed Asynchronous Object Storage that came out of the US DOE labs (LANL, I think?) and is being commercialized by Intel. The idea is to take advantage of the capabilities of Open Fabric Interface and Persistent Memory/NVMe to provide distributed data via a number of different models, from fairly classic object store “blobs” or MPI-IO to directly exposing distributed arrays and key-value stores.
POSIX filesystems on individual workstations are amazing pieces of technology. They work so well that most people can safely avoid ever thinking about them - people don’t restart their laptops because their filesystem is acting wonky. But they’ve never scaled to large clusters, and big scientific workloads don’t really need them - e.g. MPI-IO doesn’t require (or even really benefit from) a POSIX filesystem on the back end. Something that can help nudge scientific workloads towards other storage approaches, like object store but also like array or key/value for structured data that needs slicing and dicing, could be really cool.
When Aurora finally gets up and running, DAOS will be managing 220 PB of data, so we’ll see how it works under real load. And from there, Robinson tells us that Intel plans to target other areas such as AI. It’s not hard to imagine that data storage could be the first place we really see “HPC/AI convergence”.
For other uses, Garage is an easily-self-hosted geo-distributed object store more for web or data hosting in a highly-available way. It’s very decidedly not for HPC applications. It has basic S3 compatibility, so it’s relatively straightforward to tie into other applications. For teams looking to make data assets available with self-hosted solutions, this might be a choice worth looking into; has anyone played with it? I know there are Nextcloud providers on the list.
Random
Oh this is great news - arXiv papers are getting DOIs.
Interesting online book on implementing Algorithms for Modern Hardware, covering in significant depth instruction-level parallelism, profiling, numerical arithmetic, external memory (as you know, I think this is going to get increasingly important), cache, SIMD parallelism, and some interesting case studies. I really like the section on cache associativity, which is pretty unintuitive and can lead to surprising performance bugs.
Google Slides is Trolling You.
Using Mermaid syntax in Markdown on GitHub to get rendered flowcharts or graphs or sequence diagrams or gantt charts is now live. We’ve already started using this. No more firing up Keynote or draw.io just to draw a few boxes with arrows!
Write your own sudo. CR/LF: LF as a way to pass some needed time on teletypes while the carriage literally returned.
Helpful ssh scripts like ssh-ping, ssh-certinfo, and more: ssh-tools.
Perl code that is only syntactically correct on Fridays. because perl is a cursed language.
It’s easy to forget that some very basic things in computing are genuinely hard - e.g. there was very active work improving float-as-a-string algorithms for printing floating point numbers up to the 90s, and even figuring out the optimal way to draw circles on a pixelated screen is nontrivial.
I’m a sucker for useful interactive explanations of things, it’s a great example of the power of computing for insight. Here’s an interactive article explaining GPS.
If you do technical computing and you want to try your hand at functional programming, OCaml is a really solid choice. OCaml from the very beginning.
Universal Paperclips: Help the AI running a paperclip factory achieve sentience.
I’m really fascinated by these new open-source no-code/low-code tools for building internal applications, like Appsmith. Is anyone using things like this for internal or researcher-facing applications? How’s it going?
That’s it…
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
About This Newsletter
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations have taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.
Jobs Leading Research Computing Teams
This week’s new-listing highlights are below; the full listing of 146 jobs is, as ever, available on the job board.
Senior Manager, AMD Research - AMD, Bellevue WA USA
AMD Research is an entrepreneurial research organization with a superb track record of driving research innovations into AMD products. We generate innovations in processor architectures, graphics, interconnects, memory technologies, and software to create new business opportunities for AMD.
AMD Research seeks a passionate, collaborative leader with strong technical skills and the initiative to motivate an expert team. Technical expertise in one or more of the following categories: AI and Machine Learning, Data Science, Big Data and Big Compute, Graphics and Gaming, Software: platforms, infrastructure, runtimes, Architectures: system and micro-architectures, Cloud computing, Scientific Computing
Director of Computing Facilities for Teaching and Research - University Of Massachusetts at Amherst, Amherst MA USA
The Director of Computing Facilities for Teaching and Research fills a mission-critical IT role in the Department of Mathematics and Statistics, a department with over 70 faculty, 65 graduate teaching assistants and 30 unsupported graduate students, 11 staff members, over 900 majors, and over 15,000 students enrolling in its courses each year.
The incumbent leads the departmental IT team and has overall responsibility for the effective management, administration and operation of all facets of computing in the department. The incumbent also serves as the department’s lead systems architect and programmer, lead network and security analyst, and both provides and coordinates tier 2 and off-hours systems support. The incumbent is responsible for the development and programming of security, authentication, and other applications as necessary for the department’s research, teaching, and administrative missions.
Senior Bioinformatics Scientist - Tufts Technology Services - Tufts University, Somerville MA USA or remote
Reporting to the Director of Research Technology, the Senior Bioinformatics Scientist will work closely with members of TTS Research Technology to lead bioinformatics services for Tufts University faculty and students. Key responsibilities include providing consulting services on research project and grants to Tufts biomedical researchers, developing Omics workflows and curating reference data for the support of bioinformatics research on the Tufts High Performance Compute Cluster (HPC) and Tufts Galaxy Server, supervising Bioinformatics Specialist and internship positions within TTS Research Technology, providing training in analysis and visualization methods for Omics data in the form of workshops and modules in semester courses and organizing university-wide groups and events for the bioinformatics community. The ideal candidate is enthusiastic about both research and teaching and enjoys working in a collaborative environment.
Principal Software Engineer, Data Architecture - Advanced Agrilytics, Remote US or CA
Advanced Agrilytics is an agronomic tech company that provides farmers with actionable, customized strategies to deliver sustainable outcomes at the sub-acre level. Our independent hands-on approach combines field specific data with agronomic research to meet farmers at the cross-section of technology and a personal agronomist. We are looking to hire an experienced Principal Software Engineer with a focus on data architecture to organize and scale our data needs across the company. The Principal Software Engineer will play a key role in understanding our current data landscape and implementing a strategy to scale with our company growth. If you are passionate about data and applying your craft in the agricultural space, we encourage your application.
Developer Relations Manager, Quantum Computing - NVIDIA, New York NY or remote VA USA
The Quantum Computing organization is a small, strong, and visible group both inside and outside of NVIDIA while Quantum Information Science is an exciting area to drive strategy. We need a self-starting leader to continue to grow this area. Do you have the rare blend of both technical and relationship skills? Are you passionate about groundbreaking technology? If so, we would love to learn more about you!
Science Manager – Computational Physics Group - Atomic Weapons Establishment, Reading UK
AWE is currently recruiting for a Science Manager to join our Computational Physics Group to lead a technical team to deliver against AWE Programmes. This is a great opportunity for any experienced Science Managers who want to work in a unique organisation whilst also enjoying the benefits of having every other Friday off!
Research Science Manager - Gesture Learning Interfaces, Reality Labs Research - Oculus, Toronto ON CA
As a Research Science Manager at Reality Labs Research, you will oversee research into novel gesture learning interfaces that help users to become experts in the advanced interactions that our cutting edge technologies enable. You’ll be responsible for developing and supporting a team of research scientists and leading the technical direction of the research. In our collaborative environment, you’ll partner with other researchers, designers, and engineers to create the technology that makes AR pervasive and universal.
Director of Research Advanced Computing Services - University of Oregon, Eugene OR USA
Reporting to the Associate CIO for Technology Infrastructure, the Director of Research Advanced Computing Services (RACS) is responsible for providing administrative leadership, fiscal leadership, and oversight for RACS, including the High-Performance Computing (HPC) Core Facility. In addition to overseeing HPC, the Director is responsible for building out mid-tier services that position RACS as a full-service destination for research information technology services. The Director is also responsible for aligning the RACS workforce with university objectives for diversity, equity, and inclusion, as well as increasing diversity among faculty customer base by lowering barriers to entry to HPC and performing outreach to a broad range of faculty from multiple academic disciplines across campus.
Director, Research Computing and Digital Scholarship - Pomona College, Claremont CA USA
Collaborate with faculty to provide expert support for research and teaching in areas such as text-mining, data visualization, network analysis, virtual and augmented reality, GIS mapping, 3D modeling, mathematical, statistical and graphing software used in the sciences, social sciences, and humanities.
Associate CIO - Advanced Research Computing - New Jersey Institute of Technology, Newark NJ USA
The Associate CIO - Research Computing will provide leadership in advancing NJIT’s research by architecting platforms that fuel strategic research growth at this premier technology institute. A researcher at heart, with deep expertise in networking, advanced research computing, and data sciences, this individual will envision the next-generation of research computing at NJIT by fostering collaboration across NJIT’s diverse technology research faculty. This position will lead engagement and outreach with the university’s research faculty, expand collaboration with national and international research organizations, and ensure outstanding service to the university’s research faculty. The position will build a team to manage the infrastructure, provide training, and related services to create an enabling experience for the university’s research faculty. The Associate Vice Provost of Research Computing will lead the successful journey of modernizing legacy platforms leveraging cloud, enhance governance and decision making, and establish predictable and consistent services to the university’s research community. All duties will need to be performed in accordance with university policies, procedures and core values
Senior Research Software Engineer, Centre for Digital Humanities - Princeton University, Princeton NJ USA
As Senior RSE, you will be an integral member of CDH’s work to design and implement high-quality, sustainable software to further innovative research in the humanities and the data-driven or computational sciences. You will work closely with CDH Lead Developer and other research software team members, and will collaborate with faculty, student, and campus partners to help translate research priorities into software needs, including analyzing data, implementing models or simulations, and developing software modules or tools. You’ll have the opportunity for technical leadership on collaborations that align with your expertise, and contribute to research articles, presentations, and course modules.
Director of Advanced Computing - University of Bristol, Bristol UK
The Director of Advanced Computing is the strategic owner for the University’s Advanced Computing and Research Services, setting the agenda and direction for high-performance computing, research data storage and research software engineering. The role provides a rare and exciting opportunity for the successful applicant to shape the future of our Advanced Computing Research Centre and to truly make a difference in the research field. Combining complex technical knowledge with strategic, leadership and influencing skills, the post is a wonderful opportunity for someone looking for their next position within Advanced Computing.
Scientific Manager in Data Assimilation - UK Met Office, Exeter UK
The Met Office has embarked on an exciting and ambitious programme, to redesign its operational forecasting capability for future high-performance computing (HPC) platforms. The current Met Office Data Assimilation (DA) system is not expected to be efficient across the range of possible alternative HPC solutions likely to be available on the 2025 timescale. The new software is being developed within the Joint Effort for Data assimilation Integration (JEDI) framework from the Joint Center for Satellite Data Assimilation (JCSDA).
Senior Scientific Manager, The Centre for Applied Genomics - SickKids, Toronto ON CA
We are recruiting for a Senior Scientific Manager to work within The Centre for Applied Genomics (TCAG). Reporting to the Director of TCAG, the successful candidate will join a multidisciplinary team providing genomic experimentation and analysis.
Senior/Principal Developer Advocate for HPC and Batch - AWS, Boston MA USA or remote USA
Ideally, you’re someone who comes from a software and/or infrastructure background and have worked in research or engineering environments with HPC clusters, but has an unmet need to be be involved in guiding the broader industry and solving more than just one problem at a time. You have immediate credibility with Startups, Developers, Architects and IT Managers. You love to share your passion with others and exhibit good judgment in selecting strategic opportunities to do so.
Group Leader / Scientist Founder: AI Enabled Design and Optimization of Antibodies - AION Labs, Rehovot IL
AION Labs the Israel-based alliance of AstraZeneca, Merck, Pfizer, Teva, Israel Biotech Fund and Amazon Web Services (AWS), powered by BioMed X is supporting the establishment of a new, fully funded startup company in the field of: AI Enabled Design and Optimization of Antibodies for Targeted Therapies. The group leader / startup founder position is intended to suit candidates who would like to develop themselves towards an entrepreneurship path, and who typically have a PhD plus four to eight years of additional research experience in academia and/or industry.
Product Manager, Quantum Computing - NVIDIA, Santa Clara CA USA or remote VA USA
We are looking for a technical, user-focused Product Manager to build exciting new products to enable researchers in the area of Quantum Computing. As a product manager for the product you will establish a vision, gather requirements, set product roadmaps and provide go-to-market strategies. You will work with software developers, hardware developers and research scientists to enable scientific breakthroughs in this exciting area. The Quantum Computing team is a small, strong, and visible group both inside and outside of NVIDIA while Quantum Information Science is an exciting area to drive strategy. We need a self-starting leader to continue to grow this area. Do you have the rare blend of both technical and product skills? Are you passionate about groundbreaking technology? If so, we would love to learn more about you!