Research Computing Teams Link Roundup, 12 June 2020
Hi!
Sorry for the late roundup this week - the week kind of got away from me.
But there’s been lots to talk about, so on we go!
Managing Teams
Architecture Jams: a Collaborative Way of Designing Software - Gergely Orosz
Proposals and Braintrusts - Nathan Broslawsky
These two articles both describe approaches to usefully open up architectural or other proposals to input from a group. The first, an “Architecture Jam”, is sort of half-brainstorming, half-architectural white boarding session; it can work remotely, but is definitely synchronous. The second is more asynchronous - writing up a proposal, and sending it off to a group of people whose job is, explicitly, to improve the proposal.
Either could work in our context. There are two keys for the architecture jams. First is to have a plausible but ideally (IMHO) unfinished proposal to give the group something concrete to kick off discussion. Second is to facilitate discussion well to make sure people are contributing (some good pointers are here, which is worth reading in its own right).
The more formal proposal and request for comments has the advantage that more of it can be asynchronous; but you’d really need to be sure of the starting point, and to ensure that those entrusted with improvements feel like they can propose significant changes.
On anti-racist management practices - Rachel Hands, Managing Equitable, Effective Teams
Bias doesn’t start with skin color - Chelsea Troy
Rachel Hands’ article is a nice list of resources and describes the mindset one has to be in to make use of the resources. Given how hard it is to make changes in our organization, we certainly don’t feel like we hold power, but as managers we do.
Chelsea Troy’s article is useful to read with Hands’: it’s easy to think we’re above being racist because we don’t recoil from people with black skin or think in crude stereotypes, but the more collectively damaging consequences come from behaviours and reactions that are more subtle than that.
In academic and adjacent circles, we like to think that we’re above needing such things, but we’re really not. A glance at #BlackInStem is enough to confirm that, but this week we’ve also seen a pretty scathing internal report about white supremacist bias leak from Oxford, and a Canadian prof managed to get a pretty racist and sexist article (arguing that diversity of workforce was making chemistry worse) published in a chemistry journal. And tech, of course, is a dumpster fire. So in academic tech we have some work to do.
Product Management and Working with Research Communities
Ten Simple Rules for Starting (and Sustaining) an Academic Data Science Initiative - Micaela S. Parker, Arlyn E. Burgess, and Philip E. Bourne
Starting up any big multi-stakeholder, interdisciplinary research computing effort is fraught and is more about people than technology. Three authors who have been through it bring you a list of ten rules. Four stand out for this audience:
- Don’t Try to Own Everything - you’re trying to build partnerships, so having your group do everything isn’t helpful; relatedly,
- Leverage Core Service Groups - in particular, “Libraries have been in the information business for centuries”;
- Establish a Set of Guiding Principles - What’s in scope, and what’s out?; finally
- Hire Staff, and Support Them - probably high on our minds but less so at the VPR-type layer; it isn’t feasible to do something big by having a lot of people working on it off the corner of their desks.
Bioinformatics challenges in multidisciplinary research - Mina Ali
The reason I prefer to talk about Research Computing as a whole rather than research software development/systems/curated databases/…, or breaking things out into bioinformatics/data science/simulations/… , is that the same issues come up over and over again.
We’ve had articles in the roundup before about setting up a data science team in an organization and the challenges of having it be its own thing (and thus isolated) vs having team members scattered and individually embedded (and thus missing the growth opportunity of working on multiple projects with colleagues). Mina Ali’s article talks about exactly the same issue with bioinformaticians, because it’s exactly the same problem.
Balancing the need for having some kind of embeddedness in problems with being part of a community is tricky, but it has to be done for the work and the individuals to succeed.
Research Software Development
Evidence for the importance of research software - Michelle Barker, Daniel S. Katz, Alejandra Gonzalez-Beltran
A nice list of papers, talks, and other resources on the topic of the impact of research software. There’s also a continually updated Zotero group library and Github repository.
Lessons learned in a decade of research software engineering GPU applications - Ben van Werkhoven, Willem Jan Palenstijn, and Alessio Sclocco
This is an interesting paper written by a team that has worked on GPU applications for digital forensics, image analysis, physical simulation, analysis pipelines, and geospatial databases. There are technical discussions in here about the use of host and device memory, libraries for various languages, the fact that different parts of the code will need different approaches, and the usefulness of roofline plots for understanding what is constraining performance at any given point.
More interesting to me are the higher-level observations:
- The difficulties of dealing with essentially legacy code, including the need to write tests before doing any porting
- GPUs often allow going to larger scales, but that often means that other parts of the overall pipelines break down and need work
- The need for managing researcher expectations - results won’t be bit-for-bit the same
- The need to communicate performance results carefully (e.g. “It’s 100x faster on GPU!” “…Well, the thing about your old code was…”)
It’s a short easy read but the higher-level discussions carry over well beyond the particular case of working with GPUs.
A Look at Chapel, D, and Julia Using Kernel Matrix Calculations - Chibisi Chima-Okereke
An interesting view of using Chapel, D and Julia for some fairly basic numerical operations. It covers both the performance and the ergonomics of the languages, and to an extent also the responsiveness of their communities.
On the performance side, I don’t like rules like this in these sorts of comparisons:
We disallow any use of BLAS
It makes the performance comparisons unrealistic. There’s no shortage of interesting small computational problems that don’t have highly optimized libraries available that could be chosen instead.
That pet peeve of mine aside, I think the article is a good overview of what it takes to get performant code in those languages; the author’s sympathies lie with D (it’s a D language blog this is posted on, after all) but Dr Chibisi Chima-Okereke is evenhanded about the strengths and shortcomings of each of the languages and the community support for each.
There Are No Bugs, Just TODOs - Lukáš Linhart
Defects are not the fault of programmers - Hillel Wayne
These are two useful articles for clarifying the purposes of issue trackers — they’re not about blame or bad code, they’re about todo lists. They make different points; the first is that anything that doesn’t help the issue tracker be good for generating todo lists is something that can be usefully stripped out. The second is that bugs are about surprising interactions as often as they are about bad code (I think this is especially true in research software which is often quite subtle).
Research Computing Systems
Effort to Fund National Research Cloud for AI Advances - AI Trends
New ‘supercomputer’ to aid research for 90 firms - Shawn Pogatchnik, Irish Independent
I’m really interested in the accelerating drive to create data science/AI cloud type infrastructure, often supporting private sector users, outside of the usual academic research computing structures. The AI Trends article talks about a US bill supported by a bipartisan coalition of Representatives and Senators; Shawn Pogatchnik’s article is about a modest-sized cluster procured by Ireland’s National Centre for Applied Data Analytics and Artificial Intelligence for member companies. The trend for these sorts of systems seems to be more cloudy and less HPCy, both in business model and architectures.
Scientific Computing World is running a survey on HPC for Science, with questions for both users and systems managers.
Emerging Data & Infrastructure Tools
Getting machine learning to production - Vikki Boykis
The Ultimate Guide to Deploying Machine Learning Models - Luigi Patruno, ML In Production
More research computing users are not only using research computing systems to train models, but to serve predictions based on those (frequently updated) models. That puts interesting demands on our teams. Putting the models into production combines the challenges of making curated data sets available and of running services in production.
Vikki Boykis, who has a great newsletter you should consider, described the start to finish of a simple, troll-y ML project which generates fake VC-written Medium think pieces. The machine learning part is just a tiny part of the whole process, and this relatively short read gives a good overview of what is involved. Luigi Patruno’s multi-blog-post writeup is more comprehensive, but digging into those details is a lot easier after seeing a small worked example like Boykis’.
MicroK8s, a lightweight but full Kubernetes, is now natively available on macOS and Windows for teams looking for developer or CI Kubernetes setups or for those who want to learn Kubernetes beyond what minikube can do.
Calls for Proposals
SC20 Early Career program - Applications Due 31 July 2020
The Early Career program provides a one-day series of special sessions for early-career researchers, educators, and technical professionals. This includes academic, industry, and laboratory staff and post-docs within the first five years of a permanent position. The Early Career program is available to participants by application only. The program aims to help participants secure a better understanding of the issues and challenges faced while navigating a successful research career. The program will include engaging interactive sessions aimed at helping participants develop their professional skills, as well as a strategic vision for their future.
Random
Maybe of interest to some here - a Jupyter kernel for SQLite.
This twitter thread has a lot of nice responses with templates for project strategy, planning, and roadmap documents.
Moving away from “master/slave” and “whitelist/blacklist” type terms in our projects is a good and useful thing to do. It’s really easy to rename git default branches - “git branch -m master <newname>” and “git push -u origin <newname>” is all it takes. And given how many different ways projects use their default branch, the word master doesn’t even mean anything useful. For your project, is it the trunk? production? prerelease? Any of those are better and more informative names.
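For anyone who wants to try the rename before touching a real project, here is a minimal sketch that runs the two commands against a throwaway repository. The temporary paths, the demo identity, and the new name trunk are all assumptions made up for illustration; a local bare repository stands in for the hosted remote.

```shell
#!/bin/sh
set -eu

# Throwaway working area so nothing real is touched (paths are illustrative).
tmp=$(mktemp -d)

# A bare repository stands in for the hosted remote.
git init -q --bare "$tmp/remote.git"

# A local repository whose first branch is explicitly named master.
git init -q "$tmp/project"
cd "$tmp/project"
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git remote add origin "$tmp/remote.git"
git push -q -u origin master

# The rename itself: move the local branch, then publish and track it.
new_name=trunk   # hypothetical choice; use whatever fits the project
git branch -m master "$new_name"
git push -q -u origin "$new_name"

# Report where things ended up.
current_branch=$(git rev-parse --abbrev-ref HEAD)
upstream=$(git rev-parse --abbrev-ref "$new_name@{upstream}")
echo "current branch: $current_branch"
echo "tracking:       $upstream"
```

Note that this leaves the old master branch on the remote. Once nothing depends on it, “git push origin --delete master” removes it; on a hosted service the default-branch setting will likely need to be changed there first.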
If you use nginx in your shop you likely already know this, but this week I learned about topngx and its predecessor ngxtop which provide top-like access to nginx logs.
When I was doing fluid dynamics simulations full time, one of my favourite fun facts was that F1 racing limits the amount of aerodynamic simulations teams can do because too many simulations would be an unfair advantage. Those rules are about to get tighter.
Taichi is a language for fast interactive graphics embedded in Python. The examples are super cool.
A cursed C implementation of tic-tac-toe, with all logic and even input handling in the printf statement.
That’s it…
And that’s it for another week.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
Jobs Leading Research Computing Teams
I’m reposting the first job, which I never do, but it’s to work with our team on a great project – so if it looks like you, give it some thought!
Bioinformatics Team Lead - Canada’s Michael Smith Genomics Sciences Centre, Vancouver BC CA
The GSC is currently looking for a senior bioinformatician to develop approaches to analyzing and sharing “big data”. This role is instrumental in the development of the CFI-funded cyberinfrastructure project, CanDIG—a state-of-the-art data sharing platform contributing to the development of an international effort to facilitate information exchange as part of the Global Alliance for Genomics & Health (GA4GH). CanDIG also supports large-scale provincial and national data sharing projects, including BC Cancer’s Personalized OncoGenomics program, the Terry Fox Research Institute (TFRI) PROFYLE project and the TFRI-led Marathon of Hope Cancer Centres Network—a major federal initiative to accelerate the adoption of precision medicine for cancer in Canada.
Associate Vice President, Research Computing - University of Chicago, Chicago IL USA
The Associate Vice President for Research Computing is a position within the RNL that provides strategic direction and leadership for UChicago research computing endeavors, establishes a vision for research cyberinfrastructure, defines and delivers services relevant to the needs of the UChicago research community, and advances UChicago Research Computing.
Sr Computer Scientist/Research Engineer, Transportation Management Systems - , San Antonio TX USA
Serve as a software developer and researcher on a team developing software solutions for programs making a positive impact on society in Intelligent Transportation Systems (ITS) technology areas such as Integrated Corridor Management Systems (ICMS), Decision Support Systems (DSS), Advanced Traffic Management Systems (ATMS), Smart Cities, and Data Analytics Platforms; utilize advanced data science skills and techniques in state-of-the-art software development environments; perform in all phases of the development lifecycle, including requirements definition, software/systems design, implementation, testing, and integration; initiate advanced research and development programs; interact with clients and make technical presentations. Work will include assisting with project management and team leadership
Senior Manager, Software Development - Kindred AI, Toronto ON CA
Lead architecture and development of infrastructure related to reliable productization processes such as system design, code design, automated testing, deployment and release of complex HW/SW systems; Communicate with Senior Management and Cross-Functional Teams; Align the releases with existing software team processes.
Senior Manager, Machine Learning - Bell, Toronto ON CA
We are looking for a Senior ML Engineering Manager who has experience productionizing, maintaining, and optimizing machine learning applications. The successful candidate will manage a team of ML Engineers, and help define, build, and maintain a machine-learning-as-a-service environment at Bell.
Director, Data Management - GS1 Canada, Toronto ON CA
Lead a team of data analysts to create, manage and maintain GS1 Canada’s data center of excellence, including documenting and maintaining our data assets, managing our attribute mapping capability and providing governance over our backend data systems; Work very closely with Business and Enterprise Architecture teams and help build data strategy; Support Data COE in translating strategic requirements into usable enterprise data architectures, which may include data architecture