I’ve had to have a few different difficult discussions around our project this week, and while they were exhausting it’s been great to clear the air. And as far as I can tell they’ve strengthened rather than weakened the working relationships.
I also sat through a meeting with a peer who essentially ran a denial-of-service attack on the meeting through his need to talk at length about anything even tangentially related to the topics at hand. It (almost) killed any chance that the meeting would accomplish anything. I’ve been trying to push myself to talk less and ask questions more during conversations; some days are better than others, but that experience has encouraged me to redouble my efforts.
It’s been a long seven days for a number of us, for a lot of reasons. I hope the week has been good to you and your team.
And on to the link roundup.
Hogan, taking lessons from her mother’s pastoral care when she was growing up, shares some humane steps you can take if someone unexpectedly shares grief with you over a significant tragedy in their lives. These steps are humane for both the team member and you. The steps are:
Starting a new manager relationship - Sally Lait
A lot of us in research computing become managers by being promoted within an organization, so we sometimes have the advantage (and disadvantage..) of an extended handover process between the previous manager and you.
Lait here describes her preferred method for taking over a team in such a situation:
Asynchronous Meetings: Everything You Need to Know - Fellow App
As we get more and more comfortable with distributed teams, there’s increasing interest in written asynchronous team communication. It has the advantage of retaining a record and allowing people to contribute on their own schedule. Some things are hard to do asynchronously - it’s hard to imagine an asynchronous one-on-one being successful - but some, like status updates, are quite easy.
We know (from, for instance, open source) that complex decisions can be made with exclusively asynchronous communication; it’s even possible to set up asynchronous meetings after a fashion - circulating an agenda with a deadline for items, having everyone contribute notes and discussion by a deadline, and proceeding with decisions however you usually handle them.
There are definitely going to be discussions - such as one I had today where I badly misunderstood a fundamental point and didn’t realize it - where the high-bandwidth back-and-forth of synchronous discussion unblocks things much more quickly than emails or shared documents. But a lot of routine communications can be done in writing and asynchronously. Getting used to working more in that mode is going to be something of a superpower for recruiting more distant employees, and teams who master it will also have a huge advantage in international collaborations.
Remote Onboarding Changes the New Hire Experience - Shane Hastie, InfoQ
More and more of our hires are not going to be working in the same space as us, and that makes onboarding - something a lot of research computing teams are kind of sloppy about - even more important to do well. Hastie brings together a couple of recent articles on the topic which encourage:
And some specific directions:
Research Computing Infrastructure and Researcher Engagement: Notes from Neuroscience - Bose, Antoniades & Pellman, PEARC ‘20
This is a very interesting conference paper from PEARC 2020 describing a discipline-focused research computing group set up to support a (then) newly formed institute of neuroscience labs at Columbia. The goal was to focus on supplementing existing services by providing local infrastructure and training, and to facilitate partnering across labs.
So the group had a very clear focus, which was a huge advantage, and in addition prioritized a focus on tasks and services, as well as automation and self-service where possible. Those focusses meant they could communicate very clearly with labs around needs, speaking about tasks and services rather than implementation details, and it meant they could provide “bursting” to commercial clouds, all with a very small team. Future directions include assisting with software development, software consulting, and improving scheduling, monitoring, and visualization of existing workflows.
It’s really great to see what can be accomplished with a small team that has a laser-focus, and I’d love to hear more about how they developed their initial service catalog and built support for self-service and automation.
Towards making formal methods normal: meeting developers where they are - Alastair Reid, Luke Church, Shaked Flur, Sarah de Haas, Maritza Johnson and Ben Laurie
As mentioned before, I’m quite impressed with how tooling like formal methods is advancing, and I’m optimistic about its use for research software, especially for subtle kernels of methods.
This is an overview of where one group sees formal methods going, and if the topic interests you it’s worth reading for that alone; but it’s also worth reading as a product strategic plan:
A set of Common Software Quality Assurance Baseline Criteria for Research Projects - Orviz, Lopez, Duma, David, Gomez, and Donvito
Coming out of the EOSC Synergy effort, an extensive checklist of criteria for “production strength” research code, to be e.g. deployed as a service to communities in the INDIGO Data Cloud. The criteria are broken down into categories:
In most areas the actual recommendations aren’t that opinionated - e.g. processes must exist and be documented but they aren’t generally specified. This makes sense for a field as broad as research computing.
I haven’t seen such a comprehensive list before; this is a good starting point for a conversation about any mature research software product.
Here’s a controversial take - in research computing we tend to want to use Jupyter Notebooks + binder or RShiny apps to provide interactive calculations, even in situations where we could work faster and communicate more effectively with Google Sheets.
CIRES’s COVID-19 airborne transmission tool in Google Sheets is a nice example - it’s modelling functionality that’s only expected to be relevant for a couple years, people can trivially copy it and use it on their own, and the sort of decision makers who can set policies for institutions or jurisdictions are probably very familiar with and trust spreadsheets.
And Google Sheets is actually pretty good for pulling data in from other sources and working with it, as Pot’s article points out with IMPORTHTML and IMPORTFEED.
Interview with Meinte Boersma on Domain-Specific Languages - Federico Tomassetti
Boersma has a book coming out on Domain Specific Languages (DSLs). DSLs are starting to be seen more commonly in research software, as a way to make it easier to write code for a restricted but still rich problem. You could think of FEniCS as a DSL for finite element equation solving, or for a more explicit example Julia’s DifferentialEquations package.
Tomassetti’s interview with Boersma covers when a DSL is and isn’t a good solution to a problem, and also the process of writing a book.
Running HPC workloads with Red Hat OpenShift Using MPI and Lustre Filesystem - David Gray, RedHat OpenShift
There’s increasing interest in running research computing workloads in Kubernetes environments. This blog post gives a high-level walkthrough on running GROMACS with MPI in OpenShift in particular.
One area that used to be complex for HPC-type workflows in Kubernetes was access to high-performance POSIX file systems. Operators - Kubernetes’ way of encapsulating the complex operational elements of a workload - now exist for Lustre and MPI jobs, making that easier.
What’s interesting is that the FS is actually the easy part; for decent network performance, you currently have to bust out of the cloud-native SDN. It’s interesting to see the different reactions to that in different communities - in cloud-native land this is treated as being deeply suspect, as the software defined networking is an important security consideration, while the HPC world takes a much more YOLO approach to such considerations.
Relatedly, there’s a new (I think) best practices guide for running MPI jobs on Google Cloud.
3rd OpenMP Users Developer Conference - 1-2 Dec, Online, Free
Free workshop on OpenMP; a full-day hands-on tutorial on the 1st at EU/UK-friendly times covering the basics and OpenMP for GPUs, followed by talks in the UK afternoon on the 2nd, covering code generation, load balancing, ARM, and the OpenMP roadmap.
For those involved in doing or supporting computational fluid dynamics, Van Dyke’s 1982 “An Album of Fluid Motion” is now available for free online. This is a lovely and beautiful book for developing intuition about fluid flow, especially (but not exclusively) incompressible flows.
The pinnacle of human achievement in wordprocessing software, WordPerfect for DOS, has been updated. Reveal-codes forever.
Free preview of an Apress book on Data Parallel C++ covering C++ and SYCL.
A HOWTO on routinely backing up your emails from an IMAP server.
How to set up secure SSH via a public URL through a firewall with nginx, Let’s Encrypt, and SSH reverse tunnelling.
RCEs exist for git-lfs on Windows - best get patched if your team or users are using that combination.
Syncious, an HPC-specific cloud provider.
Swift is an interesting language but it hasn’t yet gotten (or, really, earned) much traction in research computing. With really excellent numerics work going on, and now a pretty sophisticated looking roadmap for concurrency (not parallelism, to be clear, but concurrency) I wonder if that might slowly start changing.
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Highlights below; full listing available on the job board.
Scientific Data Management Architect - Science and Technology Facilities Council, Didcot UK
CLF is now building a new laser centre - the Extreme Photonics Applications Centre (EPAC). Your main role will be to work with a team of scientists and engineers to design and deliver the data management system of EPAC. You will join the EPAC team in CLF, working with experts from Diamond Light Source and other STFC departments such as Scientific Computing and ISIS. Depending on your expertise, you will also have the opportunity to lead parts of the data management system. This is an excellent opportunity for an experienced or emerging data scientist/engineer to develop further: the successful candidate will have a stimulating work environment with exposure to various cutting-edge technologies.
Programme Manager - Supercomputing Wales, Cardiff UK
The role holder will be responsible for identifying, planning and coordinating a set of interdependent activities and projects within the Supercomputing Wales programme. This specific appointment has arisen due to Cardiff University’s leadership of a Welsh university consortium to deliver the five-year programme, along with Swansea University, Bangor University, and Aberystwyth University. Part-funded by the European Regional Development Fund through the Welsh Government, the £15 million programme will enable Wales to compete globally for research and innovation that requires state-of-the-art computing facilities to simulate and solve complex scientific problems. The programme includes investment in two upgraded supercomputer hubs at Cardiff and Swansea and recruitment of a new group of Research Software Engineers to develop customised software that harnesses the power of the facilities to perform multiple computational tasks simultaneously at very high speeds.
Senior Research Computing Applications and Data Specialist - Softworld, Boston MA USA
Master’s degree with 5-8 years of experience or an equivalent combination of education and experience in the field of Engineering, Computer science, Statistics, or related field. 3-5 years of professional experience in developing statistical data analysis using Python or R. Proven understanding and professional experience applying machine learning/deep learning concepts and techniques such as random forest, support vector machines, RNNs, CNNs, LSTM, etc. Hands-on experience in installing and programming in common frameworks such as scikit-learn, TensorFlow, Keras, Theano, Caffe, etc. Demonstrated proficiency in multiple programming languages (Python, MATLAB, R, and C) and the ability to quickly learn new programming languages and tools as required. GPU programming experience (CUDA or OpenCL) highly desired. Experience utilizing and scripting for Linux HPC clusters. Strong analytical skills required with an ability to manage multiple projects and deliverables.
Senior Research Software Developer - University College London, London UK
Senior Data Scientist - Loblaws, Brampton ON Canada
The Data, Insights & Analytics team at Loblaw is hiring a Sr. Data Scientist (Finance Analytics) with experience working with modern data science tools and technology and possessing a deep understanding of data mining and analytical techniques. A track record for driving business insights out of vast quantities of data and presenting to senior management (especially within a Finance context) is a must. Reporting to the Senior Director, Data, Insights & Analytics, the incumbent will be responsible for leading problem solving, conducting analyses and developing production-ready models across various parts of the Loblaw enterprise. We’re looking for a critical thinker capable of creating data driven solutions to help transform a whole host of financial related processes, reduce costs/risk and enhance profitability.
Product Manager, Data Technology - Hootsuite, Vancouver BC Canada
We’re looking for a Product Manager to help us synthesize organization, strategic, developer and user information into a clear roadmap that iteratively leads our data infrastructure team towards achieving our mission of Empowering Hootsuite with Data. You’ll be working with analysts and business stakeholders to help design data solutions that solve internal customer problems and with the Data Technology development team to implement those solutions. This role reports to the Senior Manager, Data Technology and is based remotely out of Hootsuite’s Vancouver office.
Senior Research Computing Systems Engineer - University of Southampton, Southampton UK
This is a key role in support of the research being carried out using the University of Southampton’s High Performance and Data Intensive Computing (HPDIC) facilities. Following a recent restructure you will be joining a team of dedicated research computing engineers who are supporting the current systems and their use in a diverse range of research topics from Quantum Chemistry simulations and AI modeling to Climate modelling, medical imaging and COVID-19 research. We are currently planning for a refresh of the facilities and this role will play a key role in supporting the planning and commissioning process.
Operations Coordinator, Computational Solutions - Dana Farber Cancer Institute, Boston MA USA
The Operations Coordinator reports to the Senior Director of Computational Solutions with daily oversight from the Business Operations Manager for the Department of Informatics & Analytics (I&A) at Dana-Farber Cancer Institute (DFCI). This position has comprehensive administrative and business-technical operations responsibilities for the Computational Solutions (CS) organization within the department of Informatics & Analytics.
The role will be responsible for running day-to-day operations while contributing to administrative process improvements and assisting with the planning and tracking of both operational budgets and capital projects. Responsible for the operation and administration of the expanding Informatics Services Core within Computational Solutions and supporting the implementation of certain technical modernization projects within the scope of our Research Computing services.
Programme Manager for International COVID-19 Data Alliance - Health Data Research UK, London UK
HDR UK requires the services of a programme manager to support the overall initiative and delivery of one or more “Driver Projects” designed to help inform and deliver impact from a high profile, charity-funded international collaboration which will create an international Alliance of data custodians and digital workbench for researchers and innovators to support our COVID-19 response by enabling the sharing of data at scale. The individual will also be responsible for the Secretariat support of key Alliance governance bodies.
Data Access Services Manager, UK Data Archive - University of Essex, Colchester UK
The activities cover administering registration and access activities, producing user support materials and web content, statistical output checking, and delivering training for those who are seeking to use, and using, social research data.
You will work with Service Directors to roll out access to business and administrative data under the Digital Economy Act, taking care of safe data protocols and putting into place smooth workflows. You will lead on the Service’s programme of training for Safe Researchers and contribute to other areas of training, working with the Service’s User Support and Training Directorate in Manchester. You will be expected to represent the Service at various high-level meetings and events.
Senior Manager, Data Engineering - University Health Network, Toronto ON CA
You are an inspirational leader capable of both tactical and strategic planning and decision making for multiple sub-teams that make up your portfolio in a fast-paced, challenging and complex environment. You are hands-on capable and well-versed in database administration and design, hardware and software infrastructure, security, database integrity, optimization, business intelligence, cloud technologies and data analytic infrastructure workflows. Your technical background has experience with multiple relational and non-relational database management systems and you have a flair for making data intelligible.