Quick: what’s your team’s specialty?
Your team’s specialty is its reputation for what it’s good at. Not what you think your team is good at; what matters is what specific thing your stakeholders (funders, clients, institutional decision makers) think your specialty is. What they recommend you for to peers, what they recommend funding you for to decision makers.
In the post-pandemic world, researchers are used to getting their support remotely from anywhere. To compete, your team will need well-defined specialties; and “HPC” or “research software development” isn’t a specialty.
Read on, or go straight to the roundup.
The pandemic isn’t over, but the end of this phase has begun, and with September (“academic new years”) here, it’s a good time to think about the future. Last October I wrote about what post-pandemic research computing is going to look like, and it’s holding up pretty well. With researchers now very comfortable getting research computing and data support virtually and with budgets under pressure, there is going to be a lot more competition for research computing and data teams. Research collaborations are going to be looking elsewhere more and more often - to academic teams at other institutions, or to commercial companies (either commercial cloud vendors for compute, or emerging collaborations between well-known names, like NAG and Azure, for services).
This is an opportunity for well run, focussed teams to grow and prosper. But it’s going to take more planning and forethought than in decades past, when one could count on having a near monopoly, on being the only available seller of services to local researchers. It’s going to take developing and maintaining a strong reputation for a small set of specialties.
“HPC” may sound and feel like a specialty within the community, but to researchers and decision makers it’s incredibly generic and so meaningless. It’s not a technical term, but a term of advocacy and marketing which has come to mean resources for anything from high throughput batch services to huge tightly coupled simulations to single-node multi-GPU code runs. Even advocates for the term define it as “anything bigger than what a researcher could provide on their own” which is incredibly generic, and so necessarily meaningless. How can your team’s specialty be “anything”? Is a team expecting researchers to recommend them for “anything”? There’s a reason why VPRs would be just as happy contracting it out (e.g. see table 2 here).
“Services and expertise for quickly analyzing public-health bioinformatics data”, “a platform for firing off and monitoring aerospace CFD calculations”, “a centre of excellence for digital humanities data curation and archiving”: these are examples of specialities - products, services - that researchers and institutional decision makers can see the value of and be willing to put money into, services and products and teams that researchers can recommend to each other. They are areas where a team could build a strong reputation - they could be the group that researchers recommend to collaborators when they chat about research needs.
“Research Software Development” at least, to its credit, doesn’t pretend to be a narrow specialty - it’s a broad area which can encompass any area of software development in support of research work. As a result, a team can’t have a specialty in “Research Software Development”; it can have a specialty in “web applications and mobile apps for data collection”, or “GIS analysis tools” or “agent-based simulations for social sciences modelling”. But almost certainly not all three at the same time.
Even so, research software development is too specific in one unhelpful sense. It could be that researchers are just looking for your team to write some software for them, hand it over, and be done. But increasingly, researchers are looking not just to be delivered some software, but for a team to host the software, run it, operate it - and/or collect and curate data to be used with the tool, for tests or otherwise. Focusing solely on research software development, as a separate activity from systems operation or data analysis and management, can be overly limiting.
Ok, so what does all of this have to do with competition?
One of my venial weaknesses is spending too much time on Twitter. I’m seeing increasing concern there from research computing teams that cloud vendors or teams using cloud vendors are coming into their institutions and winning or trying to win contracts for projects that “should” have gone to the in-house teams. I’m hearing complaints that the external bids are for amounts of money 2x or more what the in-house team says they could do it for. Incredibly (and almost certainly incorrectly) I’ve even heard 10x.
Reader, as hard as it is to believe, those complaining see this as an affront, and a threat, rather than the enormous opportunity it is. (And affront was taken. There were lots of dark murmurings about slick sales teams trying to fool gullible senior administrators. And, you know, I’m sure it’s comforting for the teams that might lose out on these contracts to think that the vendor mesmerized the simpleton decision makers with their entrancing slide decks, and so hoodwinked them into considering an overpriced contract. But (a) have they never seen a vendor pitch? Being sold at for 50 minutes is just as excruciating for senior decision makers as it is for us, and (b) it’s self-serving twaddle to imagine that just because someone higher up made a decision to work with someone else they must clearly be dumb. If they assume the only reason someone wouldn’t work with their team is that the decision maker is dumb, they’re going to end up making a lot of poor and uninformed decisions.)
If a contract at your institution is won - or even in serious contention - that is 2x what you estimate you could have provided the services for, that’s not evidence that the external contractor is overcharging. It’s evidence that your team is undercharging, that you could have proposed doing more to support that project and the researchers, and that you’re leaving money on the table. It’s also evidence that you haven’t fully convinced the relevant decision makers that you can provide that service; they don’t see it as being part of your specialty.
Clearly your institution found it worthwhile to spend or consider spending that 2x, because they understood that it was worth at least that much to them to have those services. A bid for half that amount having failed or being questioned means that they really didn’t believe the in-house team could do it as well. That’s revealed-preferences data that you can use. (And if I truly believed someone at my institution was seriously considering spending a premium of 10x (1000%!) to work with an outside company rather than work with my team, well, that would occasion some serious soul searching.)
Cloud providers and other external contractors do have advantages. They have a library of reference architectures they can deploy, so they can pitch (say) CFD solutions to the mech eng department, and bioinformatics pipeline solutions to the biology department. They can pull from a library of testimonials to demonstrate that they can do the work.
But so can you. You have access to all the literature to search for how others have deployed such solutions. You have (or should have) testimonials from the people that matter - researchers at that very institution. And you have a network of deep relationships in the institution, relationships based on collaboration on research problems. Those relationships and collaborations and shared expertise are something the external contractors have no chance of matching.
If you’re in danger of losing out on these sorts of competitions, it’s because you’re not communicating your specialities in a way that matters, in a way that’s convincing, to the people who could pay for your services. They can’t see how your “HPC batch services” connects with “a digital twinning platform for building simulation”. They don’t see “GIS exploration for private social sciences data” as being an obvious part of your “Research Software Development” effort - where’s the data part?
You have specialities - if you don’t know what they are, ask the researchers who keep coming back. How do they describe what you do? What would they say your speciality is, how do they talk about you to their colleagues? What would you have to demonstrate to them to have them recommend their colleagues to you?
Once you have those specialities, you can start playing to your strengths, and communicating them endlessly. You can make a point of reaching out, having your team talk at conferences in the specialties, and at departmental colloquia. You can be well-regarded enough in your institution for those specialties that external contractors pitching work within your speciality never get in the door. You can start more easily hiring people that are interested in that specialty. A specialty builds on itself, snowballs. You can start steering future work towards that specialty to build on it, and start directing work well outside the specialty to somewhere else - where it does fit inside their specialty.
Yeah, that last part is scary. Sticking to this path isn’t easy. It means turning down opportunities that aren’t in or adjacent to your specialities. Especially for new teams, eager to please, this can be scary.
But as anywhere in research, your team’s reputation is all that matters. Your team has a reputation, has stuff it does and doesn’t do. Did you choose it, did you shape it, or are you content to just let it happen?
Your team can be extremely strong in, specialize in, develop a reputation in, any of a number of things. But not all of the things. Being a manager or leader means choosing.
And now, the roundup:
Yak Spotting - Aviv Ben-Yosef
A good way of catching yak-shaving (video example here), for ICs or for managers - “Why are you doing this?”
Creating an Internship Program for Software Engineers - Tom Sommer & Gábor Zöld, Level→Up Podcast
Understanding the mentorship mesh - Daniel Peck, LeadDev
Academic research computing teams take on student interns more commonly than teams in industry do. The industry teams who do implement internship programs, however, build them out to take much fuller advantage of the opportunities of such a program than most academic teams (mine included) do.
Internships are a lot of work for those hosting the interns. But that work can pay off, for large enough organizations. Some advantages of a robust internship program, routinely bringing in several interns at a time, include:
Constant hiring, and so continual maintenance and improvement of your hiring process, even at times when you’re not hiring permanent staff;
Constant onboarding, and so continual improvement of your onboarding documentation, explicit documentation of implicit knowledge, etc, which makes information easier to find for everyone, not just the new interns;
Improved recruitment of known-quantity high achieving candidates from your internship alumni;
Constant influx of new tools, new ideas, etc. from the interns (a trivial but telling personal example: it was interns who finally got me using VSCode after 20+ years of vi use).
Level→Up engineering, which is a routinely interesting podcast, has started writing up blog posts from their interviews, which is handy (I’m very hesitant to include un-skimmable resources like audio or video in the roundup unless they are exceptionally relevant). In this episode, Zöld interviews Sommer about Redbubble’s internship program, and what’s worked for them.
Actively working with local organizations to source interns
Defining clear expectations
Having a lightweight version of the same interview process as for permanent staff
Interns arrive in cohorts, so they go through the experience together, even though…
They are embedded separately into teams, and may rotate between teams
Interns have volunteer mentors separate from their managers
Interns’ growth is monitored, for their own sake and for future hiring - rapid growth is a stronger signal for later hiring than a high but constant level of technical competence
Interns who are on a path for hiring still go through the permanent-staff hiring pipeline
Peck’s article talks about the next stage of onboarding, for permanent hires. For this longer-term mentoring, having a single mentor isn’t enough; Peck prefers a “mentorship mesh” approach, with a team of people providing different kinds of mentorship:
A peer they’ll be working with, on the same career path (but somewhat advanced)
A senior person who is still further advanced
A domain expert in the area they want to develop expertise in
Their team lead/mentor
Management as a Technology - Nicholas Bloom, Raffaella Sadun, John Van Reenen, Harvard Business School Working Paper 16-133
Management is important, and the issues involved are complex, and like a lot of things that are important but complex, it’s the subject of a lot of study. That study looks different than those of us who came up in the natural sciences are used to - people systems are way harder to examine than, say, fluid systems - but it can be every bit as insightful.
This working paper from a few years ago results from interviews with 11,000(!) manufacturing firms in 34 countries, asking questions about whether the company followed industry best practices, and their practices around process improvements, performance review and tracking, feedback given, clear targets, high performers rewarded, low performers removed, and hiring and retaining staff. (The anonymized data is available by filling out a form at http://worldmanagementsurvey.org/).
They then analyzed the data in a slightly unusual way - using the existing framework of analyzing the adoption of a new technology across firms and companies, and looking to see if the new technology improved productivity or not. But here the “technology” is good managerial practices.
The technology of management (literally a body of knowledge and techniques) might be a useful mental model. It’s not magic, it doesn’t require inspiration or a particular personality type - it’s the roll-out of a well understood set of practices. That doesn’t make it easy, but hey, adopting new technology can be challenging.
Bloom et al. find that management “technology adoption” accounts for about 30% of the differences in total factor productivity across entire organizations, or even between countries. If anything, I’d guess that number is understated, since management approaches can be pretty heterogeneous within an organization so there’s likely an averaging effect. I don’t find it hard to imagine at all that well-managed teams are at least 30% more productive than teams with replacement-level management; just take a look at the software development waste article below.
Ditching high overhead and running your own research institute - Travis Metcalfe, Astro Better
A lot of us assume that the only way to work on grant-supported research is in academia or adjacent institutions; but that’s not the case. Metcalfe’s article has numbers that are specific for US granting agencies, but granting councils all over the world have over the past decades become much more friendly to working with small for-profit or non-profit corporations. In the article, he describes how he now accepts awards from NASA and NSF through his own nonprofit organization.
Wilson briefly summarizes a paper by Sedano, Ralph, and Péraire, who looked at eight software development projects at Pivotal, a software development company, for 2 years and five months, interviewed team members, and analyzed retrospectives. They identified nine broad categories of wasted time and/or effort in the projects:
Building the wrong feature or product
Mismanaging the backlog
Having to re-do work
Implementing unnecessarily complex solutions
Extraneous cognitive load (from technical debt, large/complex stories, poor tools/code, etc)
Knowledge loss, and
These will likely sound familiar, and can be a useful list to keep to hand as a caution at our next planning meetings.
Kaitai Struct - the Kaitai project
So on the one hand, I think that in research computing we invent our own bespoke file formats way too often, spending more attention to how the data is laid out on disk than the APIs to get the data on and off disk.
But sometimes HDF5 or the like is overkill (double check though!) and format-create you must. You can improve matters by (a) making sure the format is binary, so you’re not serializing/deserializing to text all the time, and (b) writing it down as some sort of well-defined specification, so that it’s easy to have implementations in a number of languages and to test their conformance against the spec.
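To make the “binary, with a written-down layout” point concrete, here is a minimal sketch using only Python’s standard `struct` module. It is not Kaitai Struct (which describes formats declaratively and generates parsers from that description), and the layout itself is entirely made up for illustration: the `RCT1` magic bytes, the version field, and the function names are all assumptions.

```python
import struct

# Hypothetical toy format (an assumption for illustration, not a real spec):
#   4-byte magic, uint16 version, uint32 count,
#   then `count` little-endian float64 values.
MAGIC = b"RCT1"
HEADER = struct.Struct("<4sHI")  # "<" = little-endian, no padding: 10 bytes


def pack_samples(values, version=1):
    """Serialize a list of floats into the toy binary layout."""
    header = HEADER.pack(MAGIC, version, len(values))
    body = struct.pack(f"<{len(values)}d", *values)
    return header + body


def unpack_samples(blob):
    """Parse the toy layout back into (version, list of floats)."""
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a sample file: bad magic bytes")
    expected = HEADER.size + 8 * count
    if len(blob) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(blob)}")
    values = struct.unpack_from(f"<{count}d", blob, HEADER.size)
    return version, list(values)
```

The explicit byte order, the magic bytes, and the length check are the parts a second implementation in another language would need to agree on, which is why having them in one declared place matters more than the particular fields chosen here.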
Rethinking Best Practices - Will Gallego
This is a nice thoughtful article on the role best practices do and should play.
I find that in technical fields, there are two bad and opposite attitudes to best practices: overly-deferential unquestioning “that’s what everyone does”, and just as unquestioning knee-jerk “that would never work here”. Gallego walks us through a third way, with a quote from a 2016 paper by Klein, Woods, Klein, and Perry in J. Cognitive Engineering and Decision Making:
> We should regard best practices as provisional, not optimal, as a floor rather than a ceiling.
If you’re starting from scratch, or even rebooting practices in a team, industry best practices are the right starting point. They’re common for a reason, and if you choose a different starting point odds are better than even that you start off further from optimal. They also give you access to a language and a literature to discuss and compare approaches with other practitioners.
But effective teams are constantly experimenting and measuring, and so learning, what works for that group of humans on that set of problems. Very little should be left to received wisdom.
Researchers often have libraries of bash scripts or make files for processing data, especially when there’s lots of files - but error-handling and logging code is hard (and prone to bugs!, e.g. #11 with the Ariane 5 crash, and #80 with iPhones). So bash scripts are typically not particularly robust, and make’s “the filename is the metadata” approach works really well right up until it doesn’t.
Workflow management tools are starting to get simple enough to set up and get started with that they can be plausible replacements for scripts or make for these routine tasks. They’re also becoming popular in bioinformatics, where abusing the filesystem with fragile bash scripts has been a traditional way of life.
Snakemake is a popular workflow package for those already doing a lot of work with Python. In this short article, Brown shows how to replace a simple bash-loop script with a Snakefile, which is a little longer but has several advantages:
Extends in complexity gracefully (python libraries and templating readily available)
Automatically gives parallelism, like make, but allows external metadata sources readily
Robust error handling
Built-in logging available
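For a feel of what those advantages buy you, here is a plain-Python stand-in. It is not Snakemake and not Brown’s example; the function names, the `*.txt` to `*.count` rule, and the line-counting task are all made up for illustration. It shows the pieces a bare bash loop usually lacks: a make-style “skip if the output is already fresh” check, per-file error handling, logging, and parallelism.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("toy_workflow")


def count_lines(src: Path, dst: Path) -> str:
    """One hypothetical 'rule': write the line count of src to dst."""
    # Make-style freshness check: skip work if the output is up to date.
    if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
        return "skipped"
    try:
        n = len(src.read_text().splitlines())
        dst.write_text(f"{n}\n")
        return "done"
    except (OSError, UnicodeDecodeError) as err:
        # One bad input is logged and reported; the rest still run.
        log.error("failed on %s: %s", src, err)
        return "failed"


def run_all(in_dir: Path, out_dir: Path, workers: int = 4) -> dict:
    """Apply the rule to every *.txt input in parallel; return status per file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    inputs = sorted(in_dir.glob("*.txt"))
    outputs = [out_dir / (p.stem + ".count") for p in inputs]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(count_lines, inputs, outputs))
    for src, status in zip(inputs, statuses):
        log.info("%s: %s", src.name, status)
    return {p.name: s for p, s in zip(inputs, statuses)}
```

Even this small version is noticeably longer than a `for f in *.txt` loop, which is rather the point: a workflow tool gives you these behaviours without having to write and debug them yourself.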
Docker with IPv6 and Network Isolation - Sumit Khanna
One - maybe overkill - way to avoid the confusion of host and container networking, and to be relatively certain the local container network is isolated, is to have the local container network be over IPv6. Docker handles this natively but the configuration isn’t obvious.
Here Khanna walks us through setting up an IPv6 NAT (to avoid having to have a dual IPv4/6 stack) and then configuring a load balancer and services behind the NAT in its own isolated IPv6 network.
If you use Docker Desktop, you’ve probably heard that while it remains free for personal use and non-commercial open-source development, it will cost money for professional use in other contexts.
We’ll wait to see how that plays out for (say) research use which is often (but not always, but maybe mostly should be) open source. In the meantime, you might want to check out some alternatives, especially if it’s a tool that only gets used intermittently.
BUT please consider not running for the hills every single time a productivity tool starts charging money.
We’re pretty hypocritical about this in research computing. We (rightly) complain that it is way too hard to get ongoing funding for research computing and data products (software, data resources, etc). And it is! But as soon as a tool we use isn’t free any more, we bail.
This particular tool is going from free to something like $7/mo/team member. $84/year/team member for a tool that the team finds useful is not a lot of money. If this particular tool genuinely isn’t worth that for your team members, because they only use it occasionally, then ok, cool, makes sense. But - stuff costs money. If there’s stuff that makes your team more effective, please consider finding the money.
Gateways 2021 - 500 word abstracts due 22 Sept, Conference virtual 19-21 October
From the website: Topics include, but are not limited to:
Architectures, frameworks, and technologies for science gateways
Science gateways sustaining productive, collaborative communities
Support for scalability and data-driven methods in science gateways
Improving the reproducibility of science in science gateways
Science gateway usability, portals, workflows, and tools
Software engineering approaches for scientific work
Aspects of science gateways, such as security and stability
AI and ML for science gateways
Social research on science gateways
Use cases and lessons learned from science gateways
INTERACT for Engineering Leaders - 30 Sept 2021, Online, Free
From the website,
> Interact is the community driven conference for engineering team leads, managers, VPs and CTOs looking to improve themselves and their teams.
PMI Kickoff: Free Project Management Training (Review) - Elizabeth Harrin, Girl’s Guide to PM
A review of a short (45 min) free intro to project management training from PMI - worth thinking about if you’re considering having someone on your team play more of a role in managing projects.
A 1981 TRS-80 adventure game found, fixed and put online.
Making homemade silicon chips.
Learning modern C++ by writing a JSON parser from scratch (editor’s note: do not write a JSON parser from scratch).
The tac command, if you’re not familiar, is cat but prints the lines in reverse. Ever been tired of how low-performance tac is? Finally, using SIMD acceleration in Rust to create the world’s fastest tac.
A discussion of Linux’s three “stopwatch clocks”, CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC, and CLOCK_BOOTTIME, why they exist when the linux clock exists, and where they may or may not fail you.
Computing technologies come and go, but deep research computing expertise doesn’t really have an expiry date. Here’s a nice article on an image processing person being introduced for the first time to FFTs when they were looking instead for machine learning magic to improve scans of printed images.
Interesting. A commercial (but free initially for modest use) whole-system profiler for C/C++, Rust, Go, Python, and more - Prodfiler.
Interesting comparison between the “efficiency” and “performance” cores on the M1, may be relevant to other such big-little architectures. For a vector dot product they can (but normally wouldn’t) perform within a factor of 2 of the performance cores.
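If you want to poke at the three Linux “stopwatch clocks” mentioned above yourself, Python exposes them through `time.clock_gettime`. Roughly: `CLOCK_MONOTONIC` is subject to NTP rate adjustment, `CLOCK_MONOTONIC_RAW` is not, and `CLOCK_BOOTTIME` also counts time spent suspended. This is a minimal sketch, assuming a Unix-like system; the helper names are made up, and not every platform defines all three constants, so the code checks before using them.

```python
import time

# Clock IDs are platform-dependent; some platforms lack one or more of these.
CLOCK_NAMES = ["CLOCK_MONOTONIC", "CLOCK_MONOTONIC_RAW", "CLOCK_BOOTTIME"]


def read_stopwatch_clocks():
    """Return {name: seconds} for each stopwatch clock this platform exposes."""
    readings = {}
    for name in CLOCK_NAMES:
        clock_id = getattr(time, name, None)
        if clock_id is not None:
            readings[name] = time.clock_gettime(clock_id)
    return readings


def time_it(func, clock_name="CLOCK_MONOTONIC"):
    """Time one call to func with the chosen clock; returns (result, seconds)."""
    clock_id = getattr(time, clock_name)
    start = time.clock_gettime(clock_id)
    result = func()
    return result, time.clock_gettime(clock_id) - start
```

For example, `time_it(lambda: sum(range(1_000_000)))` returns the sum along with the elapsed seconds. The absolute readings from `read_stopwatch_clocks()` have no meaningful epoch; only differences taken on the same clock are useful.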
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
About This Newsletter
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.
This week’s new-listing highlights are below; the full listing of 134 jobs is, as ever, available on the job board.
Associate Director, Data Science, Medical and Real World Data Analytics (MRWDA) - Otsuka, Princeton NJ USA
This Associate Director role in Medical and Real World Data Analytics is for an expert in computational data analysis, mathematical modeling using both classical and modern methodologies from machine learning, complex systems, sensor data analysis, Engineering (except detailed software engineering). This expert will have a strong fundamental background (accomplishments demonstrated through knowledge as well as original work published in peer-reviewed avenues) to quickly pick up new areas or supervise work of others in new areas through a wide set of analytical skills.
R&D Engineer (Experienced Project/Team Lead) - Sandia National Laboratories, Albuquerque NM USA
We are seeking an experienced Engineer to join our dynamic team. This role will act as a team leader within a hardworking team of computer scientists and software engineers to develop path-finder capabilities for customers starting new projects and to develop new capabilities for customers looking to extend, modernize, or otherwise transform their existing software products. The work will require considerable technical and interpersonal skills to help customers who may not have deep knowledge of software development and computer science understand what is possible and to envision winning solutions to their problems. The selected applicant will need to have the ability and willingness to work on site at Sandia/NM.
Director, Research Computing - Boise State University, Boise ID USA
Responsible for leading and directing day to day operations, staff, and projects for the OIT Research Computing team. Implements the overall strategic and tactical plans for the development and support of the University’s user constituencies for ongoing support of computing, storage, consulting, and software tools for Boise State researchers.
Senior Data Science Manager - Unity, Vancouver BC CA
The [email protected] team empowers teams across Unity to take advantage of terabytes of data our company ingests, processes and builds infrastructure around to do their best work. From user segmentations to predictions and optimization we create solutions that allow other teams at Unity to create best-in-class services and tools for Unity users. Hire and develop a data science team that can improve Unity’s understanding of its customers and products. Mentor data science talent to combine business understanding to scientific and technological problem-solving. With engineers, build data science solutions for production
Manager of User Services - Columbia University, New York NY USA
The Department of Systems Biology Information Technology (DSBIT) is a core IT team that manages an impressive High Performance Computing and Big Data Storage facility devoted to cutting-edge research in molecular and systems biology. DSBIT strives to provide high-level IT and Research Computing support for numerous departments/institutes/research-labs, including the Herbert Irving Comprehensive Cancer Center, the Institute for Cancer Genetics and the JP Sulzberger Columbia Genome Center. DSBIT is searching for a Manager of User Services to lead our talented IT team. The position manages DSBIT DevOps on high-end Servers, complex software environment for highly skilled researchers (PhD students and above). The position requires significant customer service skills involving personal interaction, multi-tasking, and strong time-management skills. Our Data Center services run in a 24x7 mode, occasionally attending emergencies during off-hours is a part of the job.
Associate Director, Early Clinical Development, Nonclinical Statistics - Pfizer, La Jolla CA USA
Responsible for ensuring sound statistical thinking and methods are utilized in the design, analysis, and statistical interpretation of studies focused on oncology preclinical and translational research. Collaborate with various scientists in the design, analysis and reporting of biological and ‘omics data from target identification, assay development, in silico, in vitro and in vivo studies, including statistical interpretation of published or internally generated results in order to inform decision-making. Interact with translational assay lead, research project lead and external experts to assure sound quantitative approaches are applied to data collection and analysis.
Senior Data Scientist/Data Science Manager - Harnham (Recruiter), London UK
This is an exciting opportunity within a leading tech company in the food space. This company’s analytics function is growing rapidly going through round after round of funding and they have one of the best functions in the market. This leading food/tech company are heavily driven by Data and Analytics and are looking for a Senior Data Scientist and a Data Science Manager to deliver actionable and meaningful insights using advanced analytics across a wide variety of projects. This company has a vast amount of data to work through and as a Senior Data Scientist and a Data Science Manager, you will be working closely with talented individuals to deliver essential insight that will allow the company to continue growing.
Manager Health Info & Data Integrity - Winnipeg Regional Health Authority, Winnipeg MB CA
Under the general direction of the Shared Health Director Health Information Services, the Manager Health Information and Data Integrity will manage a team that will contribute to leading, planning, implementation, adoption and maintenance of health systems, operational standards, policies and processes for health information to meet the needs of patient & client service providers. This role will also support health information data integrity standards in compliance with data governance outlined by Provincial Information Management and Analytics department and Provincial Clinical Teams to provide health information services connections across the province.
Manager, Data Quality - Data Office - Scotiabank, Toronto ON CA
Data Quality, part of Data Office, is responsible to promote and govern all Data Quality operations within Scotiabank. Data Quality operations includes, Data Quality (DQ), Data Quality Issue Management (DQIM), Visualizations and Data Profiling. This position will be mainly responsible to support the Data Quality operations with necessary support for others streams of Data Quality team’s initiatives. Support Chief Data Office (CDO) in establishing DQ and DQIM initiatives. Provide consultation and guidance to project by conducting and coordinating live demos, solutioning and walkthroughs of current tools and process. Provide analysis to project teams in terms of work effort and budget estimation
Senior Analyst/Manager, Data Quality Issue Management - Royal Bank of Canada, Toronto ON CA
Ensuring data integrity begins with an understanding of financial services data, data quality controls and identifying and resolving data issues. As a Senior Analyst/Manager, Data Quality Issue Management you will be accountable for driving improvements in Risk and Finance data through the identification, root cause analysis, and management of data issues through to resolution for business critical processes and regulatory reports. Reporting to the Associate Director, you will define and monitor data quality controls and resolve data issues for business partners across the organization including Finance, Group Risk Management, Personal & Commercial Banking, Capital Markets, and Wealth Management.
Director, Machine Learning - Gaggle, Remote US
We are looking for a technologist to lead a distributed team to design, implement, validate and run the machine learning models that our customers and employees use to protect kids and save lives. At Gaggle, we help schools prevent suicide, self-harm, violence, and bullying. Identifying new opportunities to use our data to protect students. Lead an ML team at building models to identify questionable text and images. Data analysis and design of experiments. Automating and optimizing the process to prepare data and train models.
Data, Analytics and Colloid Science Manager - Unilever, Bedford UK
The successful candidate will join BPC Research at an exciting time, with the opportunity to influence and lead our work to meet the reformulation challenges associated with Beauty and Personal Care’s People and Planet Positive commitments. You’ll be working within an environment benefitting from existing state-of-the art Digital facilities, such as the Materials Innovation Factory built in partnership with Liverpool University: https://www.liverpool.ac.uk/materials-innovation-factory/ and the Advanced Manufacturing Centre in Port Sunlight https://www.unilever.co.uk/news/press-releases/2018/unilever-officially-opens-new-north-west-innovation-centre.html and significant investment in data and analytics across our Global sites. PhD in physical sciences or equivalent research background; a background in physical chemistry is preferred; a background in colloid science is most preferred. OR MSc in physical sciences plus relevant experience in the field of data and analytics/predictive modelling.