RCT #163 - Measure What Matters - Logic Models. Plus: Reducing the lottery factor; Polars; LLMs in Production; Cloud-first databases.
Last week (#162) I spent a lot of time on training evaluation and the Kirkpatrick model, but the underlying idea has much wider application.
We choose training interventions — or other services or products for our community — because, consciously or unconsciously, we have some mental model for how this work will impact research and scholarship in our organization. There’s typically some pretty simple mechanism that we’ve hypothesized. That mechanism is observable, even if only qualitatively, and from those observations we can improve the intervention’s impact.
Tools Like Logic Models Describe How Our Activities Drive Research Impact
The Kirkpatrick model can be thought of as evaluating along different stages of some logic model:
The logic of our intervention is that we marshal some inputs and perform some activities; those activities generate some outputs, and those outputs lead to some desired outcomes.
Crucially, the outcome is the entire point. There’s no reason to do the work except for the outcomes. This isn’t a hobby, where the activity is its own goal. We’re professionals, performing badly needed work, in a space where there’s far more that could usefully be done than we can possibly do. Everyone is relying on us for the outcomes. Not the outputs. Not the activities. Certainly not the inputs.
The purpose of our team isn’t to teach classes, or run a computer system, or write software, or curate data, or perform data analyses. Those are activities we perform. They are useful. It’s even possible that they are the most useful things we could be doing. But they aren’t the purpose. The purpose is the impact on research or scholarship the activities enable downstream.
Tools Like Logic Models Tell Us What To Measure And Improve
By measuring the quality of our inputs, the effectiveness of the activities, the usefulness of the outputs, and their applicability to some outcome, we can iterate and make our desired outcomes more likely.
By designing our interventions with the outcome in mind - like with hiring (#135) - we are much more likely to achieve those outcomes. And it affects everything - which inputs we choose, what activities to perform, what outputs to aim for.
And yet, we tend to focus on the inputs or maybe the activities, because they’re easy to measure. But they’re the least relevant of all. Bad inputs can limit our impact. But fantastic-quality inputs of the wrong thing, or inputs with no mechanism connecting them to outputs or outcomes, are pointless.
The Kirkpatrick model for training evaluation implies a fairly straightforward, linear logic model:
- Inputs: materials, venue, instructors… - evaluated via Reaction-type surveys
- Activities: instruction, exercises, pedagogy… - evaluated via educational assessments like pre- & post-tests
- Outputs: behaviour change, using the new skills/knowledge - evaluated via followup discussions, surveys, observation
- Outcomes: research & scholarship impacts - evaluated via followup discussions, testimonials, grants, publications, invited talks, citations…
Problems at each stage of the mechanism will limit impact downstream. So being able to reason about and observe the process from start to finish is what helps us design, iterate, and improve what we do and how we do it to have the maximum impact.
The Mechanism, Not the Names, Is What Matters
You may well have noticed that even in the simple Kirkpatrick model above, the distinction between inputs and activities is a little fuzzy.
And if you have done volunteer board work for nonprofits, or are otherwise privy to that community’s discussions, you’ll know that there are sometimes five levels in a logic model, or that the names used differ. Or that there are debates about logic models vs theory of change vs outcome mapping.
Those discussions are that community’s R vs Python, vi vs emacs, Ubuntu vs Rocky, MySQL vs Postgres, Rust vs C++. For practitioners who do that kind of work all day, those distinctions doubtless matter a great deal. Happily, we can proceed unburdened by such concerns.
The important thing from our point of view is that we end up with some kind of proposed, observable mechanism for how our resources and what we do add up to some meaningful impact.
These Tools Can Guide Programmes of Activities: RCT
I spent an hour or so trying to come up with a made-up example of a non-trivial logic model that would make sense to most of the readership, and I couldn’t come up with anything I liked. So let’s tackle a real example we should all have some familiarity with, and that I spend some time thinking about - our nascent ResearchComputingTeams.org community.
The diagram is above, with possible future items in dashed boxes.
Let’s look at the desired outcome (just one in this case): “Self-sustaining community of new strong research computing, software, and data leaders, in jobs where they can make a difference”. That is not something that could be measured quantitatively with any degree of rigour, and yet it’s clear enough that any proposed effort could be evaluated against it. It’s a beacon to steer by.
You’ve certainly noticed that the diagram is kind of messy. That’s good. It’s important that, with finite resources — and everyone always has finite resources — the activities and outputs interlock to support the outcome or outcomes. It means the activities reinforce each other, rather than just being a to-do list of unrelated items.
Ah yes, resource constraints. I have about 12 hours a week for this (and Manager, Ph.D.) right now. That’s not a problem to lament; it’s a boundary condition to be incorporated into the solution. Sometimes, advocating for more resources is a possible activity; here it isn’t in the short term, and in any case, there’s no point in searching for more resources until we’re already having the greatest possible impact with existing resources.
The RCT effort in general, and this logic model in particular, is a hypothesis about having impact on leadership in research computing, software, and data science/engineering/curation. That hypothesis needs to be tested against data, and the activities and outputs have to be improved with that data to support the outcome. My main data collection mechanism at the first few steps, for the foreseeable future, is conversations with readers and community members. That’s a vital feedback mechanism. Right now, lack of that input is the rate-limiting step. There’s just not enough discussion. I have two levers to work with there:
- Increase engagement per community member somehow
- Increase community size
There’s no point in building out any of the other activities until more feedback is happening; again, I’m aiming for impact, not simply doing these activities for their own sake.
Another argument for focussing on increasing community size and engagement: you’ll have noticed there are currently no activities leading to the desired output of “community of RCT peers”. There was a #research-computing-and-data channel on the Rands Leadership Slack, but we lacked the critical mass to keep that going, so I let it get archived. More, and more engaged, community members will allow some peer-to-peer support through a number of mechanisms, which will build towards the desired impact without increasing my time involvement.
As a result of this current focus on community size and engagement, you may have noticed that the interviews - which were going pretty well, I think - are on hold for now. They’re good and useful things to do, for a few reasons! But as done, they don’t address the biggest problem I have in increasing impact, and they take real time, so they’re on pause.
Also - and I hate this - I’ve recently realized that I need to put the job board on hold for now. Again, it’s useful: I get comments that people really like it, and it helps me keep a finger on the hiring pulse. I personally think it’s really important, so the argument isn’t that it’s not good to have. But it just doesn’t address the bottleneck in impact that RCT currently suffers from, and it takes a surprising amount of time to populate and maintain. If I want to have as much impact as possible with the current resources, the energy going into that has to be redirected, and the job board has to be put on hold.
Pausing the interviews and job board will help me respond faster to the input I do get (I’ve been unacceptably slow on this lately) and put things in place to get more input. That has to be the priority. If I’m not measuring how things are going, there’s no point in doing the work, because I won’t be doing the things that matter.
One last thing on resources and inputs: just because I can’t free up more of my own time doesn’t mean I can’t find ways to increase the amount of activity. I can find other individuals or groups to work with - other teams in the same space aren’t competition, they’re potential collaborators (#142). One thing I’d like to do is to see if I can find someone or some group in academic/public-sector core facilities, or private-sector data science, to work with on some activities. That would allow me to leverage my time with that of others.
The Value Of These Tools Is In How They Clarify Thinking
You might have noticed that my use of the logic model above was kind of sloppy. There are items in the middle that could arguably be either activities or outputs, or should be shuffled around. If I were using this to get money from a funder, I’d have to be stricter about meeting their categorizations, and redraw the diagram and label things accordingly.
But we don’t actually care about those distinctions in and of themselves. We use tools like this principally to clarify our own and our teams’ or stakeholders’ thinking. If we have to distort the tool a little in the service of that clarity, that’s fine. Like “strategy” vs “tactics” vs “objectives” vs “goals” (#130), the format and taxonomy aren’t what matters. (Unless this is principally a compliance exercise that gatekeepers demand; if it is, do what you have to do.) What matters is the insight that tools like this or others help you come to.
There are many other tools out there — SWOT, Wardley maps, business model canvases, value chains, where to play/how to win, and so on. They are just tools. They have proven themselves useful to people in some context in the past; if they’re suggested to you, maybe they’ll be useful for you too, or maybe they won’t be. None of them produce answers; they’re just ways of organizing your own or a team’s thinking and communication in a way that might or might not help in your current circumstance. If you try one and it doesn’t do much for you, it’s certainly possible that you’re doing it wrong, but it’s more likely that it just doesn’t bring to the surface or clarify or communicate anything that you need surfaced or clarified or communicated right now. That’s fine.
Anyway, that discussion of RCT in an RCT issue was a little more meta than I had planned. Was it useful? Have you tried similar exercises when designing programmes of activity? What’s been helpful for your team, and what hasn’t? I’d love to hear - hit reply, or email me at jonathan@researchcomputingteams.org.
And now, on to the roundup - quite short this holiday weekend. If this is a long weekend for you, I hope you find it enjoyable; if it’s a particularly special time of the year for you I hope you find it meaningful.
Managing Teams and Individuals
This week over at Manager, Ph.D. I talked about constantly finding small ways to make time to work on, rather than in, the team. And there were articles in the roundup on:
- The art of delegation
- Being the (emotional) thermometer, not thermostat
- Steps towards strategic thinking
- Letting go of doing all the deciding yourself
Technical Leadership
Reducing the Lottery Factor, for Data Teams - Jacob Adler
Adler writes about data science and engineering teams, but the main considerations of small, expert, fast-learning teams with very mobile team members apply to teams across all of our disciplines.
We’re lucky to have team members with such excellent foundations of knowledge, but at what point does all that siloed knowledge become a liability? A hypothetical to demonstrate why: what happens if the local expert wins the lottery next week and resigns? We may find ourselves scrambling to interpret their work, unprepared for certain situations, or even locked out of accounts.
I like how Adler sets up this article, with growing and evolving knowledge sharing practices as the team grows.
With a single team member,
- It is never too early to start documenting
- Have a single source of truth for credentials
- Be verbose with comments
- Have a changelog for data sets, code…
Once you have two to five members,
- It’s never too early to have a really strong onboarding process - both to share knowledge and to discover gaps in documentation
- Assign scavenger hunts
- Pair programming (or analysis or systems work or…) and peer review for knowledge sharing (in both directions)
Once you have five to ten team members,
- As specialities emerge, build strong cross-training across practices
- Start recording videos for the knowledge base to capture context that is hard to share in text
I’d add that for our teams, external knowledge sharing (giving talks and presentations, or writing blog posts) can start happening at the 2-5 team member size, and can be super useful for visibility and professional development.
Research Data Management and Analysis
Polars for initial data analysis, Polars for production - Itamar Turner-Trauring
After years of pandas being the only real Python data frame library in town, Polars came on the scene about a year ago. It’s still fast-moving, which has pros and cons (I count 13 small breaking changes in the most recent release). But it’s a very fast data frame library for Rust and Python (exposing a Python API for data-related work was a very good product decision), and it had an Apache Arrow back end for fast columnar work (which is mostly what we want in our line of work) well before the latest pandas did.
In this article Turner-Trauring demonstrates using Polars for data exploration (where eager evaluation is useful, to keep one’s options open) and production (where lazy evaluation is useful, for speed and memory optimization).
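To make the eager/lazy distinction concrete, here’s a minimal sketch of the two modes (the file and column names are made up, and given how quickly the Polars API is moving, details may vary by version):

```python
import polars as pl

# Exploration: eager mode evaluates each step immediately, so you can
# inspect intermediate results as you go. (File and column names here
# are hypothetical.)
df = pl.read_csv("measurements.csv")
print(df.head())
print(df.select(pl.col("value").mean()))

# Production: lazy mode builds a query plan from the same expressions,
# letting Polars push the filter down and skip unneeded columns before
# reading anything; nothing executes until .collect().
result = (
    pl.scan_csv("measurements.csv")
    .filter(pl.col("value") > 0)
    .select(["sample_id", "value"])
    .collect()
)
print(result)
```

Because both modes share the same expression syntax, promoting exploratory eager code into a lazy production pipeline is largely a mechanical change.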
Maybe relatedly, if you’re going to play with some new technology to learn it, why not use some fun datasets? Esther Schindler advocates for using fun data sets to play with, and provides some pointers, in Groovy Datasets for Test Databases.
Emerging Technologies and Practices
A lot of teams are going to be asked to host Large Language Model (LLM) inference services in the near term. MLOps in general is a lot more involved than just hosting an RShiny app or Jupyter notebook, and LLMs - especially if the plan is for them to frequently be updated - are still more involved. Here’s a promising-looking free LLMs in Production workshop this coming week (13 Apr).
Building a database in the 2020s - Ed Huang, PingCAP/TiDB
Huang discusses considerations when building a cloud-first database. It’s interesting outside of that context, though.
I think a lot of our teams (especially systems teams, but also software development teams) don’t yet have enough experience with building and maintaining cloud-native systems to realize that it can be a lot more than just running a program in a VM on AWS instead of on the local cluster. There are opportunities, and user expectations, in cloud-based API-driven services which are very different from the more static deployments we’re used to.
Not that everything has to be elastically scalable, multi-tenant, and serverless from the point of view of the user, but those are possibilities and capabilities that cloud deployments give us, and that we should avail ourselves of more often than I think we do. Huang’s discussion of why that is applies pretty broadly.
Random
The term database “sharding” maybe came from an in-universe reference to parallel universes in a late 1990s MMORPG?
A simple mutex web service.
A simple command line tool for handling URLs - trurl.
Fine-tuning LLaMA models on Stack Overflow questions and answers, with RLHF. Approaches like this are going to be worth looking into in the near future for answering repeated user questions at any centre large enough to have a large database of user questions.
Relatedly, there’s never been a better time to start learning or exploring deep learning, with new tutorials and tools popping up everywhere - Hello Deep Learning, GPT4All, or tuned lens for examining what’s going on in a transformer model layer by layer.
Walking carefully through current best practices in implementing hash tables (don’t implement your own hash table).
SSH authorized key experiments.
A text-based IDE which requires only VT100 capability, and so can run essentially anywhere - orbiton.
That’s it…
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
About This Newsletter
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.
Jobs Leading Research Computing Teams
This week’s new-listing highlights are below in the email edition; the full listing of 193 jobs is, as ever, available on the job board. These will be the last job board highlights for a while; I’ll leave the board up on the web page for a couple of weeks.
Director of Cloud, HPC, and Web Services, School of Arts and Sciences - University of Pennsylvania, Philadelphia PA USA
The Director of Cloud, HPC, and Web Services at School of Arts and Sciences Computing (SAS Computing) has a strong technical background and demonstrated experience leading IT teams in managing, innovating, and implementing technical services and solutions in a higher education environment. This role provides strategic direction and functional leadership for three critical infrastructure services for SAS: cloud Linux services (primarily AWS) supporting research, teaching, and administrative applications; on-premises Linux high performance computing (HPC) supporting cutting-edge research; and cloud-based Web Services supporting over 100 websites.
Associate Director of High Performance Research Computing - New Jersey Institute of Technology, Newark NJ USA
The Associate Director of High Performance Research Computing will build, hire and supervise a team of Research Computing Facilitators (RCF) including full time staff and student workers. The individual will lead the team to develop and provide research computing support and services to faculty and researchers leveraging the university’s advanced computing research infrastructure and related resources. The individual will be responsible for collaborating with faculty and student researchers across campus to understand and support their needs by developing and configuring software solutions that enable research projects. The individual will work with peers in the Information Services and Technology team to develop solutions to meet the researchers’ needs. The individual will provide ongoing consulting and training sessions promoting the best way to use the infrastructure and related software. The right candidate will be self-motivated, keep up with trends and emerging technologies in the field, and provide forward thinking ideas to the research community as applicable. As a key member of the university’s advanced research computing team, this individual will collaborate with academic and technical colleagues to develop proposals and assist with securing external grants.
Director of Advanced Computing & Data Services - Clemson University, Clemson SC USA
Clemson Research Computing and Data (RCD) is seeking a nationally recognized leader in advanced computing research support as Director overseeing a team of research scientists (computational research support specialists, ACI-REFs) in support of Clemson University research. Reporting to the Executive Director for Research Computing Engagement, the Director serves as a leader, actively engaging in outreach across Clemson to identify potential new users of advanced computing resources (including HPC, HTC, Open Science Grid, ACCESS, and cloud services). The Director of Advanced Computing and Data Services leads a team of research facilitators, all of whom work closely with faculty, post-doctoral associates, students, and research staff to understand their research, allowing for the customization of training and support solutions. The position leads those who provide application and user support to all areas of the Clemson community.
Manager, High Performance & Scientific Computing - University of Tennessee, Knoxville TN USA
The successful candidate will have knowledge, skills, abilities, and experience to be an effective HPSC Manager. The HPSC Manager under the supervision and direction of the HPSC Director will: work closely with the HPSC full-time and student staff to deploy, operate, document, and manage the research cyberinfrastructure; deploy and manage hybrid cloud technologies and resources; manage and lead project teams to accomplish HPSC projects and projects crossing OIT and other organizational boundaries; consult with and support researchers to solve issues and make the best use of the services and computational, storage, and networking resources to fulfill the research mission of the University; work with stakeholders to identify and resolve technical barriers to research; and supervise, mentor, and work collaboratively with the HPSC full-time and student staff. Work with and support the HPSC staff and OIT Operations for: the safe operation of equipment in the data center; the racking, unracking of equipment; diagnosing and replacement of equipment parts and components; and working with vendors for support of equipment.
Director of High Performance Computing (HPC) - Drexel, Philadelphia PA USA
The University Research Computing Facility (URCF) is a centrally reporting core facility that provides access to high-performance computing hardware and software resources for research computing. The URCF occupies a 1,600 sq. ft., 0.6 MW, climate-controlled server room that hosts the NSF-Drexel funded Picotte shared HPC cluster, which consists of 4,224 compute cores and 48 Nvidia Tesla V100 GPUs along with a high-capacity storage system. The Director of High Performance Computing (HPC) is responsible for the design, installation, monitoring and maintenance of hardware, software and networking equipment for HPC systems in the URCF. The position reports to the Operations Director of Research Core Facilities (part of the Office of Research & Innovation). The Director of High-Performance Computing will also work closely with the Faculty Director of the URCF, who leads a Faculty Advisory Committee charged with helping the facility develop and meet its strategic, financial and operational goals. URCF financial administration is provided by the Office of Research & Innovation.
Research Software Manager, Advanced Research Computing - University of Birmingham, Birmingham UK
The postholder will be part of the Advanced Research Computing Team (ARC) in IT Services, a well-respected and highly performant team with a national and international profile. As the postholder develops in this role, they will be expected to collaboratively work with academic colleagues to secure external funding for future software development projects, to meet the growing demand for support in this area and ensuring sustainability of software created for research.
Manager, Neuroinformatics Operations, Krembil Centre for Neuroinformatics - Centre for Addiction and Mental Health, Toronto ON CA
The Krembil Centre for Neuroinformatics (KCNI) is hiring world-leading specialists to transform our understanding of mental health by organizing, integrating, analyzing, visualizing and modelling data across all levels of the brain —from genes to circuits to behaviour. The KCNI is currently seeking a Manager, Neuroinformatics Operations. Reporting to the Operations Director, Krembil Centre for Neuroinformatics, the Manager Neuroinformatics Operations provides oversight and people-management for the Research Informatics/Neuroinformatics Operations portfolio, including Research Informatics services encompassing front-line IT support for all CAMH research programs through research software, hardware, data management and analytics. The Manager Neuroinformatics Operations supports the mission of the Krembil Centre and core operations including staffing, financial reporting and program management. This position is responsible for the provision of information technology and bioinformatics services to CAMH’s research community, planning and management of computational research infrastructure, and a research IT support services team.
Inaugural Director of the Center for Biomedical Informatics - Rutgers University, New Brunswick NJ USA
Rutgers Biomedical and Health Sciences (RBHS) invites exceptional candidates to apply for the role of inaugural Director of a new Center for Biomedical Informatics that is being launched to promote excellence in the broad area of biomedical informatics. The Center will incorporate existing expertise in biomedical informatics, including bioinformatics, clinical informatics, clinical research informatics, public health informatics, and translational bioinformatics, and will facilitate and expand novel programs in this area to transform research, education, and patient care. It is envisioned to be a matrixed organization, where faculty will have their academic home in the Center but their primary appointments in appropriate departments.
Head of Technical Programmes - Genomics England, London UK
We are currently recruiting for an experienced Head of Technical Programmes to lead the planning and delivery of key initiatives across our on-premise and cloud environments. The candidate will play a key role in organising, aligning, and coordinating activities such as data centre refresh programmes, data migrations, and high performance compute projects. They will work closely with the broader Scalable Technology tribe to shape our infrastructure strategy and ensure we have a robust on premise and cloud storage environment.
Director, Research Computing - Simon Fraser University, Burnaby BC CA
The Director, Research Computing is responsible for providing strategic, operational, and administrative leadership to the delivery of researcher focused services to meet the diverse needs of the university community. The Director oversees a large portfolio including large research computing facilities including storage facilities, country-wide collaborations and services, high-performance network design, and operations. The director is responsible for defining and implementing strategies focused on delivering researcher-focused services, while leading a team dedicated to providing outstanding researcher support across SFU and in partnership with other IT and Academic units. As a key member of the IT Services (ITS) senior leadership team, the role participates and contributes to the development of the ITS strategic plan and leads continuous improvement initiatives within the Research Computing portfolio. The Director also reports, in a dotted line relationship, to the Associate Vice-President Research and International to assure close alignment with the University research priorities.
Research Computing Manager - Sunnybrook Health Sciences Center, Toronto ON CA
The Research Computing Manager will maintain core systems and software along with supporting the specialized needs of campus researchers and instructors via individual consultations and group presentations. This position will collaborate closely with the hospital IT workforce as well as SRI scientists, staff, students, and partners at other institutions to provide broad technology support for operational issues, research, and teaching. The position will be responsible for managing our deskside support team, system administration team and REDCap team.
Head of Biostatistics - Altis Labs, Remote CA
Our purpose is to transform clinical development by leveraging AI to enable faster, cheaper, and more successful clinical trials. As Head of Biostatistics (and our first biostatistics hire) at an early-stage company that is developing a novel method for treatment effect quantification, you will play a crucial role in helping us achieve our mission. You will be responsible for guiding both our internal evidence generation as it relates to prognostic model development, as well as guiding the applications of our prognostic models in our clients’ clinical trials. You will thus be working closely with leadership and across our teams.
Director, Research Knowledge Management Products - Digital for Research - Moderna, Cambridge MA USA
Moderna is seeking an experienced product management leader with a proven track record of success in high-value technology companies and a strong background in Knowledge Management (KM) to lead our team. As a product management leader in this role, you will be responsible for driving our KM capabilities forward and will work closely with our research scientists to help them access and utilize the information they need to advance their work. With this as your key focus area, you will lead a team of product managers/junior product managers to develop and execute product strategies that leverage the latest technology and approaches in the biotech industry. If this sounds like the role for you then come help Moderna continue to put the tech in Biotech!
Sr Technical Manager-ML/AI/Data Science - SRI International, Princeton NJ USA
The Vision and Learning group in CVT is searching for a Sr. Technical Manager who will lead cutting edge applied research projects in areas such as vision and language, adversarial robustness, explainable machine learning, zero-few shot learning, transfer learning, satellite image analytics, human behavior analytics and social multimedia analytics.
Director, Scientific Computing and Data Curation Division - US Environmental Protection Agency, Durham NC USA
The Center for Computational Toxicology and Exposure (CCTE) within the Office of Research and Development (ORD) is a scientific organization responsible for carrying out EPA's mission to protect human health and the environment by developing and applying cutting edge innovations in methods to rapidly evaluate the potential human health and environmental risk associated with exposures to chemicals and ensure the integrity of the freshwater environment. The Director of CCTE's Scientific Computing and Data Curation Division (SCDCD) will oversee the development of the information architecture necessary for integrating, transforming, and managing large scale data streams related to assessing the risk of chemicals and ecological integrity. The Director oversees the development and management of software applications and decision support workflows that visualize, analyze, and integrate complex data sources related to chemical safety and freshwater ecology.
Centre of Excellence Chief Operating Officer, Quantum Biotechnology - University of Queensland, Brisbane AU
The Australian Research Council (ARC) Centre of Excellence (COE) in Quantum Biotechnology (QUBIC) is a nationally funded research Centre within the School of Mathematics and Physics. QUBIC is seeking to appoint a CoE Chief Operating Officer (CoE COO), known locally as Deputy Director, Strategy and Operations as an integral member of the senior leadership team to lead the Centre’s national operations. The role will be responsible for high-level strategic, operational and performance planning to enable the Centre to achieve its ambitious research goals and objectives and to create broad impact through its training, outreach, equity and translation programs.
Data Platform Manager - St Vincent's Health Australia, Sydney or Melbourne AU
We have an amazing opportunity for a Data Platform Manager to join our Data Governance & Analytics team within Digital & Technology. As Data Platform Manager, you will be critically important in the strategic direction, scaling, design, development, and operation of Enterprise Data Platform that includes the Azure Data Environment and Power BI Analytics Platform. You will spend your days leading a team of data engineers and data analytics specialists, working to understand data use cases in the business; then, design, build, and deploy scalable data products for internal and external stakeholders.
Engineering Manager - Veeva, Toronto ON CA or Remote
Veeva is the leader in cloud-based software for the global life sciences industry. Committed to innovation, product excellence, and customer success, our customers range from the world’s largest pharmaceutical companies to emerging biotechs. We are looking for multiple Engineering Managers to lead and recruit a team of highly skilled engineers. You are comfortable working in a rapid, agile environment and thrive when challenged with solving complex problems. In this role, your focus will be creating amazing software solutions for our customers and making a positive impact on people’s daily lives.
Data and Analytics Manager - University of the West of Scotland, Paisley UK
The Data and Analytics Manager will lead a team of analytics and insight-focused staff, delivering business-critical analytics and reporting across all aspects of the University including, but not limited to, utilising internal and external data to forecast student demand, delivering persuasive analytics on student performance, informing research performance strategy and evidencing resource allocation models. Working closely with academic schools, the post holder will strive to understand customer data-needs, and work collaboratively with leaders of professional services across the University.
Oxford AI Research Group Project Manager - University of Oxford, Oxford UK
We are looking for an ambitious and entrepreneurial Project Manager to oversee the day-to-day running of the research group. The Project Manager will be responsible for the operations of the research group, will act as a force multiplier on the group’s impact, and will promote its global reputation. This post is an exciting opportunity for someone who is keen to support and collaboratively work with a fast paced and world-leading research group. You will be responsible for translating the vision of the group into an operational plan and executing the same. You will coordinate with multiple stakeholders across industry and academia to organise and manage the group's project portfolio which is composed of many projects (> £3m).
Principal Scientist (Discovery Bioscience) - Cancer Research UK, Cambridge UK
Biological exploration of novel potential oncology/immuno-oncology drug targets and biomarkers. Development and execution of cell-based assays to enable drug discovery projects. Expansion of the Bioscience group capabilities. Characterisation of small molecule and/or antibody for target validation
Manager, Clinical Data Scientist - Pfizer, Kirkland QC CA
As part of the Data Monitoring and Management group, an integral delivery unit within the Global Product Development (GPD) organization, the Clinical Data Scientist is responsible for timely and high quality data management deliverables supporting the Pfizer portfolio. The Clinical Data Scientist designs, develops, and maintains key data management deliverables used to collect, review, monitor, and ensure the integrity of clinical data, oversees application of standards, data review and query management, and is accountable for quality study data set release and consistency in asset/submission data.