Technology & Innovation
Day One Project

Interview with Erika DeBenedictis

02.02.22 | 6 min read | Text by Will Rieck

2022 Bioautomation Challenge: Investing in Automating Protein Engineering
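Thomas Kalil, Chief Innovation Officer of Schmidt Futures, interviews biomedical engineer Erika DeBenedictis

Schmidt Futures is supporting an initiative, the 2022 Bioautomation Challenge, to accelerate the adoption of automation by leading researchers in protein engineering. The Federation of American Scientists will act as the fiscal sponsor for this challenge.

The initiative was designed by Erika DeBenedictis, who will also serve as the program director. Erika holds a PhD in biological engineering from MIT and has also worked in biochemist David Baker's lab at the University of Washington in Seattle.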

Recently, I caught up with Erika to understand why she’s excited about the opportunity to automate protein engineering.

Why is it important to encourage widespread use of automation in life science research?

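Automation improves the reproducibility and scalability of life science. Today, it is difficult to transfer experiments between labs. This slows progress across the entire field, both within academia and from academia to industry. Automation allows new techniques to be shared frictionlessly, which accelerates their broader availability. It also allows us to make better use of our scientific workforce. Widespread automation in life science would shift time away from repetitive experiments and toward more creative, conceptual work, including designing experiments and carefully selecting the most important problems.

How did you become interested in the role that automation can play in the life sciences?

I started graduate school in biological engineering directly after working as a software engineer at Dropbox. I was shocked to learn that people use a drag-and-drop GUI to control laboratory automation rather than an actual programming language. It was clear to me that automation has the potential to massively accelerate life science research, and there’s a lot of low-hanging fruit.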

Why is this the right time to encourage the adoption of automation?

The industrial revolution was 200 years ago, and yet people are still using hand pipettes. It’s insane! The hardware for doing life science robotically is quite mature at this point, and there are quite a few groups (Ginkgo, Strateos, Emerald Cloud Lab, Arctoris) that have automated robotic setups. Two barriers to widespread automation remain: the development of robust protocols that are well adapted to robotic execution and overcoming cultural and institutional inertia.

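What role could automation play in generating the data we need for machine learning? What are the limitations of today's publicly available datasets?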

There are plenty of life science datasets available online, but unfortunately most of them are unusable for machine learning purposes. Datasets collected by individual labs are usually too small, and combining datasets between labs, or even amongst different experimentalists, is often a nightmare. Today, when two different people run the ‘same’ experiment they will often get subtly different results. That’s a problem we need to systematically fix before we can collect big datasets. Automating and standardizing measurements is one promising strategy to address this challenge.

Why protein engineering?

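The success of AlphaFold has highlighted for everyone the value of using machine learning to understand molecular biology. Methods for machine-learning-guided, closed-loop protein engineering are increasingly well developed, and automation makes it much easier for scientists to benefit from these techniques. Protein engineering also benefits from "robotic brute force." When you engineer any protein, it is always valuable to test more variants, which makes this discipline especially well suited to automation.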

If it’s such a good idea, why haven’t academics done it in the past?

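Cost and risk are the main barriers. Which methods are valuable to automate and run remotely? Will automation be as valuable as expected? It's a totally different research paradigm; what will it be like? Even assuming that an academic wanted to go ahead and spend $300,000 for a year of access to a cloud lab, it is difficult to find a funding source. Very few labs have enough discretionary funds to cover this cost, equipment grants are unlikely to pay for cloud lab access, and it is not clear whether the NIH or other traditional funders would look favorably on this sort of expense in the budget for an R01 or equivalent. Additionally, it's difficult to seek funding without data that demonstrates the utility of automation for a particular application. All together, there are just a lot of barriers to entry.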

You’re starting this new program called the 2022 Bioautomation Challenge. How does the program eliminate those barriers?

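This program is designed to allow academic labs to test out automation with little risk and at no cost. Groups are invited to submit proposals for methods they would like to automate. Selected proposals will receive three months of cloud lab development time, plus a generous reagent budget. Groups that successfully automate their method will also receive transition funding so that they can continue to use their cloud lab method while applying for grants with their brand-new preliminary data. This way, labs don't need to put in any money up front, and they can decide whether they like the workflow and results of automation before seeking long-term funding.

Historically, some investments in automation have been disappointing, like GM in the 1980s or Tesla in the 2010s. What can we learn from the experiences of other industries? Are there any risks?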

For sure. I would say even “life science in the 2010s” is an example of disappointing automation: academic labs started buying automation robots, but it didn’t end up being the right paradigm to see the benefits. I see the 2022 Bioautomation Challenge as an experiment itself: we’re going to empower labs across the country to test out many different use cases for cloud labs to see what works and what doesn’t.

Where will funding for cloud lab access come from in the future?

Currently there’s a question as to whether traditional funding sources like the NIH would look favorably on cloud lab access in a budget. One of the goals of this program is to demonstrate the benefits of cloud science, which I hope will encourage traditional funders to support this research paradigm. In addition, the natural place to house cloud lab access in the academic ecosystem is at the university level. I expect that many universities may create cloud lab access programs, or upgrade their existing core facilities into cloud labs. In fact, it’s already happening: Carnegie Mellon recently announced they’re opening a local robotic facility that runs Emerald Cloud Lab’s software.

What role will biofabs and core facilities play?

In 10 years, I think the terms “biofab,” “core facility,” and “cloud lab” will all be synonymous. Today the only important difference is how experiments are specified: many core facilities still take orders through bespoke Google forms, whereas Emerald Cloud Lab has figured out how to expose a single programming interface for all their instruments. We’re implementing this program at Emerald because it’s important that all the labs that participate can talk to one another and share protocols, rather than each developing methods that can only run in their local biofab. Eventually, I think we’ll see standardization, and all the facilities will be capable of running any protocol for which they have the necessary instruments.

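In addition to protein engineering, are there other areas in the life sciences that would benefit from cloud labs and large-scale, reliable data collection for machine learning?

I think there are many areas that would benefit. Areas that struggle with reproducibility, that are manually repetitive and time intensive, or that benefit from closely integrating computational analysis with data are all good targets for automation. Microscopy and mammalian tissue culture might be two other candidates. But even given the opportunity to collect the data, there is a lot of intellectual work for the community to do in articulating problems that can be solved with machine learning approaches.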