Biotechnology companies generate on-premises data from scientific instrumentation that researchers need to store easily, keep secure, and make available for data analytics. This panel discussion between PTP and AWS explores this hybrid cloud reality and how PTP’s deployment of AWS Storage Gateway addresses it.
Moderated by Gary Derheim, this team of Cloud and Life Sciences experts included Taze Miller of AWS along with Bill Amsbaugh, Aaron Jeskey and Eric Ransden of PTP. We discussed the data challenge faced by biotechnology companies, what AWS Storage Gateway is, considerations for implementation, lessons learned from PTP’s experience and the impact that this solution has had in several use cases.
More about the AWS Storage Gateway:
In many modern research and discovery (R&D) laboratories, scientists have an array of scientific instrumentation that produces data for a variety of experimental purposes. The types of instruments and output data depend primarily on the mission of the research programs. For example, small-molecule drug discovery may rely on large databases of chemical registry data and associated properties, while lab robotics systems run high-throughput screening assays that generate biochemical endpoint readouts, such as IC50 curves. As lead molecules progress through the pipeline, additional groups use cellular assays to interrogate the impact a molecule has on target cell lines. This stage requires additional platforms, such as rapid cell preparation, dispensing, and subsequent in vitro high-content imaging and analysis, to determine morphological changes, cell viability, and the effect of treatment across various concentrations of the target molecules. As molecules progress further, in vivo studies may be conducted to evaluate ADME (absorption, distribution, metabolism, and excretion) in select animal models, such as mouse, rat, dog, or other appropriate analogues. These studies produce a variety of data, including histology data, which can become very large. In parallel, DNA sequencing and bioinformatics efforts may also produce large datasets used for genetic validation of the intended target.
Scientific Data Flow
During the discovery process, many data types are produced, analyzed, and correlated. Data storage architectures vary dramatically across organizations and depend on an organization's resources, the skillset of its IT support staff, and its technology selection. In the past, IT support teams in early-stage companies struggled to secure the budget for enterprise-class storage to provide a centralized data repository. With the adoption of Cloud technology, the cost of storage has dropped dramatically in recent years; however, migrating data from on-premises to the Cloud remains a challenge. Depending on the physical network connection to the Cloud, large data sets may take too long to upload, which delays data analysis. Additionally, interruptions in network connectivity can jeopardize the experimental data itself if lab equipment attempts to stream data directly to the Cloud. On-premises data storage is therefore still required.
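To make the upload delay concrete, here is a rough back-of-the-envelope sketch in Python. The function name `transfer_hours` and the 80% link-efficiency default are illustrative assumptions, not figures from the panel discussion; real throughput depends on protocol overhead, latency, and competing traffic.

```python
def transfer_hours(dataset_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to upload `dataset_tb` terabytes over a link.

    `efficiency` is an assumed fraction of the nominal bandwidth that is
    actually usable (hypothetical default of 0.8).
    """
    if dataset_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("dataset_tb must be >= 0, link_mbps > 0, 0 < efficiency <= 1")
    total_bits = dataset_tb * 1e12 * 8            # decimal terabytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600
```

Under these assumptions, a 2 TB imaging run over a 100 Mbps uplink takes roughly 56 hours (`transfer_hours(2, 100)`), while the same run over a 1 Gbps link takes under 6 hours. That gap is why a local cache in front of the Cloud matters for smaller labs.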
PTP-AWS Storage Gateway
PTP has, in conjunction with AWS, engineered a solution to this issue. By combining AWS’s Storage Gateway technology with PTP’s storage technology platform, labs of any size can take advantage of a high-speed storage solution that caches 8-12 TB locally while seamlessly moving data to AWS. The local high-speed cache provides the performance needed for rapid data analysis of all types. Rules can be easily configured to move data to the AWS Cloud based on attributes such as file type, age, header information, and storage availability, among others.
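As an illustration of what attribute-based migration rules might look like, here is a minimal Python sketch. It is not PTP's or AWS's actual rule engine: `FileInfo`, `Rule`, and `select_for_migration` are hypothetical names, and the policy (match on file extension and age, then free additional space oldest-first when the cache runs low) is an assumption chosen to mirror the attributes listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FileInfo:
    name: str
    age_days: float
    size_gb: float


@dataclass(frozen=True)
class Rule:
    suffix: Optional[str] = None   # file type, e.g. ".tiff"; None matches any type
    min_age_days: float = 0.0      # only files at least this old match

    def matches(self, f: FileInfo) -> bool:
        if self.suffix is not None and not f.name.lower().endswith(self.suffix):
            return False
        return f.age_days >= self.min_age_days


def select_for_migration(files: List[FileInfo], rules: List[Rule],
                         free_gb: float, min_free_gb: float) -> List[str]:
    """Return names of files to move off the local cache.

    Files matching any rule are always selected. If that would still leave
    less than `min_free_gb` free, the oldest remaining files are added
    until the target is met (or no files remain).
    """
    selected = [f for f in files if any(r.matches(f) for r in rules)]
    reclaimed = sum(f.size_gb for f in selected)
    if free_gb + reclaimed < min_free_gb:
        remaining = sorted((f for f in files if f not in selected),
                           key=lambda f: f.age_days, reverse=True)
        for f in remaining:
            if free_gb + reclaimed >= min_free_gb:
                break
            selected.append(f)
            reclaimed += f.size_gb
    return [f.name for f in selected]
```

For example, a rule list such as `[Rule(suffix=".tiff", min_age_days=30), Rule(min_age_days=365)]` would select month-old microscopy images and anything older than a year, with the free-space check acting as a safety net when the cache fills faster than expected.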
Benefits
For scientists, the Gateway provides several advantages over traditional storage solutions. As with traditional storage, the Gateway can be mapped as a logical network drive on each instrument (e.g., O:\data\Leica). Each instrument can be configured to stream its data to that network drive. Because the Gateway has internal disk redundancy, this is safer than storing data on instrument PCs, which typically have single points of hardware failure. Since data stored on the Gateway is synchronized to the Cloud, it also provides inherent disaster recovery. This protects the instrument protocols that have been carefully developed for each assay by backing them up to the Cloud. Version control can be enabled using AWS technology once the data is stored in the Cloud. Additionally, analysis servers can be created in the Cloud to perform complex data analysis, leveraging HPC clusters or high-powered servers to shorten analysis times.
An additional benefit for IT support teams is that the Gateway takes the guesswork out of forecasting data storage needs each budget cycle, as AWS provides nearly unlimited storage capacity. The data migration rules can be tuned as needed to keep a reasonable buffer of local storage capacity available. Furthermore, since PTP’s Gateway technology includes a virtualization component, the Gateway can also host Windows servers, including Active Directory, for directory replication to the Cloud to sync with both Azure and AWS services.
Summary
For life science companies of any size or maturity level, the PTP AWS Storage Gateway is a cost-effective, easy-to-manage way to meet the storage needs of scientists while providing scalability, redundancy, and organizational agility. Finally, there is a turnkey solution that gives scientists a bridge from on-premises storage to the power of the Cloud.
Watch the full length panel discussion HERE!
More on PTP’s Life Sciences Practice HERE!