I had the privilege of presenting recently at the DataDirect Networks Life Sciences Field Day, hosted by Rockefeller University in New York. You can now watch the video and access the slides online, but I thought I would summarize the key takeaways and provide some additional references to scientific papers.

Key Takeaways:

High Content Screening (HCS) is a key area of drug discovery which is showing promise for the use of deep learning. Cellular phenotyping can benefit from multi-scale convolutional neural networks (M-CNNs), not only in identifying and narrowing the compounds of interest but also in predicting effective concentrations.

Machine Learning and AI projects require a diverse set of expertise and perspectives, driving interdisciplinary collaborations. It is important to maintain a holistic and balanced approach across scientific, technology, data science/analytics, and business domains to yield optimal results. For example, how much data you are working with depends upon your perspective: are you considering a single well vs. a 1536-well plate vs. the set of images produced for a given experiment vs. the effective run-rate for a production lab?

The wet lab portion of HCS requires extensive experimentation and tuning of compound and target sample preparation, fluorescent staining techniques, improvements in optics and data capture, etc. When viewed from the wet lab perspective, the downstream analytics, or the infrastructure required to process the data, can each become a rate-limiting step. If the deep learning effort becomes solely focused on performance, it may lose sight of model convergence and accuracy; conversely, it may become exquisitely overfitted to a single training dataset; whereas, what is required is a generalizable model which can readily support multiple experiments in production (with appropriate accuracy and performance).

Several recent discoveries have had a measurable impact on this project and deep learning generally — scale multiple workers per node and across a cluster; after the initial warmup period, scale the learning rate linearly with the global batch size; apply learning rate exponential decay, followed by aggressive decay. Scaling to 4 workers per node resulted in a 2x performance gain, while extensive learning rate tuning was essential to achieve model convergence and state-of-the-art accuracy with a global batch size of 256.

A number of teams have worked independently on these considerations; the references included below are not meant to be definitive; consider them as good breadcrumbs for you to pursue according to their relevance. Perhaps the obvious subtext here is that the field is both nascent and rapidly evolving, and it is wise for teams (and the larger role of our open source community) to continually scan the literature to keep up on the latest best practices.

Embrace heterogeneous computing both on-prem and in the cloud; collaboration across science and scientific computing is essential to maximize the use of available infrastructure. Utilize system architectures that incorporate a balance between the demands of compute, memory, I/O and storage. Avoid silo workloads run on purpose-built appliances which cannot readily scale to meet the demands of heterogeneous science; use the technologies and architectures which deliver the best value, yet build a system architecture which embraces and maximizes the effective use of these diverse technologies. Efficiencies become more apparent when viewed across the mix of production workloads, not as isolated single workload single node benchmarks.

Employ benchmarks and optimization in the stages where they deliver the greatest impact, during scaled production deployment, transitions between technologies or algorithms or shifting from R&D to production. Develop sandboxes and allow scientists the flexibility to work in their language of choice during early stage R&D, where the focus is on developing new algorithms with higher accuracy, less emphasis on performance until higher throughput becomes required. Use the latest optimized frameworks and libraries like Intel optimized Python to deliver near native performance. Deploy Singularity containers to deliver improved manageability, reproducibility, and scalability; design-in security from the ground up.

Dig Deeper:

View Perspicace Industry Highlights on High Content Screening
original scientific paper which forms the basis of this collaboration:
- Godinez et al, A multi-scale convolutional neural network for phenotyping high-content cellular images. Bioinformatics, 2017.
- Public Dataset: Broad Bioimage Benchmark Collection BBBC021; Human MCF7 Cells – compound profiling experiment
research in learning rate optimization:
- Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying & Quoc V. Le, “Don’t Decay the Learning Rate, Increase the Batch Size.” Published as a conference paper at ICLR 2018, arXiv:1711.00489 [cs.LG]. See also ICLR2018 review comments.
- Goyal et al., “Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour”, 2018.
- Valeriu Codreanu, Damian Podareanu, Vikram Saletore, “Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train.“ arXiv:1711.04291v2 [stat.ML] 15 Nov 2017.
- Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, Kilian Q. Weinberger, “SNAPSHOT ENSEMBLES: TRAIN 1, GET M FOR FREE.” arXiv:1704.00109v1 [cs.LG] 1 Apr 2017.
- Smith, Leslie N. “Cyclical Learning Rates for Training Neural Networks“, arXiv:1506.01186 [cs.CV], original 2015, updated April 2017.
scale multiple workers per node using simple run-time config changes:
- Boosting Deep Learning Training & Inference Performance on Intel® Xeon® and Intel® Xeon Phi™ Processors
use of Horovod to scale TensorFlow across multiple nodes:
- Using Intel® Xeon® for Multi-Node Scaling of TensorFlow with Horovod
summary of latest CNN Horovod best practices at the end of this (unrelated) paper on biomedical image segmentation
- Training Deep Convolutional Neural Networks with Horovod* on Intel® High Performance Computing Architecture
- Get the code: https://github.com/NervanaSystems/topologies
get the latest Intel optimizations — Intel® optimized TensorFlow Wheel, Intel® optimized Python Anaconda
learn about DataDirect Networks shared parallel storage solutions including DDN A³I: ACCELERATED, ANY-SCALE AI

Do these experiences resonate with the challenges and breakthroughs you have seen working with CNNs? Do you have other best practices to add here? Where are you currently applying machine and deep learning? Reach out to us here at info@perspicace.ai.

Many Thanks to DDN George Vacek, Yvonne Walker, Manousos Markoutsakis, and John Holtz for making the event possible! Check out the other presentations at LSFD18.

Deep Gratitude & Acknowledgements to the Novartis and Intel teams for the amazing collaboration over the years.

|pɛʀs.pi.kas|

DDN LSFD18: Perspicace presents "High Content Screening in Drug Discovery" /November 8, 2018 by Kristina Kermanshahche

Key Takeaways:

Dig Deeper:

Featured Posts