Lessons From Real-World AI Deployments
Matthew Oostveen, Chief Technology Officer, Asia Pacific & Japan, Pure Storage

Artificial Intelligence (AI) continues to capture the world’s attention, as it has been doing since the 1950s, when pioneers like Alan Turing and John McCarthy did their ground-breaking work. After several false starts, almost 70 years on, we’re finally starting to see the fruition of those early efforts.
Pure Storage, by comparison, has only just celebrated its 10th anniversary as a company. But in those ten years, it has created tremendous disruption in the storage space by thinking outside the box and bringing storage performance up to par with the other two legs of IT infrastructure - compute and networking. Big data and AI are intrinsically linked, so it was natural for Pure Storage to turn its attention to AI.
Partnering with NVIDIA, Pure Storage created AIRI, the industry’s first AI reference architecture. This has enabled the organisation to be involved in cutting-edge AI use cases such as autonomous cars with Zenuity, scientific research with Core Scientific, and cancer research with Paige.AI. From working with these companies, we’ve come away with some lessons that could be instructive for other companies looking to deploy AI.
1. AI is a Data Pipeline
The first thing you should know is that AI thrives on data. In practice, an AI deployment is a pipeline: data has to be ingested, cleaned, labelled, and transformed before training ever begins, and the pipeline as a whole moves only as fast as its slowest stage.
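As a minimal sketch of that idea, here is what such a pipeline looks like in Python. Every name and all of the toy logic below are illustrative assumptions, not part of AIRI or any Pure Storage product:

    # Illustrative AI data pipeline: each stage feeds the next, so
    # end-to-end throughput is capped by the slowest stage.
    def ingest(source):
        """Pull raw records from a source (logs, sensors, images)."""
        return list(source)

    def clean(records):
        """Drop malformed records and normalise the rest."""
        return [r.strip().lower() for r in records if r.strip()]

    def label(records):
        """Attach a training label; real systems use humans or weak supervision."""
        return [(r, "positive" if "good" in r else "negative") for r in records]

    def train(examples):
        """Stand-in for the actual training step."""
        print(f"training on {len(examples)} labelled examples")

    train(label(clean(ingest(["GOOD result ", "  ", "bad result"]))))

If any one of these stages starves the next, the expensive compute at the end sits idle - which is why it pays to treat AI as a pipeline rather than as a single training job.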
2. Don’t Throw Your Data into a Data Lake
Way back in 2014, a consultant from PricewaterhouseCoopers said: “We see customers creating big data graveyards, dumping everything into HDFS (Hadoop Distributed File System) and hoping to do something with it down the road. But then they just lose track of what’s there.”
The world of data analytics has since changed. When Google created the Google File System 15 years ago - the design that inspired Hadoop and HDFS - the assumptions about data were that typical file sizes were large; access was sequential; hardware failure was the norm; workloads were batched; and networks were slow. This led to data platforms built on distributed disks: many disks per node, 3x data replication, batched workflows, and a fixed ratio of compute to storage.
We live in a very different data environment today - one where containers are gaining credence; file sizes are smaller; access is random; workflows are real-time; apps and data have to evolve quickly; and infrastructure has to be elastic.
3. To Cloud or Not to Cloud
One of the first things you should consider before embarking on an AI project is whether to run it on-premises or in the cloud. The right answer depends on your needs. A cloud-based service lets you start immediately, without getting bogged down building your own infrastructure.
However, if cost is your concern, on-premises comes out ahead despite the higher initial sticker price. Pure compared the cost of purchasing your own infrastructure, such as AIRI, against renting equivalent capability from a cloud service and found that over three years, on-premises cost 60% less than cloud.
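To show the shape of that arithmetic, here is a back-of-the-envelope version in Python. The figures are hypothetical placeholders, not Pure’s actual pricing, and a real comparison would also account for utilisation, depreciation, and the specific cloud services used:

    # Hypothetical three-year cost comparison. None of these figures are
    # real prices; they only illustrate the shape of the calculation.
    onprem_purchase = 1_000_000        # one-off hardware purchase (USD)
    onprem_support_per_year = 100_000  # annual support, power and cooling
    cloud_rent_per_month = 90_000      # renting equivalent capability

    years = 3
    onprem_total = onprem_purchase + onprem_support_per_year * years
    cloud_total = cloud_rent_per_month * 12 * years

    saving = 1 - onprem_total / cloud_total
    print(f"on-premises: ${onprem_total:,}, cloud: ${cloud_total:,}")
    print(f"on-premises is {saving:.0%} cheaper over {years} years")

With these placeholder numbers the saving lands near the 60% figure above; the point is the structure of the comparison - a one-off purchase plus ongoing support, set against a recurring rental - rather than the exact values.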
Onwards and Upwards
The applications of AI are limited only by your imagination, and general AI will open up even further possibilities in the future as machines add to humanity’s creative force. It’s staggering to think how far we’ve come in a relatively short space of time. With the industry pulling in the same direction and solutions such as AIRI publicly available, real-world AI is truly here.