By Matthew Oostveen, Chief Technology Officer, Asia Pacific & Japan, Pure Storage
Artificial Intelligence (AI) continues to capture the world’s attention, and indeed it has been doing so since the 1950s with ground-breaking work from the likes of Alan Turing and John McCarthy. After several false starts, we’re finally starting to see the early efforts of those pioneers come to fruition almost 70 years on.
Pure Storage, by comparison, has only just celebrated its 10th anniversary as a company. But in those ten years, it has created tremendous disruption in the storage space by thinking outside the box and bringing storage performance up to par with the other two legs of IT infrastructure: compute and networking. Big data and AI are intrinsically linked to one another, so it was natural for Pure Storage to turn its attention to AI. Partnering with NVIDIA, Pure Storage created the first AI reference architecture in the industry, AIRI. This has enabled the organisation to be involved in cutting-edge AI use cases such as autonomous cars with Zenuity, scientific research with Core Scientific, and cancer research with Paige.AI.
As a result of working with these companies, we’ve come away with some lessons which could be instructive for other companies that are looking to deploy AI.
1. AI is a Data Pipeline
The first thing you should know is that AI thrives on data.
2. Don’t throw your Data into a Data Lake
Way back in 2014, a consultant from PricewaterhouseCoopers said: “We see customers creating big data graveyards, dumping everything into HDFS (Hadoop Distributed File System) and hoping to do something with it down the road. But then they just lose track of what’s there.”
The world of data analytics has since changed. When Google created the Google File System 15 years ago, which inspired the creation of Hadoop and HDFS, the assumptions about data were that typical file sizes were large, access was sequential, hardware failure was the norm, data was batched, and networks were slow. This led to data platforms built on distributed nodes packed with disks, with 3x data replication, batched workflows and a fixed ratio of compute to storage.
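To make the cost of 3x replication concrete, here is a minimal sketch of the capacity arithmetic. The function names and the 100 TB figure are illustrative assumptions, not numbers from any particular deployment.

```python
def raw_capacity_tb(usable_tb: float, replication_factor: int = 3) -> float:
    """Raw disk capacity needed when every block is stored
    `replication_factor` times (3 matches the replication described above)."""
    if usable_tb < 0 or replication_factor < 1:
        raise ValueError("usable_tb must be >= 0 and replication_factor >= 1")
    return usable_tb * replication_factor


def storage_efficiency(replication_factor: int = 3) -> float:
    """Fraction of raw capacity that holds unique data."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return 1 / replication_factor


# Hypothetical example: 100 TB of data to keep under 3x replication.
print(raw_capacity_tb(100))             # raw TB that must be provisioned
print(f"{storage_efficiency():.0%}")    # share of raw capacity holding unique data
```

Under these assumptions, two thirds of the provisioned disk holds copies rather than unique data, which is part of why the fixed compute-to-storage design becomes expensive as data grows.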
We live in a very different data environment today: containers are gaining credence, file sizes are smaller, access is random, workflows are real-time, apps and data have to evolve quickly, and infrastructure has to be elastic.
3. To Cloud or not to Cloud
One of the first things you should consider before you embark on your AI project is whether you will run it on-premises or in the cloud. It really depends on your needs. A cloud-based service enables you to start immediately without being bogged down by building your own infrastructure.
However, if you are concerned about costs, on-premises can be the better option despite the high initial sticker price. Pure compared the cost of purchasing your own infrastructure, such as AIRI, with renting equivalent capabilities on a cloud service and found that, over three years, on-premises cost 60% less than cloud.
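If you want to run this kind of comparison for your own workload, the arithmetic can be sketched as below. Every figure in the example is a hypothetical placeholder, not a number from Pure's comparison, and the model assumes flat monthly costs with no discounting.

```python
def cumulative_cost(upfront: int, monthly: int, months: int) -> int:
    """Total spend after `months`, assuming a flat monthly cost."""
    return upfront + monthly * months


def savings_percent(on_prem_total: float, cloud_total: float) -> float:
    """How much less on-premises costs than cloud, as a percentage of cloud."""
    return 100 * (cloud_total - on_prem_total) / cloud_total


def break_even_month(on_prem_upfront, on_prem_monthly,
                     cloud_upfront, cloud_monthly, horizon_months):
    """First month in which cumulative on-premises spend is no higher
    than cumulative cloud spend, or None if that never happens."""
    for month in range(1, horizon_months + 1):
        on_prem = cumulative_cost(on_prem_upfront, on_prem_monthly, month)
        cloud = cumulative_cost(cloud_upfront, cloud_monthly, month)
        if on_prem <= cloud:
            return month
    return None


# Hypothetical inputs: replace these with your own quotes.
MONTHS = 36
on_prem = cumulative_cost(upfront=300_000, monthly=5_000, months=MONTHS)
cloud = cumulative_cost(upfront=0, monthly=30_000, months=MONTHS)

print(f"On-premises over {MONTHS} months: {on_prem}")
print(f"Cloud over {MONTHS} months: {cloud}")
print(f"On-premises saving: {savings_percent(on_prem, cloud):.1f}%")
print(f"Break-even month: {break_even_month(300_000, 5_000, 0, 30_000, MONTHS)}")
```

The break-even month is often the more useful output: it shows how long a project has to run before the upfront purchase pays for itself.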
Onwards and Upwards
The applications of AI are limited only by your imagination, and general AI will open up even further possibilities in the future as machines add to humanity's creative force. It’s staggering to think how far we’ve come in a relatively short space of time. With the industry pulling in the same direction and solutions such as AIRI publicly available, real-world AI is truly here.