CarahCast: Podcasts on Technology in the Public Sector

Modern Artificial Intelligence Strategy: An In-Depth Conversation with Pure Storage

Episode Summary

Join thought leaders from NVIDIA, Pure Storage, and SFL Scientific as they discuss the foundations necessary to formulate an artificial intelligence strategy in government operations.

Episode Transcription

Speaker 1: On behalf of Pure Storage and Carahsoft, we would like to welcome you to today's podcast, focused around modern artificial intelligence strategy, where thought leaders of Pure Storage, NVIDIA and SFL Scientific discuss the foundations necessary to formulate an artificial intelligence strategy in government operations. 

Scott Goree: Welcome, everyone. I appreciate the time. I am thrilled to be here today. We're going to have an in-depth conversation around artificial intelligence in the market. For those of you that don't know me, I'm Scott Goree. I've led global distribution here at Pure Storage for the last two years. 

I'm joined by a few colleagues of mine. We're going to have a panel discussion today. So, first off, I have Tony PaikeDay, and he's a senior director of AI systems at NVIDIA. He's responsible for go-to-market for the DGX platform of AI supercomputers, and for helping organizations infuse their business with the power of AI through infrastructure solutions that enable insights from data. 

Next, I've got Nick Psaki. Nick is my colleague at Pure Storage. He's a senior technical resource for federal, and he provides deep technical knowledge of flash storage system architecture that enables business and technological transformation for government enterprises. Nick's a fellow veteran. He's a veteran of the United States Army, and that's where he gained his extensive experience designing, developing, deploying, and operating information systems, as well as in data analysis, sensor integration, and large-scale server virtualization.

And last but not least, we have Dr. Michael Segala. I'm pleased to have you today, Michael. He's a PhD physicist and CEO of SFL Scientific, a US-based data science consulting firm specializing in developing production-grade machine learning and AI solutions across industries. Michael leads SFL Scientific on its mission of delivering state-of-the-art algorithms that enhance research capabilities and drive operational excellence for its clients, and he focuses on ensuring sustainable growth and driving best-in-class AI expertise. 

So, thrilled to have you guys here. Welcome to the party. We've got quite a few attendees today, so I'm thrilled to host you all. I'd love to start with Nick. Nick, if you could give us the current state of artificial intelligence out there in the market, love to hear from you, sir.

Nick Psaki: All right. So, thanks, Scott. I'm glad to be here with you guys today. Very early on, and particularly in the development of the FlashBlade object storage platform, we partnered with NVIDIA to create a turnkey artificial intelligence infrastructure. 

The process of assessing and understanding what you need in order to do artificial intelligence training, development, and deployment at scale was a very, very new endeavor. A lot of people hit the easy button and went to cloud. Other people don't have the types of environments or data sets that lend themselves readily to using public cloud infrastructure. But because this was a very new set of tools for a new kind of compute, we partnered with NVIDIA to create a capability that could be deployed as an integrated hardware and software solution right away, so basically all you had to do was add data and your algorithms. 

Now, that's part of the challenge. The other pieces of this are exactly how you build data pipelines, understanding the questions that you're trying to answer, and developing the tool sets to answer those questions and leverage the richness of your data. 

But scaling from a laboratory environment or a pilot project on a GPU workstation up to really meaningful, large-scale data sets can turn into a very costly and time-consuming exercise in do-it-yourself engineering. You could spend months building racks of GPUs or troubleshooting legacy storage systems, because storage is very often the bottleneck in these environments: how fast can you shuffle data in and out of storage to feed thousands, even tens of thousands, of cores that are processing data? 

It's very much a supercomputing kind of problem set, where you have a thundering herd of parallelism and concurrency overwhelming the storage architecture. Then you've got to try and finesse open-source software into commodity hardware. 

Basically, if you're going to build it yourself, you're responsible for all of the integration, maintenance, and sustainment that goes into that. Plus, workloads change, projects change, and a lot of times we're seeing government customers now creating an artificial intelligence processing resource and then making that available as a service to other agencies within that service department, or to other service departments. 

The Joint Artificial Intelligence Center's an example of this, but also the Naval Surface Warfare Center at Crane has recently created an AI-as-a-service capability available to the Navy. So, there's a service catalog that supports that. 

So, when we created the artificial-intelligence-ready infrastructure, or AIRI, we set out to provide four fundamental benefits that every agency and every company we've talked to over the last three years really needs. 

And that's a single integrated platform for all AI workloads. Massively parallel scale-out processing power, with high-performance data service platforms to match, so you don't have an impedance mismatch or throughput bottlenecks, in other words. 

Then, using remote direct memory access for moving big data quickly: NVIDIA GPUDirect, nconnect, and of course RDMA over Converged Ethernet, or RoCE v2. 

Then, finally, a scalable infrastructure that really makes it easy to insert additional nodes for capability, without having to refactor the entire stack, do a tremendous amount of reprogramming, et cetera. 

So, it's got to be easy to scale, upgrade, and manage, and it really needs to do that automatically for you. Otherwise, you spend months of downtime again, trying to get your new or enhanced capability brought up.

So, what we've seen across the industry is an explosion in the utilization of AI for everything from contact centers to image analysis. Firefighting is an example, out West, with wildland fires. 

Genomics, at the National Institutes of Health, the Centers for Disease Control, and of course across the US pharmaceutical industry, as we've pursued an understanding of the coronavirus and developed the vaccines for it. 

And those have been projects that have happened inside the pharmaceutical industry, as well as through Folding@home, which created a specific set of algorithms for doing COVID virus protein folding. And Pure Storage was a participant in that, and continues to be a participant in that as well. 

So, what we see is, people are still trying to figure out, how do I get from a lab-scale project on a single workstation to a full production solution? And what we offer is the capability to not have to try and solve: what hardware pieces do I need to put together? What software do I need to make it all work together? 

We want to create the capability for customers to simply focus on answering the question, solving the problem, rather than trying to address, what do I need in order to be able to make this happen? 

So, I guess the way that I could say this is, you have been presented with a steak. You want to make steak. You don't want to make knives and cutting boards, and a stove, and fire, in order to make your steak. You want to have all your tools and all the wherewithal you need to make dinner ready. So, all you have to focus on is, how do I want to cook the steak? What is the result that I'm trying to get?

And that's precisely what the three of us on this screen have done: give customers an end-to-end, holistic platform solution for embarking on artificial intelligence and machine learning endeavors. And, as I've said in rehearsal a few times, thank you for attending my TED Talk. 

Scott Goree: That's beautiful, Nick. And for those of you that didn't see my video, since it wasn't in there on the intro, know that I was quite animated and spirited, and all of that was from the top of my head. So, guys, I welcome you today. I love the number of attendees we've got here. So, we've got a great audience. Love to go a little deeper, and hopefully we've recorded Nick's TED Talk, so we can clip that and publish it later. 

But I think, Michael, the first question I'd love to come to you with is: we see a lot of leading sources discussing a high proportion of AI pilots failing. What's the main driver for those failures?

Dr. Michael Segala: Perfect. So, Scott, this is a great question, because I think it gives us the opportunity to actually level-set, and really think about the reality that many of these AI pilots are still failing, right?

And it's a fact that happens both in the commercial space and in the public sector. The reasons for failure, which I'll talk about in a second, are by no means single-threaded. So, it's not, fix this one little gap here and we're automatically going to solve problems. It's a lot harder than that. 

In general, what we need to be cognizant of is, like in any new technology, there's a very high barrier of entry when we're thinking about setting up anything, right? This could be AI. It could be anything that we're thinking about that is a technology revolution that we want to invest in.

And whether you're looking at a pre-baked solution that a vendor is going to bring to you and you want to buy that, or you want to create something agnostic and novel yourself, both of these are going to present their own challenges, and both of them will have a high risk of failure if you're not thoughtful about that. 

So, as an organization, what we see as the main drivers of failure, there are usually four big components, right? There are lots of other ones, but these are really the core ones that we see.

First and foremost is a lack of time spent early on to acquire, annotate, and then functionally organize representative data for your project, data that will actually generalize to the net new conditions you need for your modeling. 

So, at this point, we've worked on probably close to 1,000 different unique AI projects. And the single most common issue that we run into with all of our clients is, do you actually have a statistically significant and generalized data set to solve your specific problem at hand, right?

And it's not just in this very small lab environment, but will this data that we've collected, will this generalize to net new conditions, such that we can utilize these in the field?

And I know we've made a lot of improvements. Transfer learning, synthetic data, right? All these things out there are helpful, but by no means are they a true substitute for when we're building a truly bespoke and novel solution towards a real AI problem. 

So, thinking holistically upfront about what that data's going to be, and how we're going to capture it, has to be fundamental. We see, especially in the public sector, all these big RFIs and RFPs come out, and they focus very heavily, which they should, on outcomes, and cost implications, and all these great things. 

But I don't think I've ever seen one that truly lays out a realistic plan for capturing the data, or for overcoming a lot of the data limitations, to actually set us up for success, right? So, fundamental problem. 

Two is really a lack of infrastructure, and then organizational knowledge required to maintain models and monitor them in the long term. So, again, out of your pilot, moving into now these new environments that we call MLOps or AIOps, or whatever Ops we want to call it, a pilot is not successful if we can just show it on our laptop that it works. 

A pilot's successful when we take that piece of code and we deploy it somewhere, and it's actually driving the business application. That's success. So, we often see this lack of planning around how we're actually going to utilize tools to integrate into these systems and existing workflows, or how we're going to deal with data drift, model drift, business outcomes, everything in that space, right? So, that has to be very, very thoughtful in the beginning. 
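To make that monitoring point a little more concrete, here is a minimal sketch of a data-drift check, assuming a numeric feature and using a two-sample Kolmogorov-Smirnov test from SciPy. The feature values, threshold, and alert action are all illustrative assumptions, not part of any platform discussed on this panel.

```python
# Minimal sketch of production data-drift monitoring (illustrative only).
# Feature values, threshold, and the alert action are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

DRIFT_P_VALUE = 0.01  # below this, flag the feature as drifted

def check_drift(training_sample: np.ndarray, live_sample: np.ndarray) -> bool:
    """Two-sample KS test: compare the distribution the model was trained on
    against what it is seeing in production."""
    stat, p_value = ks_2samp(training_sample, live_sample)
    return p_value < DRIFT_P_VALUE

# Example: a feature captured at training time vs. the last week of live traffic.
rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
live_feature = rng.normal(loc=0.4, scale=1.2, size=5_000)  # shifted: drift

if check_drift(train_feature, live_feature):
    print("Feature drift detected: schedule retraining / review labels.")
```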

Another area of failure, and it's an interesting point that people often overlook, is this odd inability to leverage frameworks and methods from other problem statements that might already solve your specific issue. 

So, for example, a silly example: let's imagine that we created a really cool computer vision solution for ISR. It turns out that that could probably be adapted to fraud detection at the IRS. But if you're at the IRS thinking about your credit model, you'll never think about computer vision in ISR, right?

So, having this holistic mind-share of saying, "Well, a lot of these problems can be borrowed, and the learnings and algorithms and the organizational lessons can be borrowed from category to category, to make you successful," and not treating things like black boxes, will overcome a lot of the obstacles, okay? 

Then, last, I think a significant problem that we see, and it's kind of created by all of us on stage, and we're all guilty of it, is this nebulous word cloud that's created around AI in general, right? 

So, multi-cloud, on-prem, open-source frameworks, proprietary algorithms, AIOps, MLOps, DevOps, all of these things make it extremely difficult for someone who wants to get started in this space to truly evaluate solutions, and then to get past the marketing message and really understand how the technology is going to help them. 

So, my big push for you guys, if you're really thinking about this, especially from a fundamental strategy perspective, is: get in the habit of working backwards from your functional specifications and your actual statistical objectives. What are my requirements? What are my speed, performance, accuracy, storage, and cost requirements? Then bake that into your pre-pilot initiatives and plan accordingly. That will mitigate many of your risk factors. So, let me stop there. That was long-winded, but I hope that helps. 
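One hedged illustration of that working-backwards habit: writing the pre-pilot acceptance criteria down as data before any modeling starts, so the pilot is judged against them rather than against a demo. Every field name and number below is a hypothetical example, not a requirement from any program mentioned here.

```python
# Minimal sketch: pin down pilot acceptance criteria before any modeling starts.
# All names and numbers below are hypothetical examples.
from dataclasses import dataclass

@dataclass(frozen=True)
class PilotRequirements:
    min_accuracy: float         # statistical objective, e.g. recall on held-out field data
    max_latency_ms: float       # speed/performance requirement at inference time
    max_storage_tb: float       # data footprint the infrastructure must hold
    max_annual_cost_usd: float  # budget ceiling including compute and labeling

def pilot_meets_requirements(req: PilotRequirements, measured: dict) -> bool:
    """Compare measured pilot results against the requirements written up front."""
    return (measured["accuracy"] >= req.min_accuracy
            and measured["latency_ms"] <= req.max_latency_ms
            and measured["storage_tb"] <= req.max_storage_tb
            and measured["annual_cost_usd"] <= req.max_annual_cost_usd)

req = PilotRequirements(min_accuracy=0.92, max_latency_ms=150,
                        max_storage_tb=40, max_annual_cost_usd=500_000)
print(pilot_meets_requirements(req, {"accuracy": 0.94, "latency_ms": 120,
                                     "storage_tb": 35, "annual_cost_usd": 420_000}))
```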

Nick Psaki: So, I was going to say, Mike, I wanted to comment on something that you said, which is, computationally, just cognitively with human beings, certain things are the same process regardless of how you're doing it. 

So, whether you're doing coherent change detection of objects seen from above, like a five-gigapixel image, or you're doing analysis of the evolution of cancer metastasis on a five-gigapixel slide, it's a computer vision problem. And both of them are working with the same data formats, from the computer's perspective. It's the subjectivity of that data that is different. So, like I say, there are no unique data problems, only unique data perspectives. 

Dr. Michael Segala: Right. 

Nick Psaki: What we want as an outcome varies from user to user, but how we're going to process that remains, actually, very similar across different cognitive functions. And it's weird to talk about a computer having a cognitive function, but that is sort of the intelligence in artificial intelligence. 

So, I really like, and as you pointed out, don't get boxed into the not-invented-here syndrome, and don't get boxed into seeing your problem from your perspective. Ask yourself, what is the problem that is being solved here, and who else has to solve that? And the great analogy being the IRS doing fraud detection, when you're trying to do, what is this object here on the surface of the Earth? Where was it before? Where's it going to, and what is it? All of these things are very much of a piece for a convolutional neural network. 
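Nick's point that very different missions share the same computer vision machinery can be shown with a small transfer-learning sketch: one publicly available pretrained backbone, with only the classification head swapped per domain. The class counts and domain labels are invented for illustration, and the torchvision weights API shown assumes a recent version of the library.

```python
# Minimal transfer-learning sketch: one pretrained backbone, two very different
# missions (overhead change detection vs. pathology slides). Illustrative only.
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes: int) -> nn.Module:
    """Reuse an ImageNet-pretrained ResNet-18 and replace only the final layer."""
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
    return backbone

# Same architecture, different heads: the data and labels differ,
# the computer vision machinery does not.
overhead_model = build_classifier(num_classes=5)   # e.g. hypothetical vehicle/structure classes
pathology_model = build_classifier(num_classes=2)  # e.g. benign vs. metastatic tile
```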

Scott Goree: All right, I love it, guys. I think that’s the right level of conversation. I'd love to get Tony in. I'd love to get you into the mix here. As we look to the role of IT, what is their role, and how do they play in helping the businesses embrace AI?

Tony PaikeDay: Yeah, that's a great question, because it leans into one of the points that Michael raised. I think it was his point number two, around infrastructure. 

Obviously, many organizations are doing some remarkable work in AI, in the total absence, in fact, of IT involvement. Why is that? I mean, more often than not, developers couldn't find the compute resources that they needed in-house. And the almost ubiquitous availability of cloud, in a convenient opex model, has made it easy for a lot of teams to stand up what we sometimes call shadow AI. Namely, infrastructure silos that operate outside of IT, with or without their knowledge. 

I think the sprawl of this shadow AI problem, especially in cloud, is continuing right under many IT teams' noses, in the absence of an infrastructure strategy that could be enabling the kind of opportunities with which we see many organizations and agencies being successful. 

So, for IT teams, I think we need to recognize that unique infrastructure is required for AI when you need to think about scaling an application in a production setting, or a production workflow. And AI workloads consume resources, and different kinds of resources, unlike traditional enterprise applications and application development. 

And the AI development workflow looks very different as well. It's highly experimental and iterative in nature, and the people building these models aren't software engineers, nor do they follow good DevOps practices or rigor. 

So, when getting started, I think it's important to realize that success lies in combining data science expertise with platform and infrastructure, with your IT, if you want to see more of that valuable AI intellectual innovation actually make it into production. 

The confluence of these things is what we call MLOps. It's kind of a buzzword, feeding into what Michael had mentioned, but I think it represents this closing of the ranks between the people doing data science innovation, and the DevOps rigor that's needed if you want to actually industrialize and develop this stuff at scale, and actually see more of your models deployed in production. 

I think IT has the role to create the platform and infrastructure that can foster all of that. We see lots of organizations doing what I'd call AI center of excellence. That sounds very large and grand and lofty, but it can be fairly simple in nature. 

But it's more about, I think, the functionality that it delivers. Namely, the idea of IT being able to consolidate people, process and platform. Centralizing talent, because obviously data scientists are hard to attract, difficult to retain, and they certainly don't come cheap. 

And many organizations have a talent acquisition problem. A lot of times, the talent that you need to build great AI models probably exists already within your organization, in the form of business analysts, and other people who know your data. They know your problems. They probably have some aspirational interest in becoming data scientists. 

So, when you create this kind of community of experts within your organization, on this center of excellence platform, you can actually see a lot more talent groomed from within, build out talent development pipelines, and actually solve for this critical problem that I think affects a lot of organizations. 

The second thing is platform. Obviously, there's expertise required to go from a great AI idea, to a prototype, to a deployed application, and when you centralize that expertise, you as an organization start to develop muscle memory around how to do this, and you gain efficiencies and best practices. 

Instead of multiple different silos attacking the same, or even different, problems but solving the same issues over and over again, teams could actually be benefiting from each other's knowledge. 

Then, the last piece is obviously infrastructure. The ability to save a lot of opex, and potentially capex, with centralized shared infrastructure that can be utilized much more effectively, speed the innovation cycle, and help you, if you will, shorten the time to ROI of an AI investment. 

So, I think IT is uniquely positioned to be able to effect that kind of change. A lot of times, especially in times like this, I think IT teams are looking to be agents of change and enablers of transformation, instead of being seen as cost centers. So, this is one of those great places they can really lean into, with a prescriptive infrastructure strategy. 

Scott Goree: Okay, perfect. I love that. I love that, the limiting of the risk of shadow AI, and then having IT play the role of moving to an AI center of excellence, or CoE. I love that. If I come back to you, Mike, and I say, what practices have you seen in the private sector that the public sector can adopt to accelerate that maturity, and that movement in AI?

Dr. Michael Segala: Yeah. So, I'll take a little bit of a different spin. To Tony's point, in the commercial space, I think the IT ecosystem in general has adopted those principles, and they're getting there a lot faster than what we see in the public sector space, which is great, right? 

So, they're doing that, and I think that's a driver for the public sector to adopt those technologies. But I want to focus on the other side of the house, where we really see commercial companies, from the onset, from more of a strategic perspective, adopting principles that the public sector can very much start to look at and follow in those footsteps, right?

So, in commercial companies, those that are undertaking this digital transformation, they're normally going through this multi-month or even year-long strategic planning initiative where they sit down, and they're thinking about a very clear roadmap towards their objectives. They're highlighting dozens of practical use cases that are tied to revenue. They're surveying emerging technologies to decide who they want to buy or partner with. 

They're allocating budget. They're doing all sorts of interesting things around identifying bottlenecks, and setting themselves up so that innovation can actually happen, right? 

A big part of this is really recognizing that you have to plan, and you have to plan accordingly across this extremely complicated ecosystem of IT, and budgets, and businesses, and revenue, and everything else that makes this process complicated. 

So, to do that, I think the two biggest takeaways from the private sector, the things they're doing well: first and foremost is very strong leadership that drives innovation. And that leadership sits, really, at the executive level. 

So, a lot of private sector companies are hiring chief data officers, which is one of the big new roles, right? Or they even call them chief AI officers, or something like that. It can be a head of innovation, it could be a head of R&D, or even at the smaller scale, it could be a data science manager. 

And regardless of what we're trying to accomplish here, this technology leader has the mandate to vet ideas, vet technology, vet software, and really start thinking holistically around, how am I going to drive the organization, strategically, from data, infrastructure, algorithms, and everything else, right?

Because AI is hard. Let's not fool ourselves. It's a very complicated space, so having a really dedicated technical leader to help move this ship in the correct direction, which has the buy-in from the other executives, is going to be key. 

Another big part, and I think Tony even mentioned it, right? In the private sector, we're building these very quick, agile R&D teams that have access to flexible environments, and they can quickly prototype and do AI use cases, and really, in this agile methodology, burn through a bunch of ideas very quickly and come up with a few that are going to work, and that are going to be sticky.

And their goals are really around enablement and acceleration of AI. If you can take these folks and give them access to the right infrastructure, that will deal with these computationally heavy workloads, and not backlog them in the minutiae, or not having tools and technology at their fingertips, you could really do a lot of things. 

So, building these small... and it can be very small, right... centers of excellence, where people have access to software and hardware, you're going to be set up for success, or get to success a lot quicker. 

Unfortunately... Well, fortunately for us, because we work in the private sector a lot... Organizations can hire talent very quickly, and they can pay large sums of money, or they can even acquire whole companies. 

I think somebody on this panel from NVIDIA just spent $40 billion on another company, right? They got money. They throw that money around all sorts of ways. And it's great, right? Because it really fuels the ecosystem. 

But in the public sector, it doesn't work anything like that, and there are all sorts of complications with funding, and things like that, and it's hard to retain and maintain that talent. So, we really have to have a mandate for innovation at that executive leadership level in the public sector. 

Following up on the next point here, I wanted to touch on it as well, and Nick mentioned it himself, right? The Naval Surface Warfare Center at Crane is a great example of this. They went all-in, in the sense that they spent some money, they got some AIRI pod architectures, to give their folks across the Navy, and obviously at Crane as well, access to quick ways to do innovation, right?

How can I quickly get my hands on some GPUs that I can spin up, test some use cases on, and then scale those use cases out to really mission-critical objectives? And now they're going to be reaping those rewards for the next several months and years. 

So, if you guys are thinking about how to get started, just naively follow in those footsteps, right? Say, what did they do? How are they approaching that? What steps did they take to make it successful? What schedules did they go on? How did they buy it, from a procurement perspective? And start following in those footsteps to set yourselves up for long-term success. So, that would be the easiest way to get started there. So, yeah. 

Scott Goree: Beautiful. Beautiful. All right. Tony, coming back to you, can I just build AI in the cloud, like everything else?

Tony PaikeDay: We often get this question, and we often find ourselves trying to rationalize one versus the other. But for AI, I think we have to recognize that both have a useful place in the AI development journey. 

I don't think it's necessarily an either/or thing. But I'd also say that, for many organizations who maybe consider themselves cloud-first or cloud-only, there is something to be said here... Cloud is not necessarily the hammer for every AI nail, so to speak. 

I mean, it's a great way to engage in productive experimentation, enabling your developers to get a fast start, with a low barrier to entry. It's also great at supporting temporal needs, as your development projects are starting to get under way. 

We have seen, with many organizations, that through ongoing iteration, their models start to get more and more complex, consuming more and more compute cycles, and in parallel, the data sets fueling that training grow exponentially larger. 

This is a point at which costs escalate, and I think a lot of organizations are finding that now, we hear this from a lot of customers. And it's a point at which they start to notice the impact of what we call data gravity. 

By that I mean, more time and money is being spent pushing data sets from where they're generated, to where the compute resides. And this development speed bump, if you will, and the associated escalation in opex, is sometimes an inflection point, or a tipping point, at which organizations start to realize that there could be a benefit to a fixed-cost infrastructure that supports rapid iteration, at the lowest cost per training run, if I can frame it up like a metric, almost. 

And ultimately, when developers can train models without restriction or fear of budget overrun, they'll build more creative, better applications, with the highest predictive accuracy possible, in the shortest time frame possible. 
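Tony's "lowest cost per training run" framing can be reduced to a back-of-the-envelope metric. The sketch below compares a rented per-run cost against a fixed infrastructure cost amortized over the runs you actually do; every rate and count is an illustrative assumption, not pricing from any provider or vendor on this panel.

```python
# Back-of-the-envelope "cost per training run" comparison. All figures are
# illustrative assumptions, not real cloud or vendor pricing.
def cloud_cost_per_run(gpu_hours_per_run: float, rate_per_gpu_hour: float,
                       egress_tb_per_run: float, egress_rate_per_tb: float) -> float:
    return gpu_hours_per_run * rate_per_gpu_hour + egress_tb_per_run * egress_rate_per_tb

def owned_cost_per_run(annualized_infra_cost: float, runs_per_year: int) -> float:
    """Fixed infrastructure amortized over however many runs you actually do."""
    return annualized_infra_cost / runs_per_year

cloud = cloud_cost_per_run(gpu_hours_per_run=2_000, rate_per_gpu_hour=3.0,
                           egress_tb_per_run=5, egress_rate_per_tb=90.0)
owned = owned_cost_per_run(annualized_infra_cost=600_000, runs_per_year=400)
print(f"cloud ≈ ${cloud:,.0f} per run, owned ≈ ${owned:,.0f} per run")
# The crossover point is where iterating "without fear of budget overrun" starts
# to favor dedicated infrastructure: own the base, rent the spike.
```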

We see this as a very typical justification for why a lot of organizations are embracing a hybrid mentality, like they do with so many other things. Namely, coupling dedicated infrastructure, whether that's on-prem or in a co-location facility, I don't know that it really matters, but having dedicated infrastructure that supports the nominal volume of AI workload that you need to support. 

And then bursting to cloud, for instance, for the temporal stuff. My bumper sticker version of that is, own the base, rent the spike, so to speak. And I think that's probably a useful way for folks to look at maybe considering hybrid as a good way to go. 

The other thing, if I'm throwing out bumper stickers, and I've been using this for a time, until people tell me to stop saying it, is, train where your data lands. Ultimately, I think it's helpful to think about where your data lives and lands, and the proximity of your compute instances to that. 

For some, both of those things are probably co-resident in cloud. For others, their data lake or data infrastructure is sitting either inside their data center, or in a co-location facility. And they will, at some point, I think, notice the impact of data gravity, and they will have to think about that tipping point, where they start to now repatriate the workloads, because they're at the point in that development cycle where it makes sense to do training at scale on that fixed infrastructure. 

Scott Goree: Perfect. I love the slogan, own the base, rent the spike. So, you heard it here first. I love that. I need to get that bumper sticker, and put it on the car here. So, Tony, I'm going to stay with you for a minute. So, now, you talked about that fast start in the cloud, and then the fixed cost, and the hybrid notion. How can I integrate those AI workloads with my existing IT infrastructure?

Tony PaikeDay:  Yeah. I would say that over the last four to five years, we've figured some things out about AI infrastructure, things that we've learned at NVIDIA, deploying our own infrastructure. We run a very large supercomputing facility called DGX SaturnV. There's a few thousand plus systems in there. We use it to power everything we do. R&D, research into graphics, video games, autonomous systems, robotics. I mean, anything we do runs on that infrastructure. 

So, it's taught us some things about how to build infrastructure for an enterprise. I mean, obviously, we're on maybe one end of the spectrum, compared to where a lot of the mainstream is. But a lot of the design disciplines that are related to infrastructure are very much the same. 

We've also seen this with customers around the globe, who've been deploying AI at scale. And some of these folks have a very large budget, and it's one thing if you have vast amounts of capital to spend on supercomputing infrastructure, but most organizations don't. And they're simply looking for the fastest way to access resources to fuel AI development. They have no interest in building a world-leading supercomputer, or anything like that. They're not trying to do a TOP500 run, with a benchmark. 

The thing is, this kind of infrastructure is hard. And it's unlike what IT normally builds and manages. It's high-density, high-performance computing. It's high-performance storage and networking. It's cluster-aware, hardware and software, that is designed to solve complex algorithms, and parallelize them at scale. 

So, striking the right balance of compute, networking and storage is not easy if you've never done it before. And I think putting it all together can take months, or longer, for some organizations. 

When in reality, your data science team probably needs the platform yesterday. And all they really want to do is run experiments, build great models, and deploy to production. They don't want to be systems integrators. They don't want to wrestle with open-source software. They don't want to troubleshoot a hardware stack and get bounced around between different vendors who are finger-pointing over why a model or a framework is running 20% slower this week than it was last week. 

Supporting this stuff can be challenging for IT people in a production environment, especially if this is unfamiliar workload for them. So, this space creates a lot of instability for organizations that have a very simple mission around building and deploying a great AI application. 

So, this is why we partnered with Pure Storage to create AIRI. And you've heard it referenced already: AIRI stands for AI-ready infrastructure. It's something that really follows the design disciplines and best practices that we've gained over the years, that I referenced, and figured out in partnership with Pure. And it delivers a turnkey solution that deploys very quickly, fully integrated by our partners, like Carahsoft, who guide this from plan, to deploy, to ongoing support. 

The AIRI solution is, in fact, an optimized platform, purpose-built for the unique demands of AI. So, you don't have to go figure that stuff out. I mean, who needs to spend time on that, right? We're not all hyperscalers, and we don't all have decades of HPC experience. 

So, this actually provides a valuable, faster way to speed the ROI of your AI investment. And, in fact, you heard me say artificial intelligence center of excellence. AIRI, for many organizations, has actually proven to be their AI center of excellence. 

Scott Goree: I love the AIRI call-out. I think, Nick, you were going to jump in here a minute?

Nick Psaki: I was, and I wanted to go back and address question number three that you asked Michael, as well, because Michael really specializes... Tony and I, we make the things that make the activity possible. Michael's specialty is in, now that you've got this machine, and this data, how do you ask it questions? How do you solve your problems?

But what practices from the private sector can the public sector adopt to help accelerate AI maturity? Most organizations will not build something from scratch if they can buy something that already works. So, don't build it if you don't have to. Go buy something that already does what you need it to do. 

The second thing is, again, back to the software frameworks: you don't have to write your own PyTorch, or your own Caffe2, or your own Kafka, or anything else. Those tool chains already exist. They're well-documented, well-described, and massively supported across the industry, by Facebook, Amazon, and innumerable others, not to mention NVIDIA. 

Then, learn a little bit about containerization, which is really the framework for machine learning operations and AI operations. It's very fundamentally different in some respects from virtualization. In other ways, it's very much the same. 

But containerization and Kubernetes are very much at the heart of effective DevSecOps and MLOps operations. It isn't hard, and frankly the platforms we're talking about are not just integrated hardware but, of course, integrated software as well. 

So, those frameworks are in there, and what this enables is really modular and parallel development of algorithms. Data scientists tend to work on multiple algorithms, or multiple iterations of an algorithm, simultaneously. That way, you can find out which ones are training most effectively. Which ones are operating most effectively. And you can kill ones that are non-performing, and then continue to augment the ones that are working. That's basic DevSecOps. 
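As a small, hedged illustration of that containerized, parallel style of experimentation, here is a sketch that launches several training variants as Kubernetes Jobs with the official Python client, each requesting one GPU. The image name, namespace, and hyperparameter values are placeholders, and this is not a depiction of the AIRI or DGX software stack itself.

```python
# Sketch: launch parallel containerized training experiments as Kubernetes Jobs.
# Image name, namespace, and hyperparameters are placeholders (illustrative only).
from kubernetes import client, config

config.load_kube_config()  # assumes access to an existing cluster
batch = client.BatchV1Api()

for run_id, learning_rate in enumerate([1e-3, 1e-4, 1e-5]):
    container = client.V1Container(
        name=f"train-{run_id}",
        image="registry.example.com/team/trainer:latest",  # placeholder image
        args=["--lr", str(learning_rate)],
        resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
    )
    job = client.V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=client.V1ObjectMeta(name=f"experiment-{run_id}"),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(containers=[container], restart_policy="Never")
            ),
            backoff_limit=0,  # a non-performing run just stops; kill it and move on
        ),
    )
    batch.create_namespaced_job(namespace="ml-experiments", body=job)
```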

So, the solutions are already out there that can get you along the road, and this, to Tony's point and to Mike's point, puts the data scientists to work much more quickly, and much more effectively. 

That's really where the value is in the organization. It's not in the hardware or the software. It's in the minds of very talented human beings, being the most productive they can be, doing the most experimentation they can do, and ultimately speeding the time to the development of effective AI algorithms, and their deployment into the field. 

That's another thing that... AIRI is a sublimely powerful architecture, perfectly made for doing AI training and development. But once your AIs are built, they're going to go somewhere, and probably into something being driven by an NVIDIA Tegra processor, for that matter. Like a self-driving car, or an air defense artillery system, or a fraud detection mechanism, or what have you. 

So, the continuous iteration and evolution of those AIs can happen in this incredibly powerful supercomputing infrastructure, while their products go out into the world and actually do the jobs that you want them to do. 

But there is really a valid argument about whether you rent this capability in a public cloud infrastructure, or you create this capability yourself. As Crane did, and as other AI centers of excellence are doing, they are actually the cloud. They're an as-a-service capability, AI-as-a-service, that other agencies can use and leverage. 

So, we ought not get locked in. Again, it's a mental mindset thing that Michael pointed out. Don't get locked into the idea that the only cloud service providers on the planet are Amazon, Microsoft, Google, or anybody else. 

The DoD particularly has the largest IT infrastructure on planet Earth. It's been growing it organically for the last 60 or 70 years. So, there's already a tremendous amount of network connectivity, facilities, and so forth, for creating this kind of capability, and then making it available across the Department of Defense and other government agencies. 

So, this is an opportunity for the government to literally be able to control some of its own destiny. There's a final part to this that I think we often overlook. From the programmatic perspective of resourcing and funding it, it's better the costs you know than the costs you don't know, and a lot of people are very frequently surprised to learn what the costing model is for an external service provider. And it could be everything from just the upfront cost of the compute, networking and storage, to things as esoteric as data access charges, data movement charges, and API call charges. 

At the rate at which AIs tend to access data, and the scale at which they do it, and the number of API calls that get used in exploration and training, you can find yourself being overrun by 80,000 processes that are running anywhere between six and 24 hours a day, for 30 days a month. It can turn into a significantly higher cost than you might have programmed. 
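To show how those "esoteric" line items can add up, here is a rough, purely hypothetical estimate in the spirit of Nick's example. Every rate and count below is an assumption made for illustration, not a quote from any cloud provider.

```python
# Rough, purely hypothetical cloud bill for a sustained training/exploration month.
# Every rate and count is an illustrative assumption, not real provider pricing.
processes            = 80_000   # concurrent workers, per the example above
avg_hours_per_day    = 12       # somewhere between 6 and 24
days                 = 30
requests_per_hour    = 10_000   # assumed object/API reads per process per hour
cost_per_1k_requests = 0.0004   # assumed data-access charge
egress_tb            = 200      # assumed data pulled back out for analysis
cost_per_egress_tb   = 90.0     # assumed data-movement charge

process_hours = processes * avg_hours_per_day * days
request_cost  = process_hours * requests_per_hour / 1_000 * cost_per_1k_requests
egress_cost   = egress_tb * cost_per_egress_tb

print(f"{process_hours:,d} process-hours")
print(f"API/data-access charges ≈ ${request_cost:,.0f}")
print(f"data-movement charges   ≈ ${egress_cost:,.0f}")
# Compute charges come on top of this; the point is that the "esoteric" items
# alone can exceed what was originally programmed.
```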

So, sometimes buying infrastructure, or cost-sharing infrastructure, can make an awful lot of sense, because you've got a fixed cost, and fixed sustainment cost, that you're aware of. 

So, there's a predictability element to this that goes along with the capability element of having this incredibly powerful infrastructure literally ready to purchase, ready-made and built, from Carahsoft. 

So, there's a delivery mechanism. There's a deployment mechanism. There's a sustainment mechanism. All of which are well-defined and described, which actually has a tremendous benefit to government agencies. 

Scott Goree: Thanks, Nick. I love it. I think, Michael, we're going to come back to you. So, Mike, it always comes down to return on investment. As a sales leader, I'm looking at ROI, and these systems have a large upfront investment, sometimes without a clear ROI in sight. How do we ensure organizational success here?

Dr. Michael Segala: Yeah, I think this is a perfect follow-up to what Nick was just describing. It really boils down to, how do we manage costs, and how do we appropriately measure success? That's all we really want to know, at the end of the day. 

I think we can categorize all problems into two very big categories, just for the point of this conversation, right? The first I'm going to just deem as the smaller to medium size automation efforts, where we can usually map pretty easily processes and technical development time to a clear outcome. 

A simple example might be, how do I use AI to reduce manual labor for defect detection in manufacturing, okay? There is a very clear cost savings component from reduction of labor, met with some efficiency by being able to potentially, I don't know, find defects more accurately, or more often, or quickly. 

And you can usually very accurately get an ROI calculation from there, right? I'm just going to deem that automation, small/medium size problem spaces. That's going to be an easy way to tackle it, from that side. 
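Michael's defect-detection example maps to a very short ROI calculation. All figures below are made up purely to show the shape of the arithmetic, not to represent any real project.

```python
# Hypothetical ROI arithmetic for an automation-style AI project
# (defect detection in manufacturing). All figures are invented for illustration.
manual_inspectors_replaced = 4
loaded_cost_per_inspector  = 90_000   # annual, fully loaded
escaped_defect_cost_saved  = 150_000  # fewer misses reaching customers, per year
project_cost_year_one      = 350_000  # data collection, modeling, deployment
annual_run_cost            = 60_000   # infrastructure, monitoring, retraining

annual_benefit = manual_inspectors_replaced * loaded_cost_per_inspector + escaped_defect_cost_saved
year_one_roi   = (annual_benefit - project_cost_year_one - annual_run_cost) / (project_cost_year_one + annual_run_cost)
payback_years  = project_cost_year_one / (annual_benefit - annual_run_cost)

print(f"annual benefit ≈ ${annual_benefit:,.0f}")
print(f"year-one ROI ≈ {year_one_roi:.0%}, payback ≈ {payback_years:.1f} years")
```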

What I think we're really going after, especially in the public sector, especially within DoD, is these projects that require a very large upfront investment, and they're these moonshot projects, or they're DARPA-like initiatives, or you're building out completely net new systems or platforms that have a significant amount of risk and potential drift, and scope creep, and everything else that goes into them. 

So, for these types of projects, it's going to be very, very hard, because there is no clear, "This is going to automate away four workers, and those workers are going to go over here, and I'm going to get..." It's not that linear. 

So, I think we need to think about this in a cascading effect to measure ROI, and then ultimately success in de-risking the overall spend here. So, the way that we want to attack it, and I advocate for, is, regardless of the complexity of the problem, let's start by determining literally the smallest AI test case, to prove out actual technical capability. 

So, for example, if I'm building something to detect objects in a video, you can start with the fundamentals of, can I accomplish this segmentation and detection task, at a given accuracy that I would want? But I want to do this literally in the smallest, most ideal controlled environment. 

So, I'm a former physicist. We would call this, in a black box, in a vacuum. Meaning, no real nature or world really exists, but can I prove my physics in this box? Same thing on the AI side. Can I prove the simplest AI problem in this box?

And if you can do this, great. If you can't, then you're stuck anyway, right? But you can do a lot of work there from a true technical feasibility standpoint, before you have to make any outside financial commitment beyond, really, the algorithm side, right? So, that's the first layer.
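That "smallest test case in a box" idea boils down to checking one agreed metric on a tiny, fully controlled set before anything else is spent. Below is a minimal sketch using intersection-over-union for the detection example Michael gives; the boxes and target threshold are invented for illustration.

```python
# Minimal "in a vacuum" feasibility check for a detection task: can the model
# hit a target IoU on a tiny, fully controlled test set? Values are illustrative.
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

TARGET_IOU = 0.5  # the "given accuracy that I would want", fixed before testing

# Hypothetical ground-truth vs. predicted boxes from the controlled environment.
ground_truth = [(10, 10, 50, 50), (60, 60, 100, 100)]
predictions  = [(12, 8, 48, 52), (65, 55, 105, 95)]

scores = [iou(gt, pred) for gt, pred in zip(ground_truth, predictions)]
feasible = all(score >= TARGET_IOU for score in scores)
print(scores, "feasible in the box:", feasible)
```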

Second layer: let's assume that it's going to work. Let's just make that assumption. Yes, we can do that, and it worked in a controlled environment. The question really becomes, back to my first point, do we have access to the data and tools in the real world that the algorithms will actually see?

So, if that's a yes, how do we actually measure the cost there? What would that be? How are we going to deal with that from an infrastructure perspective? What does that data look like? What did it cost to acquire? What are the privacy restrictions? What are the algorithm's restrictions, the SLAs, and everything else around the information needed to literally scale the algorithm up into a real production workflow?

If you can map all of that out, then it's very easy to start, again, incrementally understanding cost. So, finally, the last bit here from a real ROI and success factor is saying, let's assume all technical feasibility is proven, right? We've done it in the controlled environment. We hypothesized about the non-controlled environment, and we're now going to invest millions of dollars into this R&D effort. 

The question still becomes, and most people still aren't thinking about this: you have an end user. That end user could be a histopathologist looking at cancer cells. It could be a jet pilot sitting in an F-35. It could be a mechanic sitting on a submarine, it doesn't matter. 

You have an end user who's going to consume this technology, right? How are they going to use this? What are they going to do? How is that going to augment their workload? A big part that often gets overlooked, and rarely makes it into any financial conversation, is that AI systems will falter. They are not perfect. They are not 100% accurate on real-world data, and they're never going to be. 

So, mapping that change management process into the thinking from the beginning, of how you will represent the outcomes of these algorithms, and how that ties back into any processes, will also help you think about the cascading effects of ROI and success. 

So, if you really want to sit down and do this, you need to map out this entire process, and then from there you can make a very methodically targeted statement on ROI, and then make this path towards execution of this entire process, that's going to be de-risked, because now you've taken methodical steps of mapping that, and also understanding the ROI, the cost, and the burdens of actual implementation. 

And that's how we see sophisticated organizations with real R&D problems do it successfully, and that's obviously how I would advocate everybody else do it as well. 

Scott Goree: Perfect. Perfect. Mike, I think you said earlier in the webinar here that verified use cases, and sharing those across other implementations, are a way to succeed. We've got a question in the Q&A from Kent here, around what tools are out there and available for that knowledge base, to allow us to weave the experience from other use cases into functional models that we're working on with a different company or organization. 

Dr. Michael Segala: Yeah, I mean, nowadays, thankfully, and pioneered by, for instance, NVIDIA and Pure and a lot of other folks, there's this open-source culture where community learnings around code, and algorithms, and everything else are shared in public, right? 

You're revered for putting out a wonderful article about GANs, or sharing the latest architectures, and writing up really scientifically rigorous experiments around models and methodology. 

All of that is now public, right? ResearchGate, GitHub, arXiv, there are countless papers, and now all of these papers usually come with code that's linked on GitHub. So, there is a plethora of information out there about scientifically-based research topics that are applied to real practical applications, right, whatever it may be. 

What I'm not advocating for is to naively download those and expect to take that bit of code, run it on your problem, and assume it's going to work in the real world. That's not true. It's not going to work, and that's a naïve approach. 

But those are great starters to understand our first bit of technical feasibility, developing a baseline, incrementally improving, right? Using that as a starting point to then really advance it towards your specific problem statement, is how you have to really approach these problems. 

But really, this open-source nature is the way it's been for the last 10 or 15 years, and it's really accelerated the community greatly. So, there's tons of stuff out there.

Scott Goree: Beautiful. No, that's helpful. I think that aligns closely with what Paul just asked here in the Q&A as well, around training and enablement that's out there and available, or our masterclass. I know you can see the Q&A. I think my first advice would be to call Carahsoft, call your rep, and we can get you connected with training and enablement, and coursework. Anything else? Any other place they should look?

Dr. Michael Segala: Yeah, so actually, it's interesting. So, we at SFL Scientific, so we're a data science consulting company, and we've actually paired our services directly with you guys, NVIDIA, through Carahsoft, to be educators, right?

To actually come out, and we even did it with Crane as part of this initiative, and with countless other folks with the Army, Navy, Air Force, you name it, and work with you on true education and knowledge transfer. We're very specific about pairing the SMEs in our group with the SMEs in your group, to really sit down and fundamentally talk about the specific problems you're having, and how to attack them. 

It's almost like an enablement, getting started, right? And obviously if you want us to stay around, we'll help you all day solve problems, but this fundamental coursework designed around your specific problems is a great way to kickstart a lot of these initiatives. 

So, if that's something anybody's interested in, we already have pre-baked ways to do this, and it's already sitting on buyable SKUs that you can literally just purchase off of. So, selfishly, I would say we already can do that, and you can buy it, but honestly there are lots of great resources outside of us. But with us, you'd get all the horsepower of SFL. 

Scott Goree: Beautiful. I love the plug there, and we'll make sure the contact details go out there for you, as well as NVIDIA's Deep Learning Institute. Then, again, when in doubt, call your Carahsoft rep. We'll guide you through the process. 

There's one last one from the Q&A, and I know we're approaching time. I think it's more of a philosophical AI question, so I'd love to throw it out there, and Michael, thank you for asking it. This warp to AI, possibly skipping ordinary intelligence, is that a rational move? So, I wanted to just throw that out there. Is that something you guys want to jump on?

Nick Psaki: So, this is Nick. I'll take that one. So, it's a really good question. I mean, many of us have grown up on The Terminator, The Matrix, and other examples of artificial intelligence run amok. 

And a general-purpose superintelligence, well, frankly terrifies the heck out of us, because as human beings, we'd rightly be afraid of anything that's bigger, stronger and faster than we are, particularly if we made it. 

The truth about artificial intelligence, particularly now, where we sit today, is that it is very much what we call special-purpose AI. We have all we can do to train AI algorithms to do very finite problem sets, and to get them to be good at doing them... Even giving them enough digital runway to develop net new capabilities on their own is beyond the realm of the possible today. 

But certainly, this is an area of policy, and not really technology. And when I say policy, I mean human beings as a culture and a civilization are going to have to decide what we want the limits of our technology to be. 

We've always created technology to advance our own capability, enhance and augment our own limitations, for strength, or for reach, or for speed, or what have you. So, we're going to have to wrestle with exactly what it is we're going to want AI to be able to do, and where we want to retain our own agency over our technology going forward. 

It's a big question, and probably outside the scope of this particular classroom. But there is actually a series of subject tracks on AI ethics and ethical programming that are occupying a tremendous amount of thought space and conversation space in Washington, in capitals around the world, and in academia. 

I mean, it's a good question, and the answer to your question, through my own personal perspective, is I would really rather see carbon-based lifeforms retain agency over carbon-based lifeforms, and let silicon have dominion over silicon. When you start letting silicon have agency over carbon-based lifeforms, we could have challenges that we don't really want to contemplate. 

Scott Goree: Beautiful, beautiful. I think that's a great way to wrap us here. Mike, Tony, Nick, thank you so much for joining us today. For the marketing team and the Carahsoft team that put this together, let's make sure we get Mike and Tony some orange to wear for the next event. But gentlemen, really appreciate the time. This was extremely valuable for me. I'm sure our attendees feel the same, so thank you so much. 

Nick Psaki: I'm just going to make sure we all have black to wear. 

Dr. Michael Segala: Thank you guys for having us. 

Tony PaikeDay: Yeah. Thanks for having us. Appreciate it. 

Scott Goree: Sure, good. I will hand it back to the Carahsoft team, and again, call Carahsoft with any further questions. I’m sure they’re happy to help. 

Speaker 1: Thanks for listening. If you would like more information on how Pure Storage or Carahsoft can assist your organization, please visit www.carahsoft.com or email us at purestorage@carahsoft.com. Thanks again for listening, and have a great day.