CarahCast: Podcasts on Technology in the Public Sector

Discover-Understand-Decide: How DOD is Driving Decisions From Data

Episode Summary

In this podcast, a panel consisting of Allyson Spring, the AVP for Federal at Collibra, Graham Evans, Principal at Booz Allen Hamilton, Howard Levenson, the General Manager for Federal at Databricks, and Brian Shealey, the VP of Public Sector at Trifacta highlight the value of data within the Department of Defense.

Episode Transcription

Speaker 1: On behalf of Collibra and Carahsoft, we would like to welcome you to today's podcast focused around how the DOD is driving decisions from data where Allyson Spring, the AVP for Federal at Collibra, Graham Evans, Principal at Booz Allen Hamilton, Howard Levenson, the General Manager for Federal at Databricks, and Brian Shealey, the VP of Public Sector at Trifacta will discuss the value of data for the Department of Defense. 

Allyson Spring: Good afternoon. I want to welcome you to Collibra's first annual Government Summit. Today we're featuring the second panel, which is Discover, Understand, and Decide: How the Department of Defense Is Driving Decisions From Their Data. Today's panel will feature speakers from industry. I'd like to start with some introductions. My name is Allyson Spring. I'm from Collibra and run their federal sales division. The next person I'd like to introduce is Howard Levenson, general manager of Databricks' public sector.

Howard Levenson: Hey Allyson. Thank you so much for inviting me here today. It's a pleasure to be here with Graham and Brian and yourself. You know, Databricks is a company that was founded by the original creators of Apache Spark, and we deliver a unified analytics platform that allows people to ingest data, build out data lakes, and then apply artificial intelligence and machine learning to that data to find insights. We're happy to participate with you and the rest of the Advanta team. Thank you.

Allyson Spring: Thank you, Howard. We really do value your partnership. The next person I'd like to introduce is Brian Shealey. Brian is the vice president of public sector for Trifacta.

Brian Shealey: Thanks, Allyson. Yeah, Brian Shealey here. I run public sector for Trifacta. We're a late-stage software startup and a leader in what's called self-service data preparation. What we do is tie into organizations that are rolling out cloud-native data platforms and allow them to operationalize data at scale in a self-service-driven approach that's low code, extensible, and integratable across the ecosystem. We've got great partnerships with Collibra and Databricks, amongst our peer technology companies, and we've been working with Graham and the team at the Advanta project for a couple of years, providing the data wrangling, or data preparation, component for the Advanta platform. Thank you for having me.

Allyson Spring: Brian, thank you for participating today as well. This panel is going to illustrate how the Department of Defense has worked with multiple different industry partners who have come together to deliver a solution that supports a variety of different mission-critical machine learning, artificial intelligence, and advanced analytics projects. I'd like to now introduce Graham Evans from Booz Allen Hamilton to talk a little bit more about his program, Advanta.

Graham Evans: Sure, thanks for having me again, Allyson. Graham Evans, I'm a principal with Booz Allen Hamilton. I'm in our digital practice, helping our defense clients understand how to organize their enterprise data, make use of it, and generate meaningful analytics. One of our clients is developing the Advanta data platform. Advanta is a government-run enterprise data platform built on open architecture, leveraging the best-of-breed commercial tools out there in industry, like Collibra, Databricks, and Trifacta, supporting all areas of the data lifecycle. The DOD Comptroller currently runs the data platform, but customers across the DOD are leveraging the tools and capabilities to organize their enterprise data across business and mission domains and generate meaningful data products that support executive as well as operational decision making.

Allyson Spring: So as part of today's discussion, I'm sure that a lot of people are very interested in how Booz Allen Hamilton was able to not only work with a diverse set of mission goals from the Department of Defense but also integrate various different technologies to achieve one common goal, and that is making the Department of Defense's data collaborative and useful and solving a variety of different mission challenges. So just to kick off the conversation, Graham, can you tell us about the initial success that brought additional DOD users and use cases to the platform? What were some of those first use cases, and how did you start proving that to your government leads?

Graham Evans: Sure, thank you. So today Advanta is truly enterprise-wide and covers a number of different business and mission domains in terms of its analytic capability, as I mentioned before, but it wasn't always that way. In the early days of Advanta, the OSD Comptroller was really focused on the audit use case. And it's important that they focused on that singular use case, because it allowed them to really hone in on specific data management and analytic techniques that would be useful down the road. The problem they were trying to solve was that the Comptroller was trying to bring together 20-plus different accounting systems from across every organization within the Department of Defense. And there wasn't really a data platform that was able to accomplish that at the time; this was four or five years ago. We were able to harness the power of big data solutions, bring that data together, and drive meaningful insights to support the DOD's audit. At that point, there was a realization that with all of this great, rich business information in one place, the Advanta platform could provide meaningful insights across different mission areas like logistics, manpower, and medical, as well as acquisition and procurement type use cases that are going to be really important for the Comptroller to drive performance of the department. And so that's really where it started, and that's how we expanded into these other mission areas you see around the circle here. One of the very first use cases that kind of took off within Advanta was what's called a dormant account review: a quarterly, relatively straightforward process by which organizations can certify that they don't have dormant accounts, and if they do within their accounting system, they can give that money back for other purposes.
For the first time, we were able to bring all of this data across the department together into one place and identify hundreds of millions of dollars in savings that were able to be repurposed for future use, truly demonstrating the value of bringing together this type of data into one place with the right tools. Since then, we've really invested all of our time in improving the end user experience, increasing automation, and providing capabilities that increase developer productivity, to increase the meaningful data products that we have on the platform today.

Howard Levenson: You know, if I could add something to that: OMB did a study a couple of years ago, and there were 12,000 data centers across the federal government. And each one of those data centers had islands of data. And the challenge, if you try to find insights in the data, is that you've got to bring it together and you've got to normalize it. And in Advanta, they normalize it using Trifacta, they curate it into a data lake using Databricks, and they catalog it using Collibra. And then Booz Allen brings this whole thing together. This same use case is applicable across the government; everybody's got the same problem. And the only way you're going to get insights from your data is to bring it together; the more data you have, the better the insights will be. And that's why it's critical to get this data into a data lake. So I think you guys have cracked the code here, Graham, and I think that's why Advanta is going to be so valuable to the DOD.

Brian Shealey: Piggybacking on Howard's comments, which, you know, I would say are a great description from my lens. I've been fortunate enough to have worked with Graham for a couple of years now. Graham, kind of a bit of a loaded question from my end here, but I think it'd be helpful for the audience: with the DOD, from the lens of everybody who's ever worked in the space, we understand that the complexity is at a high level. But can you talk a little bit about the scale of the data, the number of systems under the umbrella, that you're bringing together into this sort of centralized platform? Can you just comment a little bit on that? Because I think it's very, very telling for people to understand what this complexity actually looks like at scale.

Graham Evans: Sure, thanks for the question. So when you're thinking about an enterprise analytics platform, and you try to take the lens of, say, a commercial Fortune 500 company, they might have maybe five or six enterprise systems: an accounting system, an HR system, a billing system, and a payment system for some of their vendors. So you might have five or six; the Department of Defense has 3,000 business systems. So when you're trying to make sense of that much data, that many different systems that have different ways of describing a piece of information, it becomes a real challenge. And the only really good way to make sense of it is to be data product specific. So instead of trying to solve the problem of getting all the data in the department and putting it in one place, really focus on your user and on the data product that you're trying to produce. That can then generate the very specific data sets that you want to target and bring into the right platform with the right tools to generate the answers that you're looking for.

Allyson Spring: That was great. That is really a good explanation of how the system supports so many different mission use cases, and also of not bringing in everything all at once. What you referenced earlier was that initial smaller use case that kind of proved out to the Department of Defense that this was a possibility, kind of defining what the art of the possible was for doing something more with their data, rather than just, you know, storing it. Do you think that smaller use case really was beneficial as a proof of concept for going into more and other types? I'm looking at this, and you have readiness analytics, executive analytics. And I'm sure folks that are watching this right now might be interested: how did you determine and evaluate the technologies that were going to be part of your solution?

Graham Evans: That's a great question. There are a lot of examples in the government and within industry where big, wide-ranging data projects utterly failed right at the beginning, because they maybe tried to bite off more than they could chew for what they were trying to solve. And I think focusing on the technology first is sometimes the wrong approach. Going after all of the data that you think you need to go after right at the beginning is probably a little bit daunting and can lead to some failures. So having that data-product-focused approach is really the biggest key to our success. And so your question was around how we chose the right tools. We had a very specific goal in mind at the beginning, and that was to solve the audit problem. So we tailored the analytic capabilities that we were bringing to the platform to support specific personas that we knew needed to access the data from an audit perspective. And we tailored the tool choices to those personas as we started to grow. As our use cases expanded across all those different data domains and analytic areas that you see there, the personas and use cases that we were trying to target became much more diverse, and we needed a larger and more diverse set of tools. So if we have a persona who's a data scientist, and they want to get down and dirty with Python, R, or Scala, and have a very highly scalable, performant analytics engine, they need a tool like Databricks. If we have someone who's more of a business analyst, who's very familiar with tabular data and wants to very quickly get access to large amounts of information and transform it in a very simple-to-use, low-code environment, we'd use a tool like Trifacta.
So it's very specific to the persona of the analytic users that we're trying to target, as well as the data product that we're trying to produce in the end state; that is really how we focus on adding new tools to the platform.


Allyson Spring: Graham, that was great. Thank you. And I think being able to deliver self-service analytics has really assisted your users in terms of getting to their mission, getting smart about their data, and driving towards a lot of those mission outcomes. So let's go in a different direction and talk about how the Advanta platform can scale. That's, I think, another huge topic: there's the potential for hundreds of thousands of users to come in and use the Advanta platform. How do you look at scaling to something that large? What factors into the tools that can scale? What types of technology help you scale, such as cloud? How do you address those security concerns? Cloud's been around a really long time, and everyone always talks about it being a journey to the cloud, but that's been reality for the past ten-plus years. And this is obviously a real-world use case. How do you get the scale that you need so that you can onboard all these different defense entities into your system?

Graham Evans: Yeah, absolutely. I mean, technology in this space is changing so rapidly. And so I think we look to industry, where there are really good examples of organizations scaling rapidly, like Netflix, for example, and we want to emulate that as much as possible. But obviously, within the Department of Defense, you've got certain security requirements and restrictions that require you to think a little differently about how you architect platforms. Within any data platform approach that we're taking, we know, because of the rapidly changing environment, that the platform we're using today isn't necessarily going to be the platform we'll be using two years from now, or even one year from now. And so we want to make sure we take advantage of our partners, and I'd love to hear from the other panelists about this topic. But from a scaling perspective, what we're looking for in the infrastructure and the tools that we're selecting is really reliability, maintainability, and performance, and ultimately, because of the clients we're dealing with, cost is always going to be a factor. And so we take that into our approach. I also mentioned before our data product and user-centric approach. So obviously, when we're thinking about integrating different toolsets and integrating our data experience for our customers, we're really focused on the usability of the tools, the usability of the platform, and the ability of the different best-of-breed tools that we're bringing to the table to integrate with one another. From an operations perspective, we're relentlessly focused on automation as much as possible. We can't scale to 100,000 users if we have a bunch of manual processes.
So obviously, any tool or any capability that allows us to automate workflows and data management, for example, like how Collibra and your collaboration tools help us scale our data management approach, is very helpful. And then finally, you mentioned cloud. Getting out of the day-to-day management of the pipes of data management, the underlying infrastructure, and moving more towards cloud-native services seems to be the industry trend. And it's something that we're definitely trying to take advantage of, because of the simplicity and reliability that it provides, as well as cost savings. But we recognize the need for the flexibility of an open architecture, given how the government procures tools and capabilities. Again, we want to take advantage of the best tools in the marketplace for our users, so we maintain an open architecture. So again, I'd love to hear from the folks on the panel about their thoughts on scalability and the future of tools in the space.

Howard Levenson: Yeah, I'd love to add to that. Scaling is obviously critical, but scaling in a reliable, cost-effective, and secure manner is equally important. I think one of the innovations that the founders of our company, and the founders of this open-source Apache Spark capability, came up with was the ability to leverage low-cost systems and to actually parallelize the workload across dozens, hundreds, or even thousands of systems to accommodate the scale of massive amounts of data and massive amounts of users. The brilliance in Apache Spark is that it pretty much abstracts the developer from the fact that he's running on 100 or 1,000 computers; it just looks like he's using one computer. And he doesn't have to have special skills in order to understand the parallelization techniques that people used years ago. I think the other key point, at least from Databricks' perspective, is that the computing has to remain ephemeral. And the way to do this cost-effectively in the cloud is to drive all the storage into the low-cost but highly durable storage layer that the cloud service providers provide, whether that's S3 in Amazon or ADLS in Azure. These storage environments are really inexpensive and really durable. And the magic is being able to pull the data out of those storage areas, spin up the computing for the period of time that you need it, and then eliminate it and only pay for what you compute. And so that makes it cost effective. And then of course, we could dive into the security mechanisms and reliability mechanisms that are available here. But I think the fundamental architecture has to support scalability, and it has to do it without putting too much of a burden on the developer.
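[Editor's note] The partition-then-combine abstraction Howard describes can be sketched in a few lines of plain Python. This is not Spark itself, just an illustrative standard-library sketch of the idea: the caller writes what looks like single-machine code, while the work is actually split across a pool of workers. The function name `distributed_sum` and the worker count are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def distributed_sum(records, workers=8):
    """Sum `records` as if on one machine; behind the scenes the
    work is partitioned across a pool of workers, the way a cluster
    scheduler would partition it across nodes."""
    # Partition the data into roughly equal chunks, one per "node".
    chunk = max(1, len(records) // workers)
    partitions = [records[i:i + chunk] for i in range(0, len(records), chunk)]
    # Each worker computes a partial result in parallel...
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, partitions))
    # ...and the driver combines the partials, like a Spark reduce.
    return sum(partials)

print(distributed_sum(list(range(1_000_000))))  # → 499999500000, same as sum(range(1_000_000))
```

The caller never touches the pool, the chunking, or the combine step; that is the burden Spark lifts from the developer, at real cluster scale rather than a thread pool.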

Brian Shealey: To add to that: great points, Howard and Graham. When I think about scalability, there's the technical scalability, like the architecture and the way you build, and a couple of comments on what Graham and his team at Advanta have architected for us. He talked at the very beginning about open systems design. I'm a huge believer in open systems design; I've thought for years that this is the key to the future for large-scale software-oriented projects. So every component in and around Advanta, if you understand the architecture, has modern APIs, typically RESTful APIs, and you can plug and play the right components together. I would describe the architecture approach on the technical side as one of designing for agility. And the sense there that you should get is that, as Graham said, there tends to be a shelf life, I'll call it a lifecycle, for how long things will be on the shelf before they kind of expire in terms of technologies. And what happens is, like Advanta, many of the other shared service programs I've seen in the public sector over the last dozen years were quote-unquote big data projects. People set up these systems and data centers, getting a bunch of data, and it's like, where's the value? The real difference in Advanta is there's real tangible value. The first use case Graham talked about was a massive taxpayer benefit, right? There's money sitting around, and now we can reallocate that money and find areas to spend it in without asking for more tax dollars to be spent. But with that being said, what they've really done is had this focus on analysis and delivering data-centric products to users across the DOD and beyond. So from a scalability standpoint, from my lens, it's not just the technology that achieves scalability, it's not just the architecture; it's the approach that the program itself takes towards an outcome.
And one of the things that's been really interesting to watch, having spent, call it, 20 years working in the public sector space, is how, as an integrator leading the program, Graham and his team have done an amazing job of picking tools like Databricks, Collibra, and Trifacta; there are many other component technologies running in this architecture. But the way they bring it together and deliver it to the user base is really, in my opinion, where the scalability is getting the main benefit for the user community. So there are different personas. If you're building a data science platform, and you're looking at best of breed, and you truly understand the space, Databricks is absolutely at the top of your shortlist of tools that you're going to look at. And to Howard's point around that innovation: we're all bringing innovation as technology companies. And one of the things about the components that are in there, and I can speak on Trifacta's behalf, and certainly I've watched this with Databricks, and I've seen it with Collibra and some of the other component peer technologies as well, is that our companies are investing as best-of-breed tools in these deep sets of capabilities in the modalities that we specialize in. Right? So as a data platform, the benefit of choosing Databricks as a cloud-native leader is Apache Spark. If you don't know the history, it's worth looking up. It was the fastest growing Apache project, and I think the most pervasive Apache project in the history of the Apache open source foundation, with thousands and thousands of contributors and users. Well, the people that started Databricks were the guys that created that technology. And so, you know, they're thinking years ahead of where compute is going. And then they've innovated with a storage engine component called Delta as well.
And basically, what's happened is the days of people wanting to set up and manage infrastructure and plumbing are over. For years, people would say, well, the DOD is never going to move to the cloud. It's too hard, it's too this, it's too that. You know, this is a testament to the fact that, oh, the DOD is definitely going to the cloud. And anybody working in it knows that. And the analogy that I think about is like this: if you're going to go build a house on a new property, you don't also go and build a windmill or a dam to create power for yourself. You plug into the grid and you consume power as a utility. And the beauty of the architecture of Advanta is taking these best-of-breed components and wrapping them in a great go-to-market strategy, where truly, you know, this is a shared service. It's like any other shared service in the DOD. The stick doesn't work; the carrot works. And from my lens, Advanta offers a great carrot for users across the DOD that want capability in the analytics, business intelligence and reporting, data catalog and data governance, and DataOps toolset spaces, but also want the outcome of those analytics and what that will drive for program value. And the way that Graham and team and the Advanta leadership put this together is very innovative. In my opinion, it's by far currently the most advanced data platform not only in and around the DOD, but that I've seen in the public sector, in terms of depth and breadth of capability, and also an agile design that will continue to scale because of the technologies, the architecture, and the overall management of the program.

Howard Levenson: You know, there's one other important factor to Advanta that bears notice, and that is that if you're building a scalable environment, and you're going to put tons of data in there, you really want to do it using open protocols and open capabilities. Twenty years ago, the number one word processing platform was WordPerfect. If you had a WordPerfect file today, there'd practically be no way to read it. In Advanta, all of that data is going into Amazon S3, and it's going in there using an open format: Databricks has open-sourced the Delta format. Exabytes of data are processed every week with Delta. The fact of the matter is that anybody that wants to read that data can, with open APIs, and that data will never be locked up, because the source code is out there. It's available. It's not dependent on Databricks, and it adds value to the program. So the open-source nature of this ensures that the government's not going to get locked in and not going to have to pay a tax every time they want to read their data. And I think that's a really powerful point. I don't know, Graham, you seem to agree with that.
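[Editor's note] Howard's open-format point, that data stored in a published format can be read back by any tool without the writer's software, can be illustrated with a toy example. Delta itself stores Parquet files plus a JSON transaction log under an open specification; the sketch below uses plain JSON Lines (a much simpler open format) purely to show the principle, with made-up record fields.

```python
import io
import json

# Writer: persist records in an open, line-delimited format (JSON Lines).
records = [{"account": "A-1", "balance": 0}, {"account": "B-2", "balance": 250}]
buf = io.StringIO()  # stands in for a file on S3 or ADLS
for rec in records:
    buf.write(json.dumps(rec) + "\n")

# Reader: any program that understands plain JSON can recover the data.
# No proprietary software, license, or vendor is required to read it back.
buf.seek(0)
recovered = [json.loads(line) for line in buf]
assert recovered == records
```

The WordPerfect contrast is exactly this: a closed binary format dies with its vendor's software, while a spec-published format stays readable as long as anyone can implement the spec.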

Allyson Spring: The overarching theme is what you brought up, Howard: vendor lock-in, or, when you move forward in time, that you're not going to be able to access a file, or it's going to become difficult to translate it and keep up with new technologies. So, you know, that's a really important thing, and that's been a key theme. I think almost every government agency is avoiding that vendor lock-in, avoiding something that is proprietary, not open, you know, closed systems. And I think this system truly illustrates the ability to have open systems working with one another, and that compatibility.

Brian Shealey: There's another comment there around vendor lock-in and innovation related to this type of design. One of the things a lot of program leaders, in my experience, don't think about is that they choose best of breed based on where they are today. They'll say, well, this company over here has this great tool. And what they don't realize is that if you're choosing things that don't allow you to work openly with the data itself, in an interoperable, API-driven fashion, what ends up happening is, number one, you're locking yourself into tools, because now you've got some sort of proprietary component, or proprietary format, that you have to deal with. But you're also stifling competition in the marketplace. If you want, for example, Trifacta to remain the leader in self-service data preparation, the best thing to do is light a fire behind us by making the data accessible to anybody. Then what we have to do is continue to bring feature function and interoperability to play with Databricks and Collibra, and provide real operational value for the users of the system, so that as demand from the integrator, the people like Graham's team that are bringing those capabilities to the customer base, continues to evolve, we as component vendors in these modalities continue to evolve and invest in those areas for the future. And that's the other part about vendor lock-in; it's kind of a dirty secret people don't talk about. It's not just a cost issue; it's really an innovation issue.

Graham Evans: I was just going to agree with both Howard and Brian about that topic. The way we see IT procurement right now, especially around this data platform space, is very similar to what was just described. Because the technology space is changing so rapidly, and because the acquisition lifecycle for the government is relatively slow compared to industry, tool choices have to be made with a lot of care. You definitely don't want to get locked into a specific architecture or tool choice that's going to become redundant in a few years, because then, from the government's perspective, the adversary might take advantage. We have to be able to be agile in terms of the tool choices on a particular value chain, and be able to make substitutions if necessary. I think there's just been so much advancement today, like you said, in cloud infrastructure, in the data engineering tools we've talked about, and in data science capabilities, that there are a lot of options out there. But there are some, I think, that we've talked about that have really risen to the top in terms of their specific area of focus. And so the concept Brian mentioned about open systems and open architecture is really something that we've adopted. Being able to integrate via API with your data storage, your compute, your data wrangling capabilities, and then all the way through to your data visualization and data science toolsets is something that we really value, and getting to the best of breed in all of those areas is something that we're striving for.

Allyson Spring: Thank you, Graham. And we talked about multi-tenant scalability, the data engineering behind it, subject matter expertise, leadership in very specific data-centric areas, and also the security that wraps around it. I mean, Advanta has truly illustrated that this is something that the government is ready for. You have so many users that are wanting to onboard, or have onboarded onto, the Advanta platform. Let's go back a little bit and talk about how that works. Someone comes to you and they have an idea, and they say, we've got X, Y, and Z as a data problem, or we're trying to address this mission challenge, or we need to save money because we don't know our supply chain needs that well. Can you talk a little bit more about that, and about the customer experience team and how user experience and design play into all of that?

Graham Evans: Absolutely. I think I mentioned before that our approach towards bringing new capabilities to Advanta is really our data-product-centered approach. So it always starts with focusing on the customer and the business problem that they're trying to solve, before we start to think about anything else related to toolsets or data or that sort of thing. Once we understand that problem set, focusing heavily on the user experience is something that we have used for successful onboarding of new users to the platform, so they fully integrate into the platform and their data experience is something that they value. That leads to increased adoption, which then leads to scaling of use cases. Typically, when a new customer comes to the government organization that runs Advanta, they will have an analytic problem that they need to solve. They've got a bunch of data; they may not have the talent, the tools, or the funding to be able to support it on their own. And I think that's where Advanta really shines, it being a shared service data platform. The goal here is to provide a value proposition to the department where an organization can get access to their data quickly. They don't have to stand up infrastructure, they don't have to go out and get a months-long accreditation for a new system, and they don't have to go through a lengthy procurement process for a new contract, for example. It's really about: I have money, I have a problem, let me give it to this organization, Advanta, and they can help me solve it. They can either partner with me to solve it, or they give me the tools and the access to my data so that I can go and solve it myself. And that model has really been the key to continued success. I honestly think in the future of this space you're going to see less and less one-off data platform development activity, because there are a couple of different examples of very successful shared
service data platforms that support enterprise level use cases throughout DOD, I don't think there's ever going to be a time when there's a single repository for all do the data because there's very specific security requirements that might make that difficult or very specific very large scale data problems that might make that a little bit challenging but I do think we're going to start to see a consolidation of this specific use case of a shared service data platform on a couple of larger enterprise scale platforms like Advanta and also being supported by vendors like Trifacta and Databricks. Brian, have you seen something similar in your work across the air force or army or any other organizations like them do it?

Brian Shealey: Yeah, speaking specifically to DOD, Graham, there's certainly a move, and I echo your sentiment: I think you're going to see fewer one-off efforts where organizations stand up a platform solely for their own program. The Air Force is doing a really nice job with a platform called VAULT, their enterprise data platform. Coincidentally, Howard and the Databricks team, ourselves, and Booz Allen Hamilton are all involved in that. There's that famous phrase, amateurs borrow, professionals steal; who borrowed and stole what, who knows. But I think that paradigm has definitely shifted, and we're going to continue to see moves to shared services, particularly for the analysis component. There's certainly never going to be a lack of new systems collecting data at the transactional and system-of-record level; creating data is just the nature of programs, and the cloud may make that even more pervasive. But on the analysis side, to leverage that data, you absolutely need to consolidate and co-locate it in a cloud-native, extensible way, with an approach that lets users of different personas, different skills, and different business needs quickly operationalize data for their needs. Fundamentally, for anybody in the DOD who's read General Mattis's 2018 National Defense Strategy, it all comes down to time. We've spent billions of dollars as a country, but what's nice about this paradigm shift is that you're starting to see time to insight speed up rapidly, because of these technologies we're mentioning, but really because of the program approach itself. So the Air Force is doing it, and Graham, you can probably talk more about what the Navy's doing: not only are they interested in what Advanta is doing, they're literally investing in Advanta. I think you should talk about that, but I happen to know the Army's headed that route as well, based on what I've read. And in the intelligence community and across the rest of the public sector, the State Department, for example, has stood up a Center for Analytics. That need to co-locate data for more intelligent, broader-reaching insight isn't changing at all; it's continuing to gain steam. That's my perspective there.

Graham Evans: It goes back to what I was saying before about managing the pipes and the infrastructure. The complexity of these types of environments creates a high barrier to entry for standing up a new capability like this, getting started quickly, and really addressing the emerging needs these defense clients have in terms of harnessing their data for decision making. You mentioned the Navy: they're looking at Advanta as basically the backbone of their solution. They have their own capability called Jupiter that they've stood up using the same reference architecture that Advanta uses and a lot of the same shared services. They've been able to take advantage of that and stand up very quickly, because they started with an existing platform and an existing set of best-of-breed toolsets.

Howard Levenson: I'd love to add a couple of comments to that. I remember the nuclear industry in the United States really failed because every single power utility built its own reactor with its own set of processes and procedures. The thing I think you understate, Graham, is the experience that Booz Allen brings to the party here. Correct me if I'm mistaken, but I think you have 1,000 certified data scientists in the organization. I've also seen the figure, according to Gartner, that 85% of all big data projects fail. What works well for Databricks and Booz Allen is that we've done this before: for the Air Force with VAULT, at the FBI on Prometheus, together at the NGA and the BA. We've found a set of principles that work, and the experience Booz Allen brings to the platform, along with the experience of working with Databricks, Trifacta, and Collibra, has given us a recipe for success that's replicable for other agencies as well. It's not our first rodeo.

Allyson Spring: Just to echo some of the topics we've talked about: user experience is so important, and so are the rising costs of infrastructure. For 10 to 12 years I sold storage and servers. When cloud became so prominent, it showed there was a way, especially for the government with its very predictable budgeting cycles, to get what you need from a service without having to worry about buying a server or adding storage. So this really drives home that it's not only about saving money but also about saving time, getting people to that time to value you all brought up. Everyone has a lot of data, but what do they do with it? Is it interoperable? Is it proprietary? Breaking down all of those barriers is what's really cool to me about Advanta and what you've done with this program. That leads to the final question for my panel: given the success you've seen so far in proving all these different use cases, where do you see Advanta going? Where do you see the Department of Defense going? What kinds of problems and challenges? What does 2021 look like? I know we've all had a crazy year with 2020.

Graham Evans: In 2020, Advanta really saw an uptick in usage because of its ability to respond very quickly to the COVID-19 crisis, which is obviously a big topic of interest to the new administration as well as the department at large and the country. We've been able to use that as an inflection point, taking on more and more complex use cases that require flexibility in what data is needed, what questions need to be answered, and what types of analytic techniques and capabilities are involved in supporting the new, meaningful data products we're producing. From a future perspective, at Advanta we always talk about not really wanting to focus on bringing on more data or more users, but rather working with customers to find more interesting and relevant business problems to solve, because that will translate into tools and techniques and more data as well. You mentioned the user experience: the Advanta government team are some of the most innovative leaders we have in the defense department when it comes to focusing on user experience. First and foremost, for any problem we try to solve, we drill down into the specific persona and user journey, how an individual is going to interact with a dashboard, a model, or even just a report we might be generating. That's been really helpful in increasing customer adoption across the board. I think that's how we'll continue forward: focusing on challenging problems we can solve across the department, and then focusing on the user experience to ensure adoption and delivery of that capability.

Allyson Spring: That's great, because I think 2020 illustrated to a lot of people that the remote work-from-home situation and all these collaborative tools are a possibility. Things that looked really hard in 2019 actually became reality when everyone was put into different work environments, working from home, not being in their data center. I can see that as a huge driver toward your platform, with the user experience being so tailored and customized. It's a platform that folks who have a problem can use without having to think, I need to hire, as I think Howard or Brian said, 1,000 data scientists. Most organizations don't have the ability to hire even one or two data scientists. Another question for the panelists: what are the trends you're seeing for 2021 in data?

Howard Levenson: For me, the major trend is what we're calling the lakehouse. Typically, people build out data lakes and separately build out data warehouses. That causes replication in the amount of data, which is expensive; you end up with two governance models, which opens up vulnerabilities; and you might have different versions of the data, which can cause inaccurate results. So one of the things we're bringing to market is this idea of a lakehouse, where you combine the best of a data lake with the best of a data warehouse. With Databricks and our SQL Analytics capability, all of the data sitting in your data lake now appears as though it's in a data warehouse, and you can bring your BI tools, whether that's Qlik or Tableau or whatever else. Now you're not limited to just the data you've curated and put into the data warehouse; you can unleash that BI tool on all of the data within your data lake without ever making a copy of it, with a single version of truth for all of your data. I think that's going to transform the industry. Ultimately, I think data warehouses, which have survived for 40 years, are going to be dead, and it's going to be data sitting in an object store with a platform like Databricks to query it. That's a major change for business.
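[Editor's note: as an illustrative sketch of the lakehouse pattern described above. The table name, columns, and storage path are invented for the example, and the syntax follows Databricks-style SQL on Delta Lake; details vary by platform.]

```sql
-- Register files already sitting in the data lake (object store) as a
-- table; no data is copied into a separate warehouse.
CREATE TABLE IF NOT EXISTS maintenance_events
USING DELTA
LOCATION 's3://example-bucket/lake/maintenance_events/';

-- BI tools can now run warehouse-style SQL directly against the lake,
-- hitting the single copy of the data.
SELECT aircraft_type, COUNT(*) AS event_count
FROM maintenance_events
GROUP BY aircraft_type
ORDER BY event_count DESC;
```

The point of the sketch is that the query runs against the files in place: one copy of the data, one governance model, one version of truth.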

Brian Shealey: Yeah, that's definitely an interesting paradigm, Howard, and I would echo the lakehouse concept; I think I've heard it from you before, and it's a good way to describe the model we're seeing. Fundamentally, the big trend I've been seeing for a couple of years now is this move to shared-service platforms that are truly self-service driven. Allyson, you haven't talked a whole lot about Collibra, but all these platforms have some fundamental needs. One: you need a platform that can process data in a flexible, extremely scalable, highly performant, and secure manner, where the technical user community can bring whatever tools they want. Your data scientists and data engineers use a lot of open source; they're coders by nature, so they don't need robust business tools with great interfaces. They're going to bring the latest, greatest thing, and they want accessibility to the data and the power to process it. That's where a Databricks comes in. Two: you need great structure around cataloguing, curating, and storing the data. The DATA Act has driven that across the public sector; every organization has a data strategy, from large federal agencies down to even local municipalities. Collibra brings an awesome cataloguing component there. Three: you need to curate, manage, and integrate the data in a way that's robust and supports not only the technical use cases but also the dummies like me. If I just want to put together a sales report or a market report, and I've got to pull data from four different data sources, I can go to where it's stored and leverage a Trifacta.
And then from Trifacta we can hand the processing off to Databricks. All of this really comes together, in my opinion, and here's the main trend I was getting at: it's really around self-service. Graham is a very modest person, but just from working with his team, I'll tell you I have not seen any other integrator, either government-led or systems-integrator-led, take a self-service approach to a shared service like this, and I've been working in shared-service IT for almost 20 years. People have forever been saying, why do we have all these systems? Let's set one thing out there. What happens is, if you set something up with great capability but it's hard to use, it's like being a software startup: you could have the greatest design ever, but if it's hard for people to use, hard for them to get access, you're going to die on the vine. What the Advanta team has really done, and Graham's a key part of that leadership, is this concept of self-service analytics. Being able to analyze the data for your needs, in a timely fashion, when it's relevant, is critically key for everybody. If you take nothing else from this, I think it's that one design paradigm: look at the technical architectures, the best-of-breed component ideas, the open design, and think about self-service as a strategy. If you're a shared service, that is how you're going to get the benefit; you can't do it any other way. And it's not just choosing tools; there's really a great approach, and I would highly encourage anybody to come to Advanta. You can go to the catalog in Collibra and see what data is available, but really meet with the team, because they don't just throw the tools over the wall; there's a whole enablement strategy. Any modern cloud-based software company has a customer success effort. Years ago, people would buy software, pay for support, get break-fix support, and have to go out and buy consulting when they needed ongoing help. Well, these are projects; they evolve, and as Graham was saying, what that north star looks like today for somebody working on an analytics project will probably be different six months from now. If nothing else, with COVID-19, I think what we all realized is: the data is out there, but if we can't put it together quickly, scalably, and effectively, we can't do analysis. Here we are, what, 13 months into the pandemic, and we're starting to see a light at the end of the tunnel. The silver lining of the last year, in my opinion, is that people really are taking to heart this concept: we need to allow people to operationalize and analyze data quickly, at scale, in a self-service fashion that's extensible, pluggable, and cloud native. That's the pattern we're seeing over and over again.

Allyson Spring: That's a great summary of how this year made a lot of users evolve and realize these things are possible. And they're possible because there are teams of folks, like Graham's team at Booz Allen, that have built solutions enabling that self-service capability. You're 100% correct: if something is hard to use, people don't use it. There are internal business systems where, when you log on every day, you think, oh gosh, I hope I don't have to use this system. So driving user experience and that self-service capability is truly key. Again, I want to thank everyone who participated today: Brian Shealey from Trifacta, Howard Levenson from Databricks, and Graham, if you want to finish it off. What your team has done and built with the data science capability, bringing that true time to value, is one of the most impressive projects I've seen, ongoing and constantly evolving. So when we talk about what's next, I look forward to seeing it.

Graham Evans: Yeah, thanks so much. It's definitely an exciting time to be working in this space, with emerging data challenges always coming our way, and having the partners on this call today to help us work through them, as well as the Advanta government team, who, like I said, are truly visionaries in making this a true shared-service, self-service capability for the entire Department of Defense. It's certainly been an honor to be a part of this, and I look forward to seeing how we grow and scale throughout the rest of this year.

Allyson Spring: Thank you so much to all. If there are any questions, our team is here to support you and we look forward to seeing what comes next with Advanta and how the DOD data strategy shapes all the different mission critical self-service requirements that you're going to serve up with Advanta and the tools that we all offer together. Thank you.

Speaker 1: Thanks for listening. If you'd like more information on how Carahsoft or Collibra can assist your organization, please visit www.carahsoft.com or email us at collibra@carahsoft.com. Thanks again for listening and have a great day.