SAP National Security Services (NS2)
Clients across the national security community count on NS2 to deliver solutions when security is paramount, stakes are high, and the volume and scale of data are too great to leave to chance. Our unique combination of startup agility and global stability gives our clients a competitive edge. We are 100% staffed by U.S. citizens on U.S. soil.
SAP NS2 Executive Point of View:
Enhancing Coastal and Border Security through High Performance Data Analysis using SAP HANA
Large and continually growing collections of data must be fully and efficiently analyzed to support the missions of national security and defense of the homeland. Use-cases include border and coastal situational awareness, cargo and people screening, intelligence processing, and law enforcement investigation.
Analysts need better supporting technology to help them find entities in the data (people, places and things of interest) and to uncover hidden relationships between those entities. They need easier and quicker methods for fusing data that is held in multiple systems and in multiple formats, so they can spend less time “wrangling” data and more time doing the valuable cognitive work that increases situational awareness. This need is driven by the fact that in many cases the required reaction times have decreased from days to minutes, while fewer agents and analysts are available to work on more problems in constrained time frames. The goal is to increase the mission effectiveness of their analytical effort.
Increased flexibility and agility are needed in analysis tools, along with advanced analytical methods, because the adversary has no fixed doctrine and most analytical problems are complex and information rich but insight poor. Multiple modes of analysis are needed, sometimes many for a single mission use-case. These modes include machine learning algorithms that automatically categorize and cluster entities to find out which may belong together. They may also require geospatial context, and text mining to identify and extract entities and concepts from large bodies of intelligence and sensor information. Graph analysis is needed to uncover non-obvious links between people, places and things. “Fuzzy” linguistic searching of the body of documents is required to expose the pertinent facts to the analyst in an easy-to-use way. (This type of sophisticated, multi-modal analysis has come to be called “High Performance Data Analytics” by some noted industry analysts.) In current practice, complicated technical architectures are designed with multiple separate analysis servers to execute this kind of analysis; these architectures require many copies of the data to be sent throughout the system, introducing complexity and risk of error.
The need to process and analyze data with high volume, variety and velocity (aka “big data”) has led to the implementation of distributed computing environments, in-memory databases and virtualized data architectures. Increasingly, there is a need to combine information from both structured databases and unstructured “big data” lakes into a unified analytical view, for example in social network analysis of actors and in tracing the digital trail of subjects for continuous evaluation.
SAP HANA is a high-performance in-memory data management and analysis software platform that is the result of co-innovation between the large enterprise software company SAP and the processor manufacturer Intel. SAP wrote the software, first released in 2010, to run on the widely used Intel Xeon-based servers, which are available from all the major computer manufacturers and are pervasive in data centers and cloud infrastructures around the world. Intel optimized the instruction sets in the chips themselves to accelerate the in-memory database processing of HANA. This co-innovation allows SAP HANA to fully leverage the multi-processor, multi-core hardware of the newest servers in a way that data management systems first written in the 1980s, the days of single-threaded, single-CPU systems, cannot. The result is the ability to scan, search and analyze datasets of millions, or even billions, of records in seconds. Recently, SAP has also collaborated with IBM to deliver the high-performance functionality of SAP HANA on the IBM Power System chipsets.
SAP HANA High Performance Data Analysis Platform for Coastal and Border Intelligence Processing
To complete the capabilities of the SAP HANA system, its designers also included multiple analytical engines directly in the in-memory data management system. In this way, SAP HANA can deliver sophisticated analytical insights on very large data sets with very high performance. SAP HANA delivers the following analysis capabilities in a single platform that is simpler and more agile than a multi-server architecture, and easier to deploy and maintain:
- Data integration capabilities, so that new incoming data sources can be ingested in an automated fashion using text analysis, parsing and data normalization methods.
- Text mining, so that entities such as people, places and things can be identified, and concepts understood, in an automated fashion across a large corpus of information in over 30 languages. The emotional sentiment of the author can be derived as well.
- Machine learning, which applies algorithms to the data to discover hidden patterns, to automatically group similar entities, and to reveal relationships by clustering subtly associated entities.
- Graph analysis, to understand links between entities, and the connectedness of networks of people, organizations, and social networks.
- Geospatial analysis to help the analyst understand the spatial and temporal relationships between entities.
- Time-Series analysis to compare data that streams in cyclically over time from sensors or IoT (Internet of Things) devices.
- Linguistic search to take the product of these analytics, and make them accessible to analysts in a convenient and intuitive way.
HANA can be added to any existing data architecture (because it is completely based on non-proprietary open-standards interfaces) to bring powerful capabilities for High Performance Data Analytics to bear on data of any type or volume, and to multiply the value of existing systems.
Example Use-Case: Enhancing Situational Awareness at Coastal and Land Borders
The data federation capabilities of the SAP HANA system allow it to break open “stove-pipes” of data and make analysis of information across multiple information systems a reality. Given the appropriate policies and permissions, the SAP HANA system can query multiple systems-of-record in a way that is transparent to the client application requesting the information. This capability can integrate stove-piped data sources from federal, state, local, tribal, and international law enforcement partners to support more complete situational awareness and investigatory analysis of border and coastal data. Heterogeneous data systems can be federated together, for example data held in Microsoft SQL Server, Oracle Database 12c, Teradata Database, IBM DB2, IBM Netezza Appliance, Apache Hadoop, Apache Spark and others. (Technically, SAP HANA Smart Data Access™ enables remote data to be accessed via SQL queries as if they were local tables in HANA, without copying the data into SAP HANA.) This supports the development and deployment of the next generation of analytical applications, which require the ability to access and integrate data from multiple systems and sensors in real time regardless of where the data is located or what systems are generating it. In this way a more complete situational picture, built from correlated data from multiple systems and sensors, can be used to generate appropriate courses of action.
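The idea of querying remote data "as if it were a local table" can be illustrated with a toy analogy. The sketch below does not use SAP HANA or Smart Data Access; it uses Python's built-in sqlite3 module, where an attached second database stands in for a partner system. The table names, column names and sample rows are hypothetical.

```python
import sqlite3

# Toy analogy only: SQLite's ATTACH stands in for a federation layer.
# All table names, column names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")                      # plays the "local" store
conn.execute("ATTACH DATABASE ':memory:' AS partner")   # plays a "remote" system

conn.execute("CREATE TABLE main.vessels (vessel_id TEXT, name TEXT)")
conn.execute("CREATE TABLE partner.port_calls (vessel_id TEXT, port TEXT)")
conn.executemany(
    "INSERT INTO main.vessels VALUES (?, ?)",
    [("V1", "Sea Lark"), ("V2", "Blue Heron")],
)
conn.executemany(
    "INSERT INTO partner.port_calls VALUES (?, ?)",
    [("V1", "Port A"), ("V1", "Port B"), ("V2", "Port A")],
)


def port_calls_by_vessel(connection):
    """Join the local and attached tables in a single SQL statement."""
    return connection.execute(
        """
        SELECT v.name, COUNT(*) AS calls
        FROM main.vessels AS v
        JOIN partner.port_calls AS p ON p.vessel_id = v.vessel_id
        GROUP BY v.name
        ORDER BY v.name
        """
    ).fetchall()


result = port_calls_by_vessel(conn)
```

The client code issues one query and does not need to know which store holds which table, which is the property the paragraph above describes.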
Text mining and text analysis capabilities of the platform can be used to identify and extract entities of interest from large bodies of information, such as cargo manifests, entry/exit data, publicly available information and intelligence reports. The linguistic search engine can then be used to provide convenient, contextual retrieval of information in a timely manner.
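As a rough illustration of what "fuzzy" retrieval means, the sketch below matches a misspelled query against a list of entity names using Python's standard difflib module. It is not the HANA search engine; the entity names, the `fuzzy_lookup` helper and the 0.6 similarity cutoff are assumptions chosen for the example.

```python
import difflib

# Hypothetical entity names, as might be extracted from manifests or reports.
ENTITIES = ["Northwind Shipping", "Harbor Logistics", "Blue Heron", "Sea Lark"]


def fuzzy_lookup(query, candidates, cutoff=0.6):
    """Return up to three candidates similar to the query, best match first."""
    lowered = {c.lower(): c for c in candidates}  # compare case-insensitively
    hits = difflib.get_close_matches(
        query.lower(), list(lowered), n=3, cutoff=cutoff
    )
    return [lowered[h] for h in hits]


matches = fuzzy_lookup("Northwnd Shiping", ENTITIES)
```

Here the misspelled query still retrieves "Northwind Shipping" as its best match, while an unrelated query returns an empty list.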
Next, the system’s Graph engine can take the output of text analysis and perform network analysis of the extracted entities to understand non-obvious relationships and to find out which people or organizations are associated by their activities. It can also perform algorithmic analysis of graph network data, such as discovering the most highly connected entities (so an adversary network can be disrupted) or analyzing the connecting paths between entities.
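The two graph questions mentioned above (which entity is most highly connected, and what path connects two entities) can be sketched in a few lines of plain Python. This is a generic illustration using degree counting and breadth-first search, not the HANA Graph engine; the edge list and function names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical undirected links between extracted entities.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("E", "F")]


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def most_connected(graph):
    """Return the node with the most direct links (ties broken by name)."""
    return max(sorted(graph), key=lambda node: len(graph[node]))


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:      # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


graph = build_graph(EDGES)
hub = most_connected(graph)
path = shortest_path(graph, "B", "F")
```

Degree is only the simplest notion of "most connected"; other centrality measures may suit a given analysis better.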
Geospatial analysis: the HANA platform includes spatial analysis capabilities, so the system can be queried to determine whether entities are within a proscribed polygon area, up to a border, or within a radius around a point of interest. In this way the system can alert users to a potential maritime threat that has changed course or entered a high-risk area. Predictive spatial algorithms included in the HANA platform can help foresee the likely location of a target in a given time frame, resulting in increased small-vessel interdiction effectiveness.
Spatial Analysis of Sensor Data for Lines-of-Bearing
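The polygon and radius checks described in the geospatial paragraph above can be sketched with two standard techniques: a ray-casting point-in-polygon test and the haversine great-circle distance. This is a minimal illustration, not the HANA spatial engine. It assumes the polygon is small enough to treat coordinates as planar, and the function names and the 6371 km mean Earth radius are choices made for the example.

```python
import math


def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def within_radius(lat, lon, center_lat, center_lon, limit_km):
    """True if the point lies within limit_km of the center."""
    return haversine_km(lat, lon, center_lat, center_lon) <= limit_km


# Hypothetical high-risk zone as a simple square.
ZONE = [(0, 0), (10, 0), (10, 10), (0, 10)]
```

An alerting rule could then be as simple as checking each new vessel position against `ZONE` and raising a flag the first time `point_in_polygon` returns true.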
More than fifty different Machine Learning mathematical algorithms are available to run in the platform at high performance in-memory speeds. This type of analysis provides pattern discovery, clustering together of non-obvious relationships and entities, and prediction of likely progressions of events and locations. This allows analysts to have improved assessment of risks by identifying potential threats along with emerging patterns and trends.
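To make "clustering" concrete, the sketch below implements plain k-means on two-dimensional points. k-means is used here only as a familiar example of a clustering algorithm; the source does not say which algorithms the platform includes. The sample points and starting centroids are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members,
        # leaving it in place if no points were assigned to it.
        centroids = [
            (
                sum(x for x, _ in members) / len(members),
                sum(y for _, y in members) / len(members),
            )
            if members
            else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical observations forming two loose groups.
POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, groups = kmeans(POINTS, centroids=[(0, 0), (10, 10)])
```

With these inputs the first three points end up in one group and the last three in the other. Real data would need feature scaling and a considered choice of the number of clusters.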
Event Stream Processing gives users the ability to define important events and to apply predictive logic to streaming sensor data and metadata, such as that from UAS and ground sensors, before it is ever written to a database. Multiple data streams can be analyzed in time-series analysis from an individual sensor, or in context with each other, so that an alert can be generated when a specified condition exists across several different sensors. Machine learning algorithms can be applied to streaming data to separate “signal from the noise”. This will allow better measurement and understanding of illegal border activity and more effective border incursion detection, interdiction and deterrence.
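The cross-sensor alerting described above can be sketched as a small sliding-window correlator. This is a generic illustration, not SAP's event stream processing product. It assumes events arrive in time order as `(timestamp, sensor_id, value)` tuples; the sensor names, threshold and window length are hypothetical.

```python
from collections import deque


def correlate(events, sensors_required, window_seconds, threshold):
    """Yield (timestamp, sensors) when all required sensors have exceeded
    the threshold within the last window_seconds."""
    recent = deque()  # (timestamp, sensor_id) of over-threshold readings
    for timestamp, sensor_id, value in events:
        if value <= threshold:
            continue  # only over-threshold readings can trigger an alert
        recent.append((timestamp, sensor_id))
        # Drop readings that have fallen out of the time window.
        while recent and timestamp - recent[0][0] > window_seconds:
            recent.popleft()
        active = {sensor for _, sensor in recent}
        if sensors_required <= active:
            yield timestamp, sorted(active)


# Hypothetical stream: (seconds, sensor, normalized detection score).
EVENTS = [
    (0, "radar", 0.2),
    (5, "acoustic", 0.9),
    (8, "radar", 0.95),
    (30, "radar", 0.9),
    (100, "acoustic", 0.8),
]

alerts = list(
    correlate(EVENTS, {"radar", "acoustic"}, window_seconds=10, threshold=0.7)
)
```

Because `correlate` is a generator, each event is evaluated as it arrives and nothing needs to be stored beyond the current window.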
At SAP National Security Services (NS2), our goal is to bring the cutting-edge commercial capabilities which are the product of the multi-billion dollar annual research and development effort of SAP to bear on the mission of National Security and protecting the homeland.