Palantir Technologies is one of the few startups wrapped in both secrecy and controversy. The data science company projects a patriotic image and has been touted as a key factor in tracking down Osama bin Laden. Founded in 2003 by PayPal co-founder Peter Thiel and Alex Karp, Palantir helps organizations manage and make sense of hugely fragmented datasets to gain actionable insights.
The company went public on September 30th, 2020 via a direct listing and was valued at $21 billion. Its list of customers includes some of the biggest names in the commercial sector, like Airbus, Ferrari, and BP, as well as government agencies like the CIA, ICE, and many police departments across the United States.
Overview of Palantir Technologies Patent Portfolio
The patent portfolio of Palantir Technologies has 1557 worldwide patents (granted + applications) that belong to 486 patent families. 1372 of these patent documents are alive and 185 are dead. From the exhibit below on the worldwide distribution of Palantir’s patent portfolio, we can see that 61% of Palantir’s patent portfolio consists of US patent documents. The European patent portfolio of Palantir is 19.4% of its global patent portfolio, followed by Great Britain, Australia, and Canada with a share of 4%, 3.9%, and 2.8% respectively.
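The headline figures above can be cross-checked with a little arithmetic. The sketch below uses only the counts quoted in this section; the variable names are illustrative and not taken from any Palantir or patent-office tool.

```python
# Counts quoted in the overview above.
total_documents = 1557   # granted patents + applications
alive = 1372
dead = 185
families = 486

# The alive and dead documents should add up to the whole portfolio.
assert alive + dead == total_documents

# Share of the portfolio that is still alive, in percent.
alive_share = round(100 * alive / total_documents, 1)

# Average number of patent documents per patent family.
docs_per_family = round(total_documents / families, 1)

print(f"Alive share: {alive_share}%")
print(f"Documents per family: {docs_per_family}")
```

Roughly 88% of the documents are alive, and each family averages a little over three documents, which is consistent with families being filed in several jurisdictions.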
Exhibit: Legal Status of Palantir’s Patent Portfolio:
Patent Application Filing Trend of Palantir Technologies
Are you wondering why the above two exhibits show a drop in patent filings and priority claims in 2020? Patent offices generally publish an application about 18 months after its filing (or earliest priority) date. Many of the applications filed in 2020 had therefore not been published yet, which is the cause of the drop.
Technological Profile of Palantir Patents
Palantir describes itself as an analytical infrastructure (and not a visualization tool or database) that helps analysts leverage existing data within an organization to find answers quickly. This analytical infrastructure has four major layers of functionality that build on each other. These are as follows:
Data integration: This is the foundation on which Palantir does everything else. It fetches all forms of data (video, GPS imagery, spreadsheets, etc.) that exist across an enterprise in different databases and brings them together for access in one place.
Search and Discovery: This layer provides a single point of access that allows a user to search any of the databases across an enterprise. A non-technical analyst can perform advanced conceptual and persistent searches of petabytes of data without writing SQL queries or search strings. It lets analysts discover the unknowns by allowing them to search on the basis of how different pieces of information are linked together.
The Search and Discovery layer also offers geospatial and temporal searches, which allow a user to know what was happening at what time and at what place. For example, a user can ask a question like "show me all the traffic violations on this route over the last three weeks".
Knowledge management: This layer keeps track of every bit of data that enters through the data integration layer. The data is tracked with respect to how and when it entered Palantir, who is allowed to see it, and how the information changed or evolved over time.
Collaboration: The collaboration layer of Palantir's tools allows users to share their analysis across an enterprise. Individuals and groups of individuals benefit from each other's work, while the system makes sure that each user can access only the data they are authorized to access.
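To make the traffic-violation question concrete, here is a minimal Python sketch of a combined route and time-window filter. It is only an illustration of the idea: the `Violation` record, the `violations_on_route` helper, and the sample data are all hypothetical and say nothing about how Palantir actually implements its search.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Violation:
    """Hypothetical record: where and when a traffic violation happened."""
    route: str
    occurred_at: datetime
    description: str


def violations_on_route(records, route, now, weeks=3):
    """Return violations on `route` in the `weeks` weeks leading up to `now`."""
    cutoff = now - timedelta(weeks=weeks)
    return [
        r for r in records
        if r.route == route and cutoff <= r.occurred_at <= now
    ]


# Made-up sample data for the example.
now = datetime(2021, 3, 1)
records = [
    Violation("Route 9", datetime(2021, 2, 20), "speeding"),
    Violation("Route 9", datetime(2021, 1, 5), "red light"),   # too old
    Violation("Route 4", datetime(2021, 2, 25), "speeding"),   # other route
]

recent = violations_on_route(records, "Route 9", now)
for v in recent:
    print(v.occurred_at.date(), v.description)
```

A real system would run this kind of filter against indexed spatial and temporal data rather than a Python list, but the question being asked has the same shape: a place constraint combined with a time window.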
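The kind of bookkeeping described here (source, time of entry, permitted viewers, and change history) can be pictured with a small Python sketch. The `TrackedRecord` class, its fields, and the sample values are assumptions made up for illustration; they are not Palantir's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TrackedRecord:
    """Hypothetical record that remembers its origin, audience, and history."""
    source: str                 # how the data entered the system
    ingested_at: datetime       # when it entered
    allowed_roles: set          # who is allowed to see it
    history: list = field(default_factory=list)  # (timestamp, value), oldest first

    def update(self, value, when):
        """Append a new version instead of overwriting the old one."""
        self.history.append((when, value))

    def current(self):
        """Latest value, or None if nothing has been recorded yet."""
        return self.history[-1][1] if self.history else None

    def visible_to(self, role):
        """True if a user holding `role` may see this record."""
        return role in self.allowed_roles


# Made-up example values.
record = TrackedRecord("crm_export.csv", datetime(2021, 1, 4, 9, 0), {"analyst"})
record.update("address: unknown", datetime(2021, 1, 4, 9, 0))
record.update("address: 12 Main St", datetime(2021, 2, 1, 14, 30))

print(record.current())
print(record.visible_to("analyst"), record.visible_to("guest"))
```

Keeping every version, rather than only the latest value, is what makes it possible to answer how a piece of information evolved over time.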
Palantir's collaboration layer is modeled on the way version control works in software development. Just as hundreds of engineers in a software project make changes to a single codebase at the same time, Palantir lets its users work on a dataset individually and reconcile their shared understanding in a single place.
The exhibit below represents the breakdown and distribution of Palantir's 486 patent families into various technology clusters. Palantir's status as a data science unicorn is reflected in its patent portfolio, as 72.2% of the portfolio is focused on data analysis.
Palantir Patent Portfolio: Digital Data Processing Patents (302 Patent Families)
The Data Processing cluster is the backbone of Palantir’s technology. 62% or 302 patent families of Palantir are focused on the processing of digital data. Major sub-clusters of Data Processing are Information retrieval, Interface, Handling Natural Language Data, and Security Arrangement. Inventions on UI, making sense of data coming from vast and hugely fragmented datasets, finding the relationship between data in fragmented datasets, and how Palantir’s Gotham, Metropolis, and Foundry platforms offer analysis and prediction, are in this cluster.
Information Retrieval (File System and Database structure – 181 patent families)
Patent families classified in this cluster are focused on the Data Integration layer. In general, inventions in this cluster are focused on retrieving digital information stored in databases, data repositories, or file systems; on query formulation; on tuning, replication, archiving, synchronization, and concurrency control; on de-duplication of stored data; and on application-specific caching and pre-fetching in file systems. Techniques for retrieving semi-structured data, text, audio, image, video, or multimedia data, and for retrieving information from the web, are also classified under this cluster.
135 of the 181 patent families in this cluster are focused on Structured Data. These patent families cover visual data mining, databases holding geographical information, DBMS interfaces, database models, querying (the largest group, with 46 patent families), and the like. The exhibit below gives a complete breakdown of the distribution of these 135 patent families. The list below details some of the inventions covered under Structured Data.
- To efficiently replicate large numbers of data object changes over an unreliable data network.
- Interactive UI that enables efficient and rapid access to multiple different data sources simultaneously for an unskilled user.
- More efficient multi-row ACID-compliant transactions with snapshot isolation semantics.
- On ontology
- Provide a highly dynamic and interactive UI for quick and efficient exploration of large volume data sources.
- For automatically clustering and canonically identifying related data in various data structures.
- For improved time-series databases and time-series operations.
- Use of AI algorithms to categorize data items obtained from different sources into different sub-sets based on ranking, and presenting that on an ergonomic UI for efficient analysis.
- UI elements that enable users to create visual queries.
- Techniques for reducing the amount of decoding and decompressing to speed up locating, accessing, and retrieving data.
- An improved spreadsheet application that allows a user to generate, manipulate, and replicate data visualizations (e.g., sparklines, graphs, charts, etc.) using functions without importing data into cells of the application
- Approaches for indexing and comparing charts
Interface Arrangement (35 patent families)
This cluster has patent families covering inventions for the Search and Discovery layer. The list below is an overview of the kind of inventions Palantir has filed under the Interface Arrangement cluster:
- A gesture management system for UI wherein when a user waves hands to the right, for example, the system may trigger rotation of a globe to the right or panning of a surface to the right.
- Graphic representation of time-based results. Consider gas extraction, where output versus time is displayed on a graph for an analyst. If the analyst observes a drop in production, they may highlight a portion of the graph where output was "ideal" and set that portion as the baseline. This lets the analyst find what changed in the inputs, and why, between the baseline period and the period being studied.
- A convenient, digestible overview of tactical and/or strategic data in a single UI for Emergency Call Data of a Law Enforcement Agency.
- UI that allows distorting nodes in a graph. It can present multiple nodes in a particular area of interest in a manner that makes it easier for an analyst to interact with one or more nodes, and can visualize a graph better than a simple zoomed-in view.
Security (17 patent families)
Patent families in this cluster are on features of the Collaboration Layer. The list below is a peek at the kind of inventions covered in this sub-cluster:
- Data analysis system that may automatically analyze a suspected malware file, or group of files.
- For Protecting Against Malicious Code
- A system that allows multiple users to collaborate on a document while ensuring that each user can see only the portion of the document commensurate with their access level.
- A central computer site on a computer network for detecting authorized or unauthorized duplication of software on computers connected to such a network.
Handling Natural Language Data (20 patent families)
Patents that cover text and natural language processing, language translation, spreadsheets, and processing of markup languages are classified under this sub-cluster. In Palantir's patent portfolio, this sub-cluster has inventions for analyzing large bodies of textual data, for providing access to a data object from within a spreadsheet, for generating a new workflow for an application, for enhanced verification in which a classification computer trains a classifier on a set of training documents, for annotating and linking electronic documents, etc.
Program Control Unit (20 patent families)
Patents in this cluster cover inventions on the runtime execution of programs. Below are some of the inventions covered in this sub-cluster:
- For creating and managing dashboards.
- Pipeline Task Verification
- System Architecture For Efficient Inter-Application Communications
- Module Assignment Management
- For secure interfacing with a cloud computing service
- Remote configuration of a computing machine
Palantir Patent Portfolio: Data Processing Systems (31 Patent Families)
This tech cluster of Palantir’s patent portfolio covers data processing systems for managing, promoting, or practicing commercial or financial activities. These patent families cover different features and applications of Palantir’s Foundry platform.
A total of 31 patent families of Palantir are classified in this cluster. 12 patent families in this cluster focus on the governance or management of an organization, enterprise, or employees, 2 on Payment Architectures, 7 on Commerce, 8 on Finance, and one each on the Manufacturing and Service sectors.
The Administration & Management sub-cluster covers inventions such as data processing systems for surveillance and monitoring, tracking systems, analysis of healthcare data, a data audit system that generates and displays a tracking interface, fraud detection in the context of health insurance, etc.
The Finance sub-cluster covers data processing techniques for analyzing market dynamics, detecting card breaches, using multiple predictive data models to predict the profitability of a customer or household, etc.
The Commerce sub-cluster discloses techniques such as determining a customer's inclination to take a specified action (for example, leaving), using ML to automatically process already stored data to find fraudulent or abusive users, detecting payment card fraud by identifying compromised card readers, a UI with a map based on location-based interaction data, collecting and classifying billions of data points according to different behavioral patterns, etc.
The inventions in the Payments sub-cluster are on data analysis to detect and score potential money laundering activities, and on detecting fraudulent transactions such as unauthorized trading activity.
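As a quick sanity check, the sub-cluster counts quoted above can be summed and compared against the cluster total and the 486 families in the whole portfolio. This is only an arithmetic sketch; the dictionary keys are shorthand labels chosen for the example.

```python
# Sub-cluster counts as quoted in the paragraph above.
sub_clusters = {
    "Administration & Management": 12,
    "Payment Architectures": 2,
    "Commerce": 7,
    "Finance": 8,
    "Manufacturing": 1,
    "Service": 1,
}

# The sub-clusters should account for every family in the cluster.
cluster_total = sum(sub_clusters.values())
assert cluster_total == 31

# Share of the overall portfolio (486 patent families), in percent.
portfolio_families = 486
cluster_share = round(100 * cluster_total / portfolio_families, 1)

print(f"Data Processing Systems: {cluster_total} families, {cluster_share}% of portfolio")
```

The six sub-clusters add up to exactly 31 families, or about 6.4% of Palantir's 486 patent families.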
Palantir Patent Portfolio: Digital Information Transmission (50 Patent Families)
Palantir's 50 patent families on Data Transmission cover techniques and methods related to Network Security, Cryptography, Access Control, Policy Control, Distributed Computing, Client-Server Architecture, and the like. 17 of the 50 patent families in this tech cluster are focused on Malicious Traffic. These patent families cover techniques that detect malicious software, email phishing attacks, account compromise, and vulnerabilities in a computer network, as well as security systems that detect and prevent cyber-attacks on an organization. In 2009 and 2010, Palantir was used in uncovering the China-based cyberespionage networks GhostNet and the Shadow Network, respectively.
Palantir Patent Portfolio: Image Data Processing and Generation Patents (18 Patent Families)
The focus of Palantir’s patent families in this cluster is on increasing cognitive and ergonomic efficiencies in interactions and presentation of data. The patent families in this cluster cover:
- Techniques to display an interactive geospatial map.
- Time series data for analysis.
- A visualization system for event participation flows.
- A UI for a defect management system.
- A passing system with an interactive UI that provides information about vehicles or individuals crossing one or more markers.
Multiple viewshed analysis, which has applications in both the military and civilian worlds, and a system that generates alerts for user review after collecting data from a large number of entities (machines in a manufacturing unit, oil rigs, computers on a network, etc.) are also protected in this technology cluster.
Healthcare (2 Patent Families)
One patent family in Bioinformatics is on a UI for visualization of genomic data. This patent belongs to Palantir’s Foundry Platform for the Healthcare industry. The NIH uses the Foundry platform to understand scientific data from dozens of internal and external sources. For example, in their studies to enable precision medicine, NIH used the Foundry Platform to understand how genetic and other factors impact drug efficacy.
Another patent family categorized under Healthcare Informatics is on care management software. The patent shares a technique to identify members to target for therapeutic intervention.
Artificial Intelligence (7 Patent Families)
Palantir has 7 patent families covering computer systems based on specific computational models in its patent portfolio. You can see a further distribution of these patent families in the exhibit below.
Three patent families in Knowledge-based Models cover crime risk forecasting (predictive policing), which is a key feature of Palantir's Metropolis; reducing failure rates of manufactured products; and identifying and categorizing electronic documents through machine learning.
Two patent families in Machine Learning are on selecting machine learning training data and on a vector modeling system for distributed data sets. One patent family in Biological Models is on using AI to identify prior art patent references for a subject patent application, and the one in Mathematical Models is on processing sensor logs.