- Mountain View CA, US Pankaj RASTOGI - Fremont CA, US Sumanth VENKATASUBBAIAH - Mountain View CA, US Qingbo HU - Burlingame CA, US Karthik PRAKASH - Milpitas CA, US Nicholas Jeffrey HOH - Mountain View CA, US Frank WISNIEWSKI - San Francisco CA, US Abhishek JAIN - Mountain View CA, US Caio Vinicius SOARES - Redwood City CA, US Yuwen WU - Mountain View CA, US
International Classification:
G06F 16/23 G06N 20/00 G06N 5/02 G06F 9/54
Abstract:
Certain aspects of the present disclosure provide techniques for operation of a feature management platform. A feature management platform is an end-to-end platform developed to manage the full lifecycle of data features. For example, to create a stateful feature, the feature management platform can receive a processing artifact from a computing device. The processing artifact defines the stateful feature, including the data source to retrieve event data from, when to retrieve the event data, the type of transform to apply, etc. Based on the processing artifact, the feature management platform generates a processing job (e.g., the API defines a pipeline), which, when initiated, generates a vector that encapsulates the stateful feature. The vector is transmitted to the computing device that locally hosts a model, which generates a prediction that is transmitted to the feature management platform. Subsequently, the prediction and stateful feature can be transmitted to other computing devices.
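As a rough illustration of the artifact-to-job flow described in this abstract, the following Python sketch shows one way a declarative processing artifact could be turned into a job that emits feature vectors. It is not the disclosed implementation: the names `ProcessingArtifact`, `TRANSFORMS`, and `build_job`, the in-memory `sources` dictionary, and the event fields `entity_id` and `value` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical event shape: {"entity_id": str, "value": float}
Event = Dict[str, object]


@dataclass(frozen=True)
class ProcessingArtifact:
    """Hypothetical declarative definition of a stateful feature."""
    feature_name: str
    source: str      # which data source to read event data from
    schedule: str    # when to retrieve the data (kept as metadata only here)
    transform: str   # name of the transform to apply


# Assumed registry of transforms; a real platform would offer its own set.
TRANSFORMS: Dict[str, Callable[[List[float]], float]] = {
    "count": lambda values: float(len(values)),
    "sum": lambda values: float(sum(values)),
}


def build_job(
    artifact: ProcessingArtifact,
    sources: Dict[str, List[Event]],
) -> Callable[[], Dict[str, List[float]]]:
    """Turn an artifact into a callable job that returns one vector per entity."""
    transform = TRANSFORMS[artifact.transform]

    def job() -> Dict[str, List[float]]:
        grouped: Dict[str, List[float]] = {}
        for event in sources[artifact.source]:
            grouped.setdefault(str(event["entity_id"]), []).append(float(event["value"]))
        # Each vector holds a single aggregated value in this sketch.
        return {entity: [transform(values)] for entity, values in grouped.items()}

    return job


sources = {
    "clicks": [
        {"entity_id": "u1", "value": 2.0},
        {"entity_id": "u1", "value": 3.0},
        {"entity_id": "u2", "value": 1.0},
    ]
}
artifact = ProcessingArtifact("click_total", source="clicks", schedule="hourly", transform="sum")
vectors = build_job(artifact, sources)()
```

In this sketch the job is only built from the artifact and runs when called, which mirrors the abstract's distinction between generating a processing job and initiating it.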
Batch To Stream Processing In A Feature Management Platform
- Mountain View CA, US Karthik PRAKASH - Milpitas CA, US Pankaj RASTOGI - Fremont CA, US Sumanth VENKATASUBBAIAH - Mountain View CA, US
International Classification:
G06F 9/445
Abstract:
Certain aspects of the present disclosure provide techniques for a “hand-off” operation of a feature management platform. A feature management platform can receive a request to generate feature data based on batch and streaming data. To generate such feature data, a “hand-off” occurs from a batch processing job to a stream processing job. The feature management platform can initiate the batch processing job to generate a first set of feature data. Once all of the feature data is generated by the batch processing job, the feature data is saved in an offline database. The feature data with the maximum timestamp is saved in an online database, and the maximum timestamp is saved in a persistent database. With the maximum timestamp, the feature management platform begins the stream processing job. Once feature data is generated by the stream processing job, the feature data is stored in an offline or online database.
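The hand-off sequence in this abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the three stores are plain in-memory containers, `to_feature` is a placeholder transform, the stream job skips events at or before the saved timestamp, and stream output is written to both stores (the abstract says only "offline or online").

```python
from typing import Dict, Iterable, List

Row = Dict[str, float]  # hypothetical shape: {"ts": ..., "value": ...}


def to_feature(event: Row) -> Row:
    """Placeholder transform (an assumption): doubles the raw value."""
    return {"ts": event["ts"], "value": event["value"] * 2}


def run_batch_job(events: Iterable[Row], offline: List[Row],
                  online: Dict[str, Row], persistent: Dict[str, float]) -> None:
    """Generate all batch features, then record the hand-off point."""
    rows = [to_feature(e) for e in events]
    offline.extend(rows)                          # all batch output goes offline
    if rows:
        latest = max(rows, key=lambda r: r["ts"])
        online["latest"] = latest                 # newest feature data goes online
        persistent["max_ts"] = latest["ts"]       # maximum timestamp is persisted


def run_stream_job(events: Iterable[Row], offline: List[Row],
                   online: Dict[str, Row], persistent: Dict[str, float]) -> None:
    """Resume from the persisted maximum timestamp."""
    start = persistent.get("max_ts", float("-inf"))
    for event in events:
        if event["ts"] <= start:
            continue                              # assumed: already covered by batch
        row = to_feature(event)
        offline.append(row)
        # Guard against out-of-order events overwriting newer online data.
        if "latest" not in online or row["ts"] >= online["latest"]["ts"]:
            online["latest"] = row


offline: List[Row] = []
online: Dict[str, Row] = {}
persistent: Dict[str, float] = {}

run_batch_job([{"ts": 1, "value": 1.0}, {"ts": 3, "value": 3.0}, {"ts": 2, "value": 2.0}],
              offline, online, persistent)
run_stream_job([{"ts": 3, "value": 3.0}, {"ts": 4, "value": 4.0}],
               offline, online, persistent)
```

Persisting the maximum timestamp separately from the feature data is what lets the stream job start where the batch job stopped without reprocessing events the batch job already covered.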
- Mountain View CA, US Pankaj RASTOGI - Fremont CA, US Sumanth VENKATASUBBAIAH - Mountain View CA, US Qingbo HU - Foster City CA, US Karthik PRAKASH - Milpitas CA, US Nicholas Jeffrey HOH - Sunnyvale CA, US Frank WISNIEWSKI - San Francisco CA, US Abhishek JAIN - Mountain View CA, US Caio Vinicius SOARES - Redwood City CA, US Yuwen Ellen WU - Mountain View CA, US
International Classification:
G06F 16/23 G06F 9/54 G06N 5/02 G06N 20/00
Abstract:
Certain aspects of the present disclosure provide techniques for operation of a feature management platform. A feature management platform is an end-to-end platform developed to manage the full lifecycle of data features. For example, to create a stateful feature, the feature management platform can receive a processing artifact from a computing device. The processing artifact defines the stateful feature, including the data source to retrieve event data from, when to retrieve the event data, the type of transform to apply, etc. Based on the processing artifact, the feature management platform generates a processing job (e.g., the API defines a pipeline), which, when initiated, generates a vector that encapsulates the stateful feature. The vector is transmitted to the computing device that locally hosts a model, which generates a prediction that is transmitted to the feature management platform. Subsequently, the prediction and stateful feature can be transmitted to other computing devices.