Invention Grant
- Patent Title: Preparing structured data sets for machine learning
- Application No.: US16552857
- Application Date: 2019-08-27
- Publication No.: US11861462B2
- Publication Date: 2024-01-02
- Inventor: Nicholas John Teague
- Applicant: Nicholas John Teague
- Applicant Address: US FL Orlando
- Assignee: Nicholas John Teague
- Current Assignee: Nicholas John Teague
- Current Assignee Address: US FL Orlando
- Main IPC: G06N20/00
- IPC: G06N20/00 ; G06F16/90 ; G06F16/27 ; G06F18/21 ; G06F16/901 ; G06F18/214 ; G06F18/2135

Abstract:
A technique for automated preparation of tabular data for machine learning, including options for machine-learning-derived infill, feature importance evaluations, and/or dimensionality reduction. Validation data sets may be prepared consistently with training data sets, based on properties of the training data saved in a metadata database. Additional data sets, such as those used to generate predictions from the trained ML system, may likewise be prepared consistently with training data sets, based on properties of the training data saved in a returned metadata database. Returned data sets may be prepared for oversampling of labels that occur with lower frequency. Columns of a training data set are evaluated for appropriate categories of transformations, with the composition of transformation function applications designated by a defined tree of transformation category assignments to transformation primitives. The composition of transformation trees and their associated transformation functions may optionally be custom defined by a user.
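The idea of preparing additional data consistently with the training data can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`fit_column`, `apply_column`, `prepare_train`, `prepare_additional`), the choice of z-score normalization with mean infill, and the one-hot encoding are all assumptions made for illustration. The sketch only shows the general pattern of saving training-set properties in a metadata dictionary and reusing them later.

```python
from statistics import mean, pstdev


def fit_column(values):
    """Derive per-column metadata from training values (hypothetical helper)."""
    present = [v for v in values if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        # Numeric column: record mean and standard deviation of the training data.
        return {"category": "numeric", "mean": mean(present),
                "std": pstdev(present) or 1.0}
    # Otherwise treat as categorical: record the levels seen in training.
    return {"category": "categorical", "levels": sorted(set(present))}


def apply_column(name, values, meta):
    """Transform one column using only the saved training metadata."""
    if meta["category"] == "numeric":
        # Missing entries are infilled with the training mean, i.e. 0.0 after scaling.
        return {name: [0.0 if v is None else (v - meta["mean"]) / meta["std"]
                       for v in values]}
    # One-hot encode; missing or unseen levels yield all zeros.
    return {f"{name}_{level}": [1 if v == level else 0 for v in values]
            for level in meta["levels"]}


def prepare_train(table):
    """Fit metadata on the training table and return (prepared table, metadata)."""
    metadata = {name: fit_column(col) for name, col in table.items()}
    return prepare_additional(table, metadata), metadata


def prepare_additional(table, metadata):
    """Prepare validation or prediction data from saved metadata alone."""
    prepared = {}
    for name, col in table.items():
        prepared.update(apply_column(name, col, metadata[name]))
    return prepared
```

For example, calling `prepare_train` on a small training table and then `prepare_additional` on a validation table scales the validation values with the training mean and standard deviation, and encodes categories against the training levels only, so both tables end up with the same columns.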
Public/Granted literature
- US20200349467A1 Preparing Structured Data Sets for Machine Learning (Public/Granted day: 2020-11-05)