Patent search ap:"Dathena Science Pte. Ltd." Page 1

1.

Granted invention patent
Optical character recognition systems and methods for personal data extraction (Status: Active)

Publication number: US12046067B2

Publication date: 2024-07-23

Application number: US17348433

Application date: 2021-06-15

Applicant: Dathena Science Pte. Ltd.

Inventor: Christopher Muffat, Tetiana Kodliuk

IPC: G06V30/416, G06F18/22, G06N3/045, G06N3/08, G06V10/46, G06V30/10, G06V30/164, G06V30/28

CPC classification number: G06V30/416, G06F18/22, G06N3/045, G06N3/08, G06V10/462, G06V30/10, G06V30/164, G06V30/293

Abstract: Methods and systems for extracting personal data from a sensitive document are provided. The system includes a document prediction module, a cropping module, a denoising module, and an optical character recognition (OCR) module. The document prediction module predicts type of document of the sensitive document using a keypoint matching-based approach and the cropping module extracts document shape and extracts one or more fields comprising text or pictures from the sensitive document. The denoising module prepares the one or more fields for optical character recognition, and the OCR module performs optical character recognition on the denoised one or more fields to detect characters in the one or more fields.

2.

Invention patent application
DEEP LEARNING ENGINE AND METHODS FOR CONTENT AND CONTEXT AWARE DATA CLASSIFICATION (Status: Under examination, published)

Publication number: US20200279105A1

Publication date: 2020-09-03

Application number: US16731259

Application date: 2019-12-31

Applicant: Dathena Science Pte Ltd

Inventor: Christopher MUFFAT, Tetiana KODLIUK

IPC: G06K9/00, G06K9/62, G06N3/04, G06N3/08

Abstract: Methods, systems and deep learning engines for content and context aware data classification by business category and confidentiality level are provided. The deep learning engine includes a feature extraction module and a classification and labelling module. The feature extraction module extracts both context features and document features from documents and the classification and labelling module is configured for content and context aware data classification of the documents by business category and confidentiality level using neural networks.

3.

Invention patent application
METHODS, PERSONAL DATA ANALYSIS SYSTEM FOR SENSITIVE PERSONAL INFORMATION DETECTION, LINKING AND PURPOSES OF PERSONAL DATA USAGE PREDICTION (Status: Under examination, published)

Publication number: US20200250139A1

Publication date: 2020-08-06

Application number: US16731351

Application date: 2019-12-31

Applicant: Dathena Science Pte Ltd

Inventor: Christopher MUFFAT, Tetiana KODLIUK

IPC: G06F16/14, G06F16/182, G06F16/16, G06F21/62, G06K9/62, G06N20/00

Abstract: Systems and methods for personal data classification, linkage and purpose of processing prediction are provided. The system for personal data classification includes an entity extraction module for extracting personal data from one or more data repositories in a computer network or cloud infrastructure, a linkage module coupled to the entity extraction module, and a processing prediction module. The entity extraction module performs entity recognition from the structured, semi-structured and unstructured records in the one or more data repositories. The linkage module uses graph-based methodology to link the personal data to one or more individuals. The purpose prediction module includes a feature extraction module and a purpose of processing prediction module, wherein the feature extraction module extracts both context features and record's features from records in the one or more data repositories, and the purpose of processing prediction module predicts a unique or multiple purpose of processing of the personal data.
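
Illustrative note (not part of the patent record): the abstract says only that the linkage module uses a "graph-based methodology". One common way to realise that idea, assumed here purely for illustration, is to treat records as graph nodes, connect records that share an identifier value, and read each connected component as one individual. The function name `link_records` and the sample records are hypothetical.

```python
from collections import defaultdict


def link_records(records):
    """Group record ids that share any identifier value (connected components)."""
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent pointers to the component root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # identifier value -> first record id that carried it
    for rid, identifiers in records.items():
        for value in identifiers:
            if value in first_seen:
                # Shared identifier: merge the two components.
                parent[find(rid)] = find(first_seen[value])
            else:
                first_seen[value] = rid

    groups = defaultdict(set)
    for rid in records:
        groups[find(rid)].add(rid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records extracted from different repositories.
records = {
    "crm-1": {"alice@example.com", "+65-0000-0001"},
    "hr-7": {"alice@example.com", "passport-X1"},
    "mail-3": {"passport-X1"},
    "crm-2": {"bob@example.com"},
}
linked = link_records(records)
print(linked)
```

In this sketch `mail-3` is linked to `crm-1` only indirectly, through `hr-7`, which is the property that makes a graph view more useful than pairwise matching.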

4.

Invention patent application
Fully Explainable Document Classification Method And System (Status: Active)

Publication number: US20210374533A1

Publication date: 2021-12-02

Application number: US17331938

Application date: 2021-05-27

Applicant: Dathena Science Pte. Ltd.

Inventor: Christopher MUFFAT, Tetiana KODLIUK, Adel RAHIMI

IPC: G06N3/08, G06K9/62, G06K9/00

Abstract: Methods, systems and computer readable medium for explainable artificial intelligence are provided. The method for explainable artificial intelligence includes receiving a document and pre-processing the document to prepare information in the document for processing. The method further includes processing the information by an artificial neural network for one or more tasks. In addition, the method includes providing explanations and visualization of the processing by the artificial neural network to a user during processing of the information by the artificial neural network.

5.

Invention patent application
SYSTEMS AND METHODS FOR SUBSET SELECTION AND OPTIMIZATION FOR BALANCED SAMPLED DATASET GENERATION (Status: Under examination, published)

Publication number: US20200250241A1

Publication date: 2020-08-06

Application number: US16730111

Application date: 2019-12-30

Applicant: Dathena Science Pte Ltd

Inventor: Christopher Muffat, Tetiana Kodliuk

IPC: G06F16/93, G06F16/35, G06K9/62, G06N20/00, G06F21/62, G06F16/9035

Abstract: Methods and systems for data management of documents in one or more data repositories in a computer network or cloud infrastructure are provided. The method includes sampling the documents in the one or more data repositories and formulating representative subsets of the sampled documents. The method further includes generating sampled data sets of the sampled documents and balancing the sampled data sets for further processing of the sampled documents. The formulation of the representative subsets is performed for identification of some of the representative subsets for initial processing.

6.

Granted invention patent
Methods and text summarization systems for data loss prevention and autolabelling (Status: Active)

Publication number: US11461371B2

Publication date: 2022-10-04

Application number: US16731356

Application date: 2019-12-31

Applicant: Dathena Science Pte Ltd

Inventor: Christopher Muffat, Tetiana Kodliuk

IPC: G06F7/00, G06F16/28, G06F16/93, G06N5/04, G06F16/242, G06N20/00

Abstract: Methods and systems for data loss prevention and autolabelling of business categories and confidentiality based on text summarization are provided. The method for data loss prevention includes entering a combination of keywords and/or keyphrases and offline unsupervised mapping of a path of transfer of specific groups of documents. The offline unsupervised mapping includes keyword/keyphrase extraction from the specific groups of documents and normalization of candidates. The method further includes vectorization of the extracted keywords/keyphrases from the specific groups of documents and quantitative performance measurement of the keyword/keyphrase extraction to derive keywords and/or keyphrases suitable for data loss prevention.

7.

Invention patent application
METHODS AND TEXT SUMMARIZATION SYSTEMS FOR DATA LOSS PREVENTION AND AUTOLABELLING (Status: Under examination, published)

Publication number: US20200226154A1

Publication date: 2020-07-16

Application number: US16731356

Application date: 2019-12-31

Applicant: Dathena Science Pte Ltd

Inventor: Christopher Muffat, Tetiana Kodliuk

IPC: G06F16/28, G06F16/93, G06N20/00, G06F16/242, G06N5/04

Abstract: Methods and systems for data loss prevention and autolabelling of business categories and confidentiality based on text summarization are provided. The method for data loss prevention includes entering a combination of keywords and/or keyphrases and offline unsupervised mapping of a path of transfer of specific groups of documents. The offline unsupervised mapping includes keyword/keyphrase extraction from the specific groups of documents and normalization of candidates. The method further includes vectorization of the extracted keywords/keyphrases from the specific groups of documents and quantitative performance measurement of the keyword/keyphrase extraction to derive keywords and/or keyphrases suitable for data loss prevention.

8.

Granted invention patent
Methods, personal data analysis system for sensitive personal information detection, linking and purposes of personal data usage prediction (Status: Active)

Publication number: US12039074B2

Publication date: 2024-07-16

Application number: US16731351

Application date: 2019-12-31

Applicant: Dathena Science Pte Ltd

Inventor: Christopher Muffat, Tetiana Kodliuk

IPC: G06F16/182, G06F16/14, G06F16/16, G06F18/21, G06F21/62, G06N20/00, G06V10/82, G06V30/196, G06V30/262, G06V30/412

CPC classification number: G06F21/6245, G06F16/148, G06F16/156, G06F16/164, G06F16/182, G06F18/2185, G06N20/00, G06V10/82, G06V30/1988, G06V30/274, G06V30/412

Abstract: Systems and methods for personal data classification, linkage and purpose of processing prediction are provided. The system for personal data classification includes an entity extraction module for extracting personal data from one or more data repositories in a computer network or cloud infrastructure, a linkage module coupled to the entity extraction module, and a processing prediction module. The entity extraction module performs entity recognition from the structured, semi-structured and unstructured records in the one or more data repositories. The linkage module uses graph-based methodology to link the personal data to one or more individuals. The purpose prediction module includes a feature extraction module and a purpose of processing prediction module, wherein the feature extraction module extracts both context features and record's features from records in the one or more data repositories, and the purpose of processing prediction module predicts a unique or multiple purpose of processing of the personal data.

9.

Granted invention patent
Method, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection (Status: Active)

Publication number: US12033040B2

Publication date: 2024-07-09

Application number: US17268381

Application date: 2018-08-14

Applicant: Dathena Science Pte. Ltd.

Inventor: Christopher Muffat

IPC: G06N20/00, G06F18/23213, G06F40/284, G06F40/30

CPC classification number: G06N20/00, G06F18/23213, G06F40/284, G06F40/30

Abstract: Systems, methods and computer readable medium are provided for performing a method for content and context aware data classification or a method for content and context aware data security anomaly detection. The method for content and context aware data confidentiality classification includes scanning one or more documents in one or more network data repositories of a computer network and extracting content features and context features of the one or more documents into one or more term frequency-inverse document frequency (TF-IDF) vectors and one or more latent semantic indexing (LSI) vectors. The method further includes classifying the one or more documents into a number of category classifications by machine learning the extracted content features and context features of the one or more documents at a file management platform of the computer network, each of the category classifications being associated with one or more confidentiality classifications.
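
Illustrative note (not part of the patent record): the abstract names TF-IDF vectors as one of the feature representations. A minimal sketch of the textbook weighting, tf × log(N / df), is given below. It covers content terms only; the context features, the LSI step and the classifier are omitted, and the function name `tfidf_vectors`, the whitespace tokenizer and the sample documents are assumptions made for illustration rather than details from the patent.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical document contents.
docs = [
    "confidential salary report",
    "public press release",
    "confidential merger report",
]
vectors = tfidf_vectors(docs)
for vec in vectors:
    print(sorted(vec, key=vec.get, reverse=True))
```

Terms that occur in fewer documents receive a larger weight, so in the first document "salary" outweighs "confidential", which also appears in the third.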

10.

Granted invention patent
Systems and methods for subset selection and optimization for balanced sampled dataset generation (Status: Active)

Publication number: US11675926B2

Publication date: 2023-06-13

Application number: US16730111

Application date: 2019-12-30

Applicant: Dathena Science Pte Ltd

Inventor: Christopher Muffat, Tetiana Kodliuk

IPC: G06F16/93, G06F21/62, G06F16/9035, G06N20/00, G06F16/906, G06F18/23213

CPC classification number: G06F21/6245, G06F16/906, G06F16/9035, G06F16/93, G06F18/23213, G06N20/00

Abstract: Methods and systems for data management of documents in one or more data repositories in a computer network or cloud infrastructure are provided. The method includes sampling the documents in the one or more data repositories and formulating representative subsets of the sampled documents. The method further includes generating sampled data sets of the sampled documents and balancing the sampled data sets for further processing of the sampled documents. The formulation of the representative subsets is performed for identification of some of the representative subsets for initial processing.
