Abstract:
A method for providing metadata to a search engine for a document that is not in a mark-up language receives a request for contents of the document and locates metadata associated with the document. The method further creates name-value pairs for the metadata and provides to the search engine server a response comprising the name-value pairs in an HTTP (or HTTPS) header and the contents of the document. In other implementations, a method includes sending a request for contents of the document and receiving a response to the request comprising the document's content and an HTTP header that carries metadata about the document in a name-value pair. The method also includes extracting the name-value pair from the HTTP header, creating a mark-up language tag for the name-value pair, and providing the mark-up language tag and the contents of the document in a mark-up language format to a search index creation component.
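The serving side of this method can be illustrated with a minimal Python sketch. It assumes the metadata has already been located and is held in a dictionary. The header prefix `X-Doc-Meta-`, the function name `build_response`, and the choice to percent-encode values are illustrative assumptions; the abstract does not specify any of them.

```python
from urllib.parse import quote

# Hypothetical header prefix; the abstract does not name one.
METADATA_PREFIX = "X-Doc-Meta-"


def build_response(metadata, content, content_type="application/pdf"):
    """Return (headers, body) for a document that is not in a mark-up language.

    Each metadata item becomes one name-value pair in the HTTP header list,
    and the document contents are returned unchanged as the body.
    """
    headers = [
        ("Content-Type", content_type),
        ("Content-Length", str(len(content))),
    ]
    for name, value in metadata.items():
        # Percent-encode the value so it stays within header-safe ASCII
        # (an assumption of this sketch, not a requirement of the source).
        headers.append((METADATA_PREFIX + name, quote(value, safe="")))
    return headers, content


if __name__ == "__main__":
    hdrs, body = build_response({"author": "Jane Doe"}, b"%PDF-1.7 ...")
    for name, value in hdrs:
        print(f"{name}: {value}")
```

Keeping the metadata in headers leaves the body byte-for-byte identical to the stored document, so the search engine can still parse it with its usual converter for that file type.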
Abstract:
A method for providing metadata to a search engine for a document that is not in a mark-up language includes sending a request for data about the document and receiving a response to the request. The response includes the document's content and a Hypertext Transfer Protocol (HTTP or HTTPS) header that carries metadata associated with the document in a name-value pair. The method also includes extracting the name-value pair from the HTTP header, creating a mark-up language tag for the name-value pair, and providing the mark-up language tag and the contents of the document in a mark-up language format to a search index creation component.
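The receiving side of this method might look like the following Python sketch. It assumes metadata headers are marked with a hypothetical `X-Doc-Meta-` prefix and carry percent-encoded values, and that the document text has already been extracted. HTML `<meta>` tags are used as one possible mark-up language tag; the abstract does not name a specific tag or format.

```python
from html import escape
from urllib.parse import unquote

# Hypothetical header prefix; the abstract does not name one.
METADATA_PREFIX = "X-Doc-Meta-"


def headers_to_meta_tags(headers):
    """Turn metadata name-value pairs found in HTTP headers into <meta> tags."""
    tags = []
    for name, value in headers:
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower().startswith(METADATA_PREFIX.lower()):
            meta_name = name[len(METADATA_PREFIX):]
            tags.append(
                '<meta name="%s" content="%s">'
                % (escape(meta_name, quote=True),
                   escape(unquote(value), quote=True))
            )
    return tags


def wrap_as_html(headers, text):
    """Combine the generated tags and the document text into one HTML page,
    ready to hand to an index creation component."""
    head = "\n".join(headers_to_meta_tags(headers))
    return (
        "<html><head>\n%s\n</head><body><pre>%s</pre></body></html>"
        % (head, escape(text))
    )
```

Headers without the prefix, such as `Content-Type`, are ignored, so only the document's own metadata ends up in the generated tags.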
Abstract:
A system for searching files stored in a closed file source that is not accessible via a web crawler obtains file identifiers for files stored in the file source and creates a unique URL for each of the identifiers. Each URL may be based on a file identifier and a domain portion of a URL associated with the system. The system may provide the unique URLs to a search engine. The system may respond to a crawl request from the search engine for a particular URL by converting the URL back into a file identifier, obtaining the contents of the file, creating an HTTP response from the contents of the file, and returning the response to the search engine. The system may respond to a request for a seed URL with a plurality of URLs as links in a single HTTP response.
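The URL mapping described here can be sketched in Python as follows. The domain `connector.example.com`, the `/doc/` path, and the URL-safe base64 encoding are assumptions made for illustration; the abstract only requires that each URL be unique, based on a file identifier and the system's domain, and convertible back into that identifier.

```python
import base64

# Hypothetical domain portion associated with the system.
BASE_URL = "https://connector.example.com"


def file_id_to_url(file_id):
    """Create a unique URL from a file identifier."""
    token = base64.urlsafe_b64encode(file_id.encode("utf-8")).decode("ascii")
    # Drop the '=' padding so the token is a clean path segment.
    return f"{BASE_URL}/doc/{token.rstrip('=')}"


def url_to_file_id(url):
    """Convert a crawled URL back into the original file identifier."""
    prefix = f"{BASE_URL}/doc/"
    if not url.startswith(prefix):
        raise ValueError("not a URL issued by this system")
    token = url[len(prefix):]
    token += "=" * (-len(token) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(token).decode("utf-8")


def seed_page(file_ids):
    """Build the seed response: one HTML page listing every URL as a link."""
    links = "\n".join(
        f'<a href="{file_id_to_url(fid)}">{i}</a>'
        for i, fid in enumerate(file_ids)
    )
    return f"<html><body>\n{links}\n</body></html>"
```

A reversible encoding means the system needs no lookup table to resolve a crawl request; an opaque key backed by a database would serve equally well if file identifiers should not be recoverable from the URL.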