Abstract:
An image classification method based on reliable weighted optimal transport (RWOT) includes: preprocessing data in a source domain, so that a deep neural network fits a sample image in the source domain to obtain a sample label; performing image labeling to add a pseudo label to a data sample in a target domain; performing node pairing to pair associated images in the source domain and the target domain; and performing automatic analysis by using a feature extractor and an adaptive discriminator, to perform image classification. The present disclosure proposes a subspace reliability method for dynamically measuring a difference between the source domain and the target domain based on spatial prototypical information and an intra-domain structure. This method can be used as a preprocessing step of an existing domain adaptation technology, and greatly improves efficiency.
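One way to picture the pseudo-labeling step is a nearest-prototype rule: average the source features of each class into a prototype, then give each target sample the label of its closest prototype. The abstract does not say how pseudo labels or reliability are computed, so the following is only a minimal sketch under that assumption. The function names, the Euclidean distance, and the softmax-style confidence are illustrative choices, not the disclosed RWOT procedure.

```python
import math


def class_prototypes(features, labels):
    """Return the mean feature vector (prototype) of each source class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def pseudo_label(x, prototypes):
    """Assign the label of the nearest prototype to target sample x.

    The confidence is a softmax over negative distances. This is an
    assumed stand-in for a reliability score, not the patented measure.
    """
    dists = {y: math.dist(x, p) for y, p in prototypes.items()}
    label = min(dists, key=dists.get)
    weights = {y: math.exp(-d) for y, d in dists.items()}
    confidence = weights[label] / sum(weights.values())
    return label, confidence


# Hypothetical 2-D features for two source classes.
source_features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
source_labels = ["a", "a", "b", "b"]
protos = class_prototypes(source_features, source_labels)
label, confidence = pseudo_label([1.0, 1.0], protos)
```

A confidence of this kind could be used to down-weight unreliable source-target pairs, which is one plausible reading of "reliable weighted" in the abstract.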
Abstract:
Provided are a method and a device for decoding and encoding supplemental auxiliary information of a three-dimensional video sequence. The method includes: obtaining, from a three-dimensional video sequence bitstream, supplemental auxiliary information for constructing a stereo-pair, the supplemental auxiliary information being used for indicating that the stereo-pair is constructed from a reconstructed three-dimensional video sequence, and the reconstructed three-dimensional video sequence being obtained by decoding the three-dimensional video sequence bitstream. The present invention solves the prior-art problem that the display quality of a constructed stereo-pair is degraded by the lack of supplemental auxiliary information in the three-dimensional video sequence bitstream, and thereby improves the display quality of the constructed stereoscopic video.
Abstract:
A video encoding method, a video decoding method, an apparatus, a device, and a storage medium are provided. The method includes: parsing a first flag from a video bitstream when the video bitstream is allowed to be decoded by referencing a library picture corresponding to a library picture bitstream; when a value of the first flag is a first value, using a value of a target parameter of the video bitstream as the value of the target parameter of the library picture bitstream referenced by the video bitstream; and reconstructing, based on the value of the target parameter of the library picture bitstream and on the library picture bitstream referenced by the video bitstream, the library picture corresponding to that library picture bitstream.
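The flag-controlled parameter reuse can be sketched as a small selection function. The abstract does not give the flag's encoding, the identity of the target parameter, or the behavior when the flag takes another value. The name `resolve_library_parameter`, the choice of 1 as the "first value", and the fallback to a separately signaled value are therefore all assumptions made for illustration.

```python
def resolve_library_parameter(first_flag, video_param, library_param=None,
                              first_value=1):
    """Pick the target-parameter value used to reconstruct the library picture.

    If the parsed flag equals the (assumed) first value, the value from the
    video bitstream is reused for the library picture bitstream. Otherwise
    this sketch falls back to a separately supplied value, which is an
    assumption: the abstract does not describe that branch.
    """
    if first_flag == first_value:
        return video_param
    if library_param is None:
        raise ValueError("library parameter must be supplied when not inherited")
    return library_param


inherited = resolve_library_parameter(first_flag=1, video_param=8)
explicit = resolve_library_parameter(first_flag=0, video_param=8, library_param=10)
```

Reusing the value in this way would avoid signaling the same parameter twice when the two bitstreams agree.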
Abstract:
Disclosed are a method and a device for generating a predicted picture. The method comprises: determining a reference rectangular block of pixels according to parameter information which includes a location of a target rectangular block of pixels and/or depth information of a reference view; mapping the reference rectangular block of pixels to a target view according to the depth information of the reference view to obtain a projection rectangular block of pixels; and acquiring a predicted picture block from the projection rectangular block of pixels. This solves the prior-art problem of relatively strong data dependence caused by simultaneously using the depth picture of the target view and the depth picture of the reference view when generating the predicted picture, thereby reducing data dependence and improving encoding and decoding efficiency.
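The mapping step can be illustrated with a simple forward warp that uses only the depth of the reference view. The abstract does not specify a camera model, so this sketch assumes rectified, horizontally aligned views in which a pixel at depth Z shifts by a disparity of focal length times baseline divided by Z. The function name, its parameters, and the use of None for unfilled positions are hypothetical.

```python
def warp_block_to_target(block, depth, focal_px, baseline):
    """Forward-warp a reference block into the target view.

    block and depth are lists of rows with identical shape. Each pixel is
    shifted left by round(focal_px * baseline / depth). When two pixels land
    on the same position, the one with the smaller depth (closer to the
    camera) wins. Positions that receive no pixel stay None (holes).
    """
    height, width = len(block), len(block[0])
    projected = [[None] * width for _ in range(height)]
    nearest = [[float("inf")] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            z = depth[r][c]
            disparity = round(focal_px * baseline / z)
            tc = c - disparity
            if 0 <= tc < width and z < nearest[r][tc]:
                projected[r][tc] = block[r][c]
                nearest[r][tc] = z
    return projected


# One-row reference block: the two right-hand pixels are closer to the camera.
reference_block = [[10, 20, 30, 40]]
reference_depth = [[100.0, 100.0, 50.0, 50.0]]
projection = warp_block_to_target(reference_block, reference_depth,
                                  focal_px=100.0, baseline=1.0)
```

Because only the reference view's depth is read, the warp does not depend on a depth picture of the target view, which matches the reduced data dependence the abstract describes. A predicted block would then be taken from the projected block, with holes filled by some method the abstract leaves open.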
Abstract:
A decoding method includes: obtaining an identifier from a bitstream, where the identifier indicates a minimum decoding time interval k between library pictures that is allowed in the bitstream; when a current decoded picture is decoded by referencing a library picture, obtaining, while parsing the bitstream, a decoding moment ti of the current decoded picture and a decoding moment tj of a first decoded picture that is closest to the current decoded picture and that references a new library picture, where the new library picture is a library picture that is not decoded or needs to be re-decoded when the first decoded picture is decoded; and determining a preset quantity of library pictures as candidate reference pictures of the current decoded picture based on a relationship between k and the difference between ti and tj.
Abstract:
Disclosed are a method and a device for generating a predicted picture. The method comprises: determining a reference rectangular block of pixels by shifting or warping based on parameter information, where the parameter information comprises either a location of a target rectangular block of pixels, or the location of the target rectangular block of pixels and depth information of a reference view; mapping the reference rectangular block of pixels to a target view according to the depth information of the reference view to obtain a projection rectangular block of pixels; and acquiring a predicted picture block from the projection rectangular block of pixels. This solves the prior-art problem of relatively strong data dependence caused by simultaneously using the depth picture of the target view and the depth picture of the reference view when generating the predicted picture, thereby reducing data dependence and improving encoding and decoding efficiency.
Abstract:
The present invention discloses a method for generating a stereoscopic video pair, which can be used in the field of multimedia communication. The stereoscopic video pair is obtained by processing a three-dimensional video sequence using display auxiliary information, and is displayed on a stereoscopic display D1. The display auxiliary information comprises camera viewpoint position information, virtual viewpoint position information, and a display scaling factor S1. The camera viewpoint position information indicates the position of a camera viewpoint C of the three-dimensional video sequence. The virtual viewpoint position information indicates the position of a virtual viewpoint P1. The display scaling factor S1 is the ratio of the horizontal resolution Res1 to the horizontal width W1 of the stereoscopic display D1, i.e., S1=Res1/W1, where W1 approximates the actual width of the display screen. The present invention also discloses a corresponding apparatus for generating the stereoscopic video pair. The present invention can improve the visual experience of stereoscopic viewing.
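The relation S1=Res1/W1 can be written out as a short helper. Since S1 is a count of pixels per unit of screen width, multiplying a horizontal length on the screen by S1 gives that length in pixels. The function names and the example display figures below are hypothetical, and the abstract does not state how S1 is applied when the stereoscopic video pair is generated.

```python
def display_scaling_factor(horizontal_resolution_px, horizontal_width):
    """Return S1 = Res1 / W1, in pixels per unit of screen width."""
    if horizontal_width <= 0:
        raise ValueError("horizontal width must be positive")
    return horizontal_resolution_px / horizontal_width


def length_to_pixels(length, scaling_factor):
    """Convert a horizontal on-screen length to pixels using S1."""
    return length * scaling_factor


# Hypothetical display: 1920 pixels across a 960 mm wide screen.
s1 = display_scaling_factor(1920, 960.0)
shift_px = length_to_pixels(65.0, s1)  # a 65 mm on-screen length in pixels
```

The units of W1 only need to match the units of the lengths being converted.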