Invention for Machine Learning Model Training Method and Apparatus, Server, and Storage Medium

Invented by Wei Zhao, Yabing FENG, Yu Liao, Junbin LAI, Haixia CHAI, Xuanliang PAN, Lichun LIU, Tencent Technology Shenzhen Co Ltd

The market for machine learning model training methods and apparatus, servers, and storage mediums is experiencing significant growth and is expected to continue expanding in the coming years. Machine learning has become an integral part of various industries, including healthcare, finance, retail, and technology, driving the demand for advanced training methods and infrastructure. Machine learning model training refers to the process of training algorithms to learn from data and make accurate predictions or decisions. This training requires substantial computational power, specialized hardware, and large storage capacities. As the complexity and size of machine learning models increase, the need for efficient training methods and robust infrastructure becomes crucial.

One of the key factors driving the market growth is the increasing adoption of machine learning in various applications. Industries are leveraging machine learning models to gain insights from vast amounts of data, automate processes, and improve decision-making. This has led to a surge in demand for training methods and apparatus that can handle the computational requirements of these models.

Additionally, advancements in hardware technology have played a significant role in the market's growth. Specialized hardware, such as graphics processing units (GPUs) and tensor processing units (TPUs), has emerged as a powerful tool for accelerating machine learning model training. These hardware solutions offer parallel processing capabilities, enabling faster and more efficient training. The market for servers and storage mediums is also witnessing substantial growth due to the increasing need for high-performance computing and data storage. Machine learning models require significant computational resources, and organizations are investing in powerful servers to handle the training workloads.
Moreover, the growing volume of data generated by businesses necessitates robust storage solutions to store and manage the data used for training. Cloud computing has emerged as a key trend in the market, offering scalable and flexible infrastructure for machine learning model training. Cloud service providers offer machine learning platforms that provide pre-configured environments, specialized hardware, and storage options, reducing the upfront investment required by organizations. This has made machine learning more accessible to businesses of all sizes, further fueling market growth.

The market is highly competitive, with several key players offering a range of training methods, apparatus, servers, and storage mediums. Companies are investing in research and development to develop innovative solutions that can handle the increasing complexity and scale of machine learning models. Partnerships and collaborations between hardware manufacturers, software developers, and cloud service providers are also becoming common to provide comprehensive solutions to customers.

In conclusion, the market for machine learning model training methods and apparatus, servers, and storage mediums is witnessing significant growth driven by the increasing adoption of machine learning across industries. The demand for efficient training methods, specialized hardware, and robust infrastructure is on the rise as organizations strive to leverage the power of machine learning to gain a competitive edge. With advancements in technology and the emergence of cloud computing, the market is expected to continue expanding in the foreseeable future.

The Tencent Technology Shenzhen Co Ltd invention works as follows

A machine-learning model training method includes training a model on the features of each sample in a training set, based on an initial first weight and an initial second weight of each sample. In one iteration, the method includes determining a first sample set, containing the samples whose target variables were incorrectly predicted, and a second sample set, containing the samples whose target variables were correctly predicted, based on the predicted loss of each sample. It also involves determining an overall predicted loss of the first sample set based on the predicted loss and the first weight of each sample in that set. The method then includes updating the first weight and the second weight of each sample in the first set based on the overall predicted loss, inputting the updated second weight, the features, and the target variable of each sample to the machine-learning model, and initiating the next iteration.

Background for Machine Learning Model Training Method and Apparatus, Server, and Storage Medium

Machine Learning (ML) spans multiple fields and is increasingly applied in real-world industries.

A supervised method is currently the most popular way to train a machine-learning model. A machine-learning model is trained on the features of samples in a training set (for instance, the title and content of an e-mail, or the credit reporting information of a person) and a classification result (also called a target variable, for example, a credit score of an individual).

For example, a machine-learning model can distinguish high-quality customers from low-quality customers in a credit reporting service, spam from regular mail in a mailing system, or lost customers from retained customers in a business.

However, when training a machine-learning model involves training multiple classifiers in a supervised way, the classification results of some samples in the training set remain difficult to predict correctly.

Embodiments of the present disclosure provide a machine-learning model training method and apparatus, a server, and a storage medium that can improve the prediction accuracy and training efficiency of a machine-learning model.

The technical solutions of the embodiments of the present disclosure are implemented as follows:

According to an aspect of the present disclosure, a computing device can execute a method for training a machine-learning model. The method involves training a machine-learning model on the features of each sample in a training set, based on an initial first weight and an initial second weight of each sample. In one iteration, the method determines, based on the predicted loss of each sample in the training set, a first sample set containing the samples whose target variables were incorrectly predicted and a second sample set containing the samples whose target variables were correctly predicted. The method includes calculating an overall predicted loss of the first sample set based on the predicted loss and the corresponding first weight of each sample in it, and updating the first weight and the second weight of each sample in the first set based on that overall predicted loss. The method then includes inputting the updated second weight, the features, and the target variable of each sample in the training set to the machine-learning model and initiating the next iteration.
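The iteration described above can be sketched in Python. The concrete loss function, the rule for splitting correct from incorrect predictions, and the exponential weight update below are illustrative assumptions; the disclosure does not fix their exact form.

```python
import math

def train_iteration(predict_loss, samples, first_w, second_w, beta=0.5):
    """One reweighting iteration of the claimed method (a sketch).

    predict_loss(features, target) -> per-sample predicted loss;
    this callable and the update factor `beta` are stand-ins,
    not specified by the patent text.
    """
    # Split the training set by prediction outcome: a positive loss
    # marks a sample whose target variable was predicted incorrectly.
    wrong = [i for i, (x, y) in enumerate(samples) if predict_loss(x, y) > 0]
    right = [i for i in range(len(samples)) if i not in wrong]

    # Overall predicted loss of the first (incorrectly predicted) set,
    # weighted by each sample's first weight.
    overall = sum(first_w[i] * predict_loss(*samples[i]) for i in wrong)

    # Update both weights of every incorrectly predicted sample; the
    # boost grows with the overall predicted loss.
    factor = math.exp(beta * overall)
    for i in wrong:
        first_w[i] *= factor
        second_w[i] *= factor

    # The updated second weights, features, and target variables would
    # then be fed back to the model to start the next iteration.
    return wrong, right, overall

# Toy run: a fixed threshold "model" on one feature, with a 0/1 loss.
samples = [((0.2,), 0), ((0.9,), 1), ((0.4,), 1)]
predict_loss = lambda x, y: abs((1 if x[0] > 0.5 else 0) - y)
first_w, second_w = [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]
wrong, right, overall = train_iteration(predict_loss, samples, first_w, second_w)
```

In the toy run, only the third sample (feature 0.4, target 1) is mispredicted, so only its two weights are increased before the next iteration.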

According to another aspect of the present disclosure, a machine-learning model training apparatus includes a memory and one or more processors. The one or more processors are configured to train a machine-learning model on the features of each sample in a training set, based on an initial first weight and an initial second weight of each sample. In one iteration, the one or more processors are configured to determine, based on the predicted loss of each sample, a first sample set containing the samples whose target variables were incorrectly predicted and a second sample set containing the samples whose target variables were correctly predicted. The one or more processors are also configured to determine an overall predicted loss of the first sample set based on the predicted loss and the corresponding first weight of each sample in it, and to update the first weight and the second weight of each sample in the first set based on that overall predicted loss. The one or more processors are further configured to input the updated second weight, the features, and the target variable of each sample in the training set to the machine-learning model and to initiate the next iteration.

According to another aspect of the present disclosure, a non-transitory storage medium stores an executable program. The executable program, when executed by a processor, instructs the processor to train a machine-learning model on the features of each sample in a training set, based on an initial first weight and an initial second weight of each sample. In one iteration, the executable program instructs the processor to determine, based on the predicted loss of each sample in the training set, a first sample set containing the samples whose target variables were incorrectly predicted and a second sample set containing the samples whose target variables were correctly predicted. The executable program also causes the processor to determine an overall predicted loss of the first sample set based on the predicted loss and the corresponding first weight of each sample in it, and to update the first weight and the second weight of each sample in the first set based on that overall predicted loss. The executable program then causes the processor to input the updated second weight, the features, and the target variable of each sample in the training set to the machine-learning model and to initiate the next iteration of training.

The following benefits are associated with the embodiments of this disclosure:

The machine-learning model is first trained with the samples distributed according to their second weights. Once the samples that the machine-learning model predicts incorrectly are identified (the first sample set), their weights are increased correspondingly to update the sample distribution. In subsequent training, the machine-learning model therefore pays more attention to the incorrectly predicted samples, which improves prediction accuracy.
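A small numeric illustration of why boosting a weight shifts the model's attention: after the second weight of a mispredicted sample is increased, that sample contributes a larger share of the weighted training loss, so a model that minimizes this loss must fit it better. The losses, weights, and the normalized weighted average below are made-up illustrative values, not taken from the disclosure.

```python
# Per-sample losses from one iteration; only sample 2 was mispredicted.
losses = [0.0, 0.0, 1.0]
weights = [1.0, 1.0, 1.0]   # second weights before the update

def weighted_loss(losses, weights):
    # Normalized weighted average, treating the weights as a distribution.
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

before = weighted_loss(losses, weights)   # mispredicted sample: 1/3 share
weights[2] *= 2.0                         # boost the mispredicted sample
after = weighted_loss(losses, weights)    # its share grows to 2/4
```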

The training efficiency of the model is improved as well.
