Abstract:
To assess the operating status of the cigarette market scientifically and efficiently, an intelligent assessment model was developed using big data technology and machine learning algorithms. Model development comprised four steps: data cleaning and standardization; eigenvalue screening via principal component analysis; model building and training on the Spark distributed parallel computing architecture; and validation and optimization of the trained model. The cigarette brands "Nanjing (Jiu Wu)" and "Suyan (Soft Jinsha)", produced by China Tobacco Jiangsu Industrial Limited Corporation, were chosen for model validation. The results indicated that the model reflected the market operating status and development trend of an individual cigarette brand promptly and accurately, with an accuracy above 90% for both brands; the predicted values were in basic agreement with the actual values. The method provides technical support for market trend prediction and marketing decision-making for cigarettes.
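As an illustration of the first two steps (standardization followed by principal component analysis), a minimal single-machine sketch is given here. It is not the authors' Spark implementation: the function names, the toy indicator values, and the use of power iteration to find the leading component are all assumptions made for the example.

```python
import math

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by the column std."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def covariance(rows):
    """Covariance matrix of data that is already centered (population form)."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Power iteration: return (largest eigenvalue, unit eigenvector).

    The start vector is deliberately non-uniform; in the rare case that it is
    orthogonal to the leading eigenvector, this simple sketch will not converge.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                 for i in range(d))
    return eigval, v

# Hypothetical market indicators (columns are illustrative, not from the paper).
raw = [
    [100.0, 10.0, 5.0],
    [200.0, 20.0, 3.0],
    [300.0, 30.0, 8.0],
    [400.0, 40.0, 1.0],
]
z = standardize(raw)
cov = covariance(z)
eigval, component = first_component(cov)
# After z-scoring, total variance equals the trace of the covariance matrix.
explained_ratio = eigval / sum(cov[i][i] for i in range(len(cov)))
print("explained variance ratio of first component:", round(explained_ratio, 3))
```

Components whose eigenvalues account for only a small share of the total variance would be the candidates to discard in the screening step. In a distributed setting the same two operations would normally be delegated to the cluster framework's own scaling and PCA routines instead of hand-written loops.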