
Creation of a multi-source heterogeneous tobacco flavor material dataset

    Abstract: To address the difficulty of searching, accessing, and using data on tobacco flavor materials, a dataset of tobacco flavor materials was built from multi-source heterogeneous data. Basic information, physicochemical properties, and sensory characteristics of flavor materials were collected from public data sources, and sensory evaluation and chemical composition analysis were performed on flavor material samples to obtain test data. The multi-source heterogeneous data were then processed through entry standardization, data structure integration, and data label annotation. The resulting dataset covers more than 1 000 flavor materials and comprises 10 data modules, and the "Tobacco Flavor Material Central Database" platform for the tobacco industry was established on this basis. The distribution of main flavor types, the distribution of olfactory aroma notes, and the correlations between aroma notes and cigarette flavoring effects were analyzed. The results showed that: 1) The dataset provided tobacco flavoring data from multiple dimensions and supported multiple data retrieval functions oriented to application scenarios. 2) Data analysis revealed the distribution characteristics of tobacco flavor materials, and the cigarette flavoring patterns they reflected were largely consistent with practical experience. 3) The dataset was queried more than 15 000 times per year. This research can support the digital transformation of tobacco flavoring.

     
