Automatic generation of family music albums based on multi-modal fusion
Received: 2017-08-28
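Cite this article: Liu Junfang, Shao Xi. Automatic generation of family music albums based on multi-modal fusion[J]. Journal of Nanjing University of Information Science and Technology, 2017, (6): 661-668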
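Author  Affiliation  E-mail
Liu Junfang  College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003
Shao Xi  College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003  shaoxi@njupt.edu.cn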
Funding: National Natural Science Foundation of China (61401227); Beijing Natural Science Foundation (4152053)
Abstract: With the growth of big data and social networks, electronic photo albums and online services have become basic uses of computers and the Internet. The popularity of social networks in recent years has made the number of electronic albums grow explosively, so improving the album user experience has become particularly important. An album with a given theme usually carries emotional information. This paper therefore studies the automatic generation of family music albums based on multi-modal fusion, so that users can listen to music while viewing album photos whose emotion matches that of the music. To capture the emotion carried by music and images, sentence-level audio features and image features that express emotion are selected from each modality. To fuse these heterogeneous, cross-modal features, Locality Preserving Projection (LPP) is used to map the image features and music features into a latent feature space with stronger emotion-classification ability, from which the music album is generated automatically. In the objective evaluation, LPP achieves higher precision than pure Canonical Correlation Analysis (CCA). In the subjective evaluation, LPP reaches a satisfaction rate of 72.06%, which is close to that of manual recommendation (78.09%) and clearly higher than that of random recommendation and of CCA.
Keywords: music album  emotion model  sentence-level  multi-modal fusion  latent space
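
The locality preserving projection step named in the abstract can be illustrated with a minimal sketch of generic LPP. The abstract does not say how the paper builds its neighbourhood graph across images and music, nor which features or parameters it uses, so the function name `lpp`, the k-nearest-neighbour graph, the heat-kernel weights, the regularisation term `reg`, and the toy data are all assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import eigh


def lpp(X, n_components=2, n_neighbors=5, t=None, reg=1e-6):
    """Generic LPP sketch. X has shape (n_samples, n_features).

    Returns a projection matrix of shape (n_features, n_components).
    """
    n_samples, n_features = X.shape

    # Pairwise squared Euclidean distances between samples.
    sq_norms = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)

    # k nearest neighbours of each sample, excluding the sample itself.
    masked = dist2.copy()
    np.fill_diagonal(masked, np.inf)
    neighbors = np.argsort(masked, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n_samples), n_neighbors)
    cols = neighbors.ravel()

    # Heat-kernel edge weights; the bandwidth defaults to the median
    # neighbour distance (an assumption, not taken from the paper).
    if t is None:
        t = max(np.median(dist2[rows, cols]), 1e-12)
    W = np.zeros((n_samples, n_samples))
    W[rows, cols] = np.exp(-dist2[rows, cols] / t)
    W = np.maximum(W, W.T)  # make the graph symmetric

    # Graph Laplacian L = D - W, with D the diagonal degree matrix.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # Solve X^T L X a = lambda X^T D X a and keep the eigenvectors with
    # the smallest eigenvalues; reg keeps the right-hand matrix positive definite.
    lhs = X.T @ L @ X
    rhs = X.T @ D @ X + reg * np.eye(n_features)
    _, eigvecs = eigh(lhs, rhs)  # eigenvalues are returned in ascending order
    return eigvecs[:, :n_components]


if __name__ == "__main__":
    # Toy data: two well-separated clusters standing in for two emotion classes.
    rng = np.random.default_rng(0)
    happy = rng.normal(loc=0.0, scale=0.5, size=(20, 6))
    sad = rng.normal(loc=3.0, scale=0.5, size=(20, 6))
    features = np.vstack([happy, sad])
    projection = lpp(features, n_components=2)
    print((features @ projection).shape)
```

Samples that are neighbours in the input space stay close after projection, which is why a nearest-neighbour lookup in the latent space is a plausible way to pair a music segment with images. How the paper actually performs that pairing is not stated in the abstract.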
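Copyright: Editorial Office of the Journal of Nanjing University of Information Science and Technology, Periodical Press of Nanjing University of Information Science and Technology
Address: Nanjing University of Information Science and Technology, 219 Ningliu Road, Nanjing, Jiangsu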