Promoting the better application of generative AI in sound-related artistic creation
2024-09-05
Sound products are an important form of artistic creation, and their combination with generative AI (artificial intelligence) has opened up new possibilities. With generative AI, sound products are being innovated and transformed in music creation, speech synthesis for audiobooks, virtual anchors, and video sound-effects production, giving artists new creative methods and forms of expression. The application of generative AI in artistic creation offers new opportunities for the development of sound products, but the risks and challenges it brings also need to be actively addressed.
Generative AI products have a wide range of applications
AI music creation is thriving.
Generative AI has been widely applied in music, mainly in three areas: classification and recognition (music retrieval, score recognition, audio recognition), generation (AI composition, virtual singers), and dissemination (MIDI sound, education). AI plays a significant role at each stage of music creation. In the early stage, creators use AIGC (AI-generated content) tools to gather musical material and creative inspiration, to analyze and predict the musical style of their works, and to retrieve and organize material intelligently. In the middle stage, AIGC supports the output of the work, assisting composition and helping to visualize the intention of a piece. In the later stage, AIGC provides powerful audio processing tools for post-production of the prototype work, making post-production more accurate and efficient. The emergence of AI orchestration technologies in particular gives music creators more inspiration while making the creative process more efficient.
AI virtual anchors are reaching the public.
Virtual anchors have entered the public eye through AI speech synthesis, natural language processing, deep learning, and computer vision. Speech synthesis uses advanced algorithms to mimic the human voice, allowing virtual anchors to speak and communicate fluently and naturally. Computer vision plays a crucial role in presenting a virtual anchor's image: 3D modeling and rendering can create highly realistic virtual characters, and facial and motion capture let virtual anchors reproduce a real person's expressions and movements in real time. Affective computing can analyze the audience's language and behavior, judge their emotional state, and adjust the virtual anchor's response style and tone accordingly, giving the anchor emotional variation during interaction.
AI dubbing is widely used.
By learning from large amounts of speech data, AI dubbing technology can simulate the speech characteristics of different characters and emotions, giving AI dubbing products rich character and emotional expression. With the rise of neural networks and deep learning, the combination of "TTS + AI" (text-to-speech plus artificial intelligence) has steadily improved the naturalness and accuracy of synthesized speech. To meet diverse needs such as emotional variation, a range of vocal timbres, and vocal adaptability, AI speech product suppliers now offer customized "emotional TTS" services. By adjusting tone, speed, pauses, and pitch, and even simulating how a human voice changes with mood, synthesized speech can better fit different contexts and scenes and carry richer "emotional" expression.
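The adjustments described above (speed, pitch, pauses) can be illustrated with SSML, the W3C markup that many speech synthesis engines accept. The following is only a minimal sketch: the preset names and values and the helper `to_ssml` are hypothetical and not taken from any supplier's service, and a real engine may require a fuller `<speak>` header or support different attribute values.

```python
from xml.sax.saxutils import escape

# Hypothetical emotion presets; the values are illustrative only,
# not taken from any vendor's "emotional TTS" product.
EMOTION_PRESETS = {
    "calm":    {"rate": "90%",  "pitch": "-2st", "pause_ms": 400},
    "excited": {"rate": "115%", "pitch": "+3st", "pause_ms": 150},
}


def to_ssml(sentences, emotion):
    """Wrap sentences in an SSML <prosody> element, separated by <break> pauses."""
    preset = EMOTION_PRESETS[emotion]
    pause = f'<break time="{preset["pause_ms"]}ms"/>'
    # Escape &, < and > so the text cannot break the markup.
    body = pause.join(escape(s) for s in sentences)
    return (
        f'<speak><prosody rate="{preset["rate"]}" pitch="{preset["pitch"]}">'
        f"{body}</prosody></speak>"
    )


print(to_ssml(["Welcome back.", "Tonight's story begins at sea."], "calm"))
```

The same text passed with "excited" instead of "calm" would be marked up for faster speech, a higher pitch, and shorter pauses, which is the kind of contrast the "emotional TTS" services described above aim to provide.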
AI dubbing technology has a wide range of application scenarios and helps drive the development of many industries. In films, TV dramas, animation, and other screen works, it is used to generate characters' voices. By simulating different timbres and languages, it makes a character's voice more natural and realistic and improves the viewing experience. The technology is also widely used for audiobooks and e-books, giving users more natural and fluent read-aloud services. It has also been adopted in the gaming industry, where it generates natural, character-specific voices that enhance the player's experience.
The advantages of generative AI products are obvious
The creation cost is relatively low.
As artificial intelligence has developed, it has begun to be applied to intelligent sound design, automatically generating sound effects that match the content and emotional requirements of a film or television work. This greatly reduces production costs. On the one hand, it saves time and labor: compared with traditional production, AI can quickly generate the required sound and automate the work, reducing the manpower and time the creative process requires. On the other hand, traditional sound production is easily affected by factors such as the creator's condition, environmental noise, and equipment operation, which lengthens the creation cycle. Applying generative AI to sound production can minimize the impact of these external factors and optimize costs.
Accurate and efficient material processing.
By learning from large amounts of data, AI can recognize, classify, and organize audio material, automatically identifying elements such as vocals, music, and sound effects and sorting them quickly. Compared with manual methods, this greatly improves the quality of material processing and reduces the error rate. AI can also edit and splice audio material intelligently. In addition, audio processing often requires special treatment such as noise reduction or adding or removing reverberation; manual processing may introduce individual bias, while AI can minimize the probability of errors.
Generating diverse and innovative content.
As society develops, demand for sound products has become more diverse and personalized. Generative AI can help creators break through habitual ways of thinking and offer innovative ideas in areas such as advertising voice-over, music production, and voice creation for virtual characters. AI can also analyze users' preferences, habits, and interaction data to understand their specific needs for a sound product's timbre, tone, and intonation, giving practical support to personalized creation.
Risks and challenges in the development of generative AI products
There is controversy over rights and interests.
Sound products involve the risk of data infringement, the rights of copyright holders, and personal privacy and personality rights. Most AIGC models are trained on large samples, which makes it difficult to trace the material used for training, and that material may not have been authorized by its authors. Moreover, users of generative AI can create derivative works from generated products, and the copyright ownership of those new products is also difficult to establish.
If AI uses the voice of an ordinary person that has never been made public to generate an audio product, that voice is private personal information; releasing the product publicly risks exposing it and infringing the person's privacy and personality rights. At the current stage, AIGC has moved from combinatorial content creation to exploratory and even transformative content creation. How rights and benefits should be divided, and who is liable for potential infringement in the creative process, is still under debate.
There is controversy over artistic norms.
Voice AI products touch on artistic disciplines such as music, broadcasting, and hosting, each of which has its own professional norms and artistic techniques. Whether voice AI products meet those norms is disputed. Take broadcasting and hosting: when the same sentence is read aloud in different contexts, its tone, emotion, pauses, and stress differ, and shifting the position of a pause or stress can change the meaning. At present, however, generative AI dubbing cannot recognize the specific context from the text, so the same voice reads much the same way in different contexts. In emotional handling, AI is more rigid than a real person and shows little emotional variation. In practical applications, therefore, the artistic standards of sound-based AIGC products deserve careful consideration.
What generative AI products mean for creators
Creators should continuously improve their own abilities.
Generative AI can produce results with one click, eliminating the need to do some simple, repetitive tasks by hand, which puts some creators at risk of unemployment.
Of course, the inherent limitations of voice-based AI products also mean that the position of excellent creators remains secure. For example, AI virtual anchors can imitate standard Mandarin and specific tones through "cloning", but fundamentally they imitate only the external form of a voice. Excellent anchors adapt their delivery to the situation, expressing different emotional states through changes in tone, intonation, and pauses. Creators therefore need to keep improving their technical skills, strengthen their professional abilities, learn broadly across fields, enrich their experience, build on their strengths, and face the impact of AI products calmly.
Creators should make reasonable use of AI technology.
At present, AIGC has clear advantages in giving creators inspiration, reducing creative costs, improving output efficiency, and raising the quality of works. Creators should therefore actively understand and learn about AIGC rather than simply embracing or resisting it. They should match AI technology to their own needs, take the initiative in applying it, and make it a capable assistant in creating works, so that creators and AI develop together. At the same time, the standards for using AIGC remain contested. Creators need to raise their legal awareness and keep up with the laws and regulations relevant to AIGC, so that AI can assist their work in a reasonable and compliant way, improve the quality of their works, and produce better sound products.
With the development of AIGC technology, its application in sound products has secured a place of its own. It can not only simulate human-like voices and generate sound products directly, but also assist creators, providing them with new tools and distinctive means of expression. At the same time, generative AI still faces many challenges in sound-based artistic creation. We need to seize the opportunities while confronting existing problems and work to resolve them in practice, so that generative AI can be applied better and more widely. (New Society)
Editor: Xiong Dafei  Responsible editor: Li Xiang
Source: CCJK