Blog

Pop music made by AI

AI makes pop music in the style of any composer (the Beatles too!)

[iframe id=”https://www.youtube.com/embed/LSHZ_b05W7o”]

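1) A database called LSDB is built. It holds about 13,000 lead sheets covering a range of styles and composers.
2) Using a system called FlowComposer, a human composer selects a style and generates a lead sheet (melody + harmony). For Daddy’s Car the Beatles style was chosen, and for Mr. Shadow the “American songwriters” style.
3) With another system called Rechord, the human composer matches audio chunks taken from recordings of other songs to the generated lead sheet.
4) A human musician finishes the production and mixing.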
[expand title=Eng]
1) We set up a database called LSDB. It contains about 13,000 leadsheets from a lot of different styles and composers (mainly jazz and pop but also a lot of Brazilian, Broadway and other music styles).

2) The human composer (in this case Benoît Carré, but we are experimenting with other musicians as well) selected a style and generated a leadsheet (melody + harmony) with a system called FlowComposer. For Daddy’s Car, Carré selected as style “the Beatles” and for Mr. Shadow he selected a style that we call “American songwriters” (which contains songs by composers like Cole Porter, Gershwin, Duke Ellington, etc).

3) With yet another system called Rechord the human musician matched some audio chunks from audio recordings of other songs to the generated leadsheets.

4) Then the human musician finished the production and mixing.[/expand]

Creation is ultimately a recombination of what already exists, and lately I have come to think that the raw data already out there may be what matters most.

October 29, 2025
Google's neural machine translation system

Google Translate now converts Chinese into English with neural machine translation
By Jordan Novet, venturebeat.com

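Google is applying neural-network-based machine learning to Chinese translation. Translation quality is said to be better than with phrase-based translation. Google had previously said that it used neural networks in Google Translate, but that was for real-time image translation.

Neural networks do not always produce good results, but the article says errors fell by about 60% compared with phrase-based translation.

The research behind this translation work can be read here.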

October 29, 2025
Disney to join the bidding for Twitter

Disney Is Working With an Adviser on Potential Twitter Bid

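Disney is also considering a bid for Twitter. Disney brings to mind Mickey Mouse and Disneyland, but it is a media conglomerate that owns ESPN and ABC, and it recently acquired a 33.3% stake in the streaming media company BAMTech.

As cable TV, the largest part of Disney's business, loses viewers and faces competition from online video services, Iger (the CEO) has invested in technology-related media businesses such as the video streaming service Hulu, the digital media company Vice, and MLB's BAMTech. Twitter has also partnered with BAMTech on live streaming.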
[expand title=Eng]
With Disney’s largest business, cable TV, losing viewers and facing more competition from online video services, Iger has invested in technology-related media businesses, including the Hulu video streaming service, digital media company Vice and Major League Baseball’s BAMTech, which provides the platform for online video services such as HBO Now. Twitter has also partnered with BAMTech for its live streaming.[/expand]

Inside Twitter, some reportedly say that Disney is the best fit.

October 29, 2025
Netflix to raise its share of originals to 50%

Netflix planning to fill half its catalog with originals in the next few years
By Rich McCormick, www.theverge.com

The implication is that Netflix will reach 50/50 by producing or buying original content, but it could also hit the ratio by licensing fewer shows and movies. Netflix CEO Ted Sarandos said at the start of the year that the company would release 600 hours of original content in 2016, but that is not enough to fill 50% of the catalog unless Netflix reduces the share of content it gets from other networks and studios. Unlike licensed shows, Netflix originals will be kept exclusive in order to set the service apart from competitors and get people to sign up.
[expand title=Eng]
The implication is that Netflix would reach its 50/50 split by producing and buying up more originals, but the company may also even the odds by licensing fewer shows and movies. Ted Sarandos, Netflix’s CEO, said at the start of the year that Netflix would release 600 hours of originals in 2016, up from the 450 hours it put out last year, but not yet enough to fill up 50 percent of its catalog unless it scaled back on content picked up from other networks and studios. Unlike licensed shows, Netflix’s originals can be kept exclusive to the service, differentiating it from its competitors and driving people to sign up for subscriptions. Already, Wells says between a third and half of lapsed subscribers to the service return eventually.[/expand]

Exclusives matter to a platform, but what does it mean to push them to 50% of the entire catalog?

October 29, 2025
Measuring video viewing on Facebook

Facebook Overestimated Key Video Metric for Two Years
By Suzanne Vranica and Jack Marshall, www.wsj.com

A few weeks ago, Facebook said in a post on its “Advertiser Help Center” that its metric for the average time users spend watching videos had been artificially inflated, because it counted only videos watched for more than three seconds. Facebook said it would introduce a new metric to fix the problem.
[expand title=Eng]
Several weeks ago, Facebook disclosed in a post on its “Advertiser Help Center” that its metric for the average time users spent watching videos was artificially inflated because it was only factoring in video views of more than three seconds. The company said it was introducing a new metric to fix the problem.[/expand]
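
To see why counting only views longer than three seconds inflates an average, here is a minimal Python sketch. The watch durations are made up for illustration, and the two functions are hypothetical stand-ins rather than Facebook's actual formula, which the article does not spell out.

```python
from statistics import mean

# Hypothetical watch durations in seconds, one entry per play of a video.
watch_seconds = [1, 1, 2, 10, 30]

# Assumption: a play only counts as a "view" if it lasts more than 3 seconds.
VIEW_THRESHOLD = 3


def average_over_all_plays(durations):
    """Average watch time across every play, including very short ones."""
    return mean(durations)


def average_over_views_only(durations, threshold=VIEW_THRESHOLD):
    """Average watch time after dropping plays at or below the threshold."""
    views = [d for d in durations if d > threshold]
    return mean(views)


all_plays = average_over_all_plays(watch_seconds)
views_only = average_over_views_only(watch_seconds)
print(f"all plays: {all_plays:.1f}s, views only: {views_only:.1f}s")
```

With these invented numbers the short plays pull the all-plays average well below the views-only average, which is the direction of the overstatement the article describes.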

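Miscalculating a value labeled as an average is a problem, but if, as Facebook's post says, the viewer count does not feed into the calculation of other metrics, I do wonder whether it was really a big one.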

October 29, 2025
Changes in the LMS market in EdTech

The LMS market glacier is melting

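Investment in EdTech startups has stayed at a consistently high level, and LMS is among the areas drawing the most attention. Many new entrants are appearing, but the market remains an oligopoly in which long-established dominant vendors hold most of the market share. Blackboard is a typical example. The article argues, however, that this oligopoly in LMS is breaking down, because adoption decisions are shifting from the organization toward the individual. A service like Canvas is said to take about 80% of users newly entering the market.

Perhaps the most important fact is that most schools are no longer looking for a single system to run the virtual classroom. We can see institutions designing new architectures in which the LMS is only one component. This move enables broader pedagogical approaches such as competency-based education and personalized learning.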
[expand title=Eng]
Perhaps most significantly, most schools are no longer looking for just one system to manage the virtual classroom. We are now seeing entire institutions, such as Southern New Hampshire University and University of Maryland University College, designing new architectures where the LMS is but one core component. This move is enabling broader adoption of pedagogical approaches, such as competency-based education and personalized learning.[/expand]

October 29, 2025
Predicting the next motion from a still photo

Machine learning’s next trick is generating videos from photos

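The article reports that MIT researchers have built a neural-network model that predicts what motion will follow a still photo. It appears to have been trained on images taken from Flickr. The output is still small and only a few seconds long, and some of the results look downright trippy, but the attempt itself is interesting. Sample images are posted in the article.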

October 29, 2025
AI can recognize pixelated faces

AI Can Recognize Your Face Even If You’re Pixelated
By Lily Hay Newman, www.wired.com

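The point is that a trained machine can identify pixelated (mosaicked) images.

Pixelation has been a familiar tool for hiding the private parts of visual media. Blurred text and obscured faces and license plates have appeared in the news and online. The technique is nothing fancy, but it has worked well because people cannot see or read through the distortion. The problem is that humans are no longer the only masters of image recognition. Computer vision is growing more robust and is starting to see what we cannot.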
[expand title=Eng]
Pixelation has long been a familiar fig leaf to cover our visual media’s most private parts. Blurred chunks of text or obscured faces and license plates show up on the news, in redacted documents, and online. The technique is nothing fancy, but it has worked well enough, because people can’t see or read through the distortion. The problem, however, is that humans aren’t the only image recognition masters around anymore. As computer vision becomes increasingly robust, it’s starting to see things we can’t.[/expand]

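I think I read a similar article not long ago. As I recall, it was about Siri's speech learning, and it said that machines can clearly distinguish regional dialects that people have trouble telling apart. This article also mentions that the approach defeated the blurring tools YouTube provides.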

October 29, 2025
More than 2 million watched the NFL on Twitter

More than two million people watched Twitter’s NFL stream on Thursday night
By Kurt Wagner, recode.net

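The report card for Twitter's NFL live streaming is in. More than 2 million people watched the game through Twitter's stream, while 48.1 million watched on television via CBS and the NFL Network.

If you want to compare Twitter with traditional television, there is a more fitting figure. An average of 243,000 people watched the game on Twitter, while CBS and the NFL Network, which simulcast the game, reached an average of 15.4 million.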
[expand title=Eng]
Here are more relevant numbers, if you really want to compare Twitter’s reach vs. traditional TV: An average of 243,000 people were watching the game on Twitter at any given time, while CBS and the NFL network, which simulcast the game, reached an average of 15.4 million.[/expand]
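
The gap between "more than 2 million" and "an average of 243,000" comes from measuring two different things: total unique viewers versus the average audience at any given moment. The Python sketch below illustrates the difference with invented viewing sessions. The session data, the 180-minute game length, and both function names are assumptions for illustration, not how Twitter or the TV networks actually compute their figures.

```python
# Hypothetical viewing sessions: (viewer_id, start_minute, end_minute).
sessions = [
    ("a", 0, 180),    # watched the whole game
    ("b", 0, 10),     # dropped in briefly at kickoff
    ("c", 60, 70),
    ("d", 170, 180),
    ("b", 100, 110),  # viewer "b" came back later
]

# Assumption: the broadcast lasted 180 minutes.
GAME_MINUTES = 180


def reach(sessions):
    """Number of distinct viewers who watched at any point."""
    return len({viewer for viewer, _, _ in sessions})


def average_audience(sessions, total_minutes):
    """Average number of people watching at any given minute."""
    viewer_minutes = sum(end - start for _, start, end in sessions)
    return viewer_minutes / total_minutes


print("reach:", reach(sessions))
print("average audience:", round(average_audience(sessions, GAME_MINUTES), 2))
```

In this toy data four people tuned in, but because most stayed only a few minutes, the average audience is barely more than one. Short visits raise reach far more than they raise the average.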

Of course, nobody expected it to have as much influence as TV. What matters for Twitter is how many new users came in to watch the NFL, and I am not sure how well that will go.

October 29, 2025
Generation AI

Growing up in Generation AI
By Remi El-ouazzane, techcrunch.com

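There is nothing special in this article, but the opening is worth thinking about, so I am saving it. Marc Prensky coined the term "digital native" back in 2001. People talk about Millennials and Generation Z as cohort generations. "Generation AI" is not really a demographically distinct concept, but it is worth thinking about.

Imagine a five-year-old watching Mom talk to Siri and Dad talk to Alexa. What must she think of such interactions, day after day? Children today witness computers that seem to have minds of their own. It can be taken for granted that their perception of machines, and therefore of the world itself, differs from ours.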
[expand title=Eng]
Imagine a five-year-old watching Mum talking to Siri, and Dad talking to Alexa, on a daily basis — what must she think of such interactions? Children nowadays witness computers that seem like they have a mind of their own — and even a personality with which to engage. It can be taken for granted that their perception of machines, and thus of the world itself, differs a lot from our own.[/expand]

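"Generation I" is not a commonly used term, but it seems to refer to a generation like the following.

Just as Generation I is described as having innately accepted iPads and smartphones, Generation AI will take for granted machines with advanced AI: machines with minds and a perceived ability to think for themselves, and even machines equipped with empathy and charisma.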
[expand title=Eng]
Just as Generation I demonstrated innate acceptance of iPads and smartphones, Generation AI will take for granted machines with advanced AI: machines with minds and a (perceived) ability to think on their own, and even machines equipped with artificial empathy and charisma.[/expand]

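It amazes me to watch my sibling's 27-month-old casually find their own videos on a phone and replay them. How will a generation that grew up with blue screens and the Millennium bug differ from one living in an age where AI is simply a natural part of life?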

October 29, 2025