Image Style Transfer (1): Abstract & Introduction | Surveying Style Transfer Research #1

As StyleGAN, which NVIDIA released in 2019 to much attention, shows, bringing ideas from Style Transfer research into generative models is drawing interest. With that in mind, this series surveys Style Transfer research and works toward reading StyleGAN, StyleGAN2, and related work.
#1 and #2 cover one of the early studies in the area, Image Style Transfer (Image Style Transfer Using Convolutional Neural Networks).

https://zpascal.net/cvpr2016/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf

#1 goes through the Abstract and the Introduction.
Contents:
1. Abstract
2. Introduction(Section1)
3. Summary

1. Abstract
Section 1 goes through the Abstract to get an overview of the paper, restating each sentence in plain terms and adding a brief comment.

Rendering the semantic content of an image in different styles is a difficult image processing task. Arguably, a major limiting factor for previous approaches has been the lack of image representations that explicitly represent semantic information and, thus, allow to separate image content from style.

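In plain terms: redrawing what an image depicts in a different style is a hard image processing task. Arguably (the authors hedge here rather than assert it), earlier approaches were held back mainly by the lack of image representations that make semantic information explicit, which is what allows image content to be separated from style.
Separating style from content is the central theme of the paper. As for the missing image representations, the paper takes a now-familiar route: it brings in a CNN to provide them and thereby tries to resolve the earlier limitation.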

Here we use image representations derived from Convolutional Neural Networks optimised for object recognition, which make high level image information explicit.

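In plain terms: the authors use image representations taken from Convolutional Neural Networks that were optimised for object recognition, and these representations make high-level image information explicit.
As the sentence says, a CNN serves as the feature extractor for images. The line of reasoning resembles Deep Q-Network, which brought a CNN into the Q-learning framework of reinforcement learning.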
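
To make "image representation derived from a CNN" concrete, here is a minimal sketch of a single convolution + ReLU layer that turns an image into a stack of feature maps, then flattens them into a matrix with one row per filter. Everything in it is an assumption for illustration: the filters are random rather than trained for object recognition, and the names `conv2d_relu` and `as_feature_matrix` are hypothetical. It is not the network used in the paper.

```python
import numpy as np

def conv2d_relu(image, filters):
    """Valid 2-D cross-correlation of a grayscale image with a filter bank, then ReLU.

    image:   (H, W) array
    filters: (N, k, k) array; random here, NOT weights trained for object recognition
    returns: (N, H-k+1, W-k+1) feature maps
    """
    n, k, _ = filters.shape
    h, w = image.shape
    out = np.empty((n, h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = image[i:i + k, j:j + k]
            # Every filter looks at the same patch: contract over the two patch axes.
            out[:, i, j] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def as_feature_matrix(feature_maps):
    """Flatten (N, H', W') feature maps into an (N, M) matrix with M = H' * W'."""
    return feature_maps.reshape(feature_maps.shape[0], -1)

rng = np.random.default_rng(0)
image = rng.random((8, 8))                # stand-in for a real photograph
filters = rng.standard_normal((4, 3, 3))  # hypothetical filter bank
F = as_feature_matrix(conv2d_relu(image, filters))
print(F.shape)  # (4, 36)
```

In a trained network the filters in deeper layers respond to increasingly abstract patterns, which is the sense in which the representation makes high-level information explicit.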

We introduce A Neural Algorithm of Artistic Style that can separate and recombine the image content and style of natural images. The algorithm allows us to produce new images of high perceptual quality that combine the content of an arbitrary photograph with the appearance of numerous well-known artworks.

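In plain terms: the paper introduces A Neural Algorithm of Artistic Style, which can separate and recombine the content and style of natural images. With it, the authors produce new images of high perceptual quality that combine the content of an arbitrary photograph with the appearance of numerous well-known artworks.
The distinguishing feature is the separation of content and style, which lets the content of a photograph be rendered in the style of a famous artwork (painting).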
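
As a preview of how content and style can be measured separately, the sketch below follows the definitions I understand Section 2 of the paper to use: content is compared directly on feature maps, while style is compared on Gram matrices (correlations between feature channels). The feature matrix here is random stand-in data, not the output of a trained CNN, and the function names are hypothetical.

```python
import numpy as np

def gram_matrix(F):
    """Feature correlations G = F F^T for an (N, M) feature matrix."""
    return F @ F.T

def content_loss(F, P):
    """Half the squared error between two (N, M) feature matrices."""
    return 0.5 * np.sum((F - P) ** 2)

def style_loss(F, A_feat):
    """Squared error between Gram matrices, scaled by 1 / (4 N^2 M^2)."""
    n, m = F.shape
    diff = gram_matrix(F) - gram_matrix(A_feat)
    return np.sum(diff ** 2) / (4.0 * n**2 * m**2)

rng = np.random.default_rng(0)
P = rng.random((4, 36))             # stand-in for the features of a photograph
perm = rng.permutation(P.shape[1])
F = P[:, perm]                      # same feature values, spatial positions shuffled

print(content_loss(F, P) > 0)           # True: the spatial arrangement changed
print(np.isclose(style_loss(F, P), 0))  # True: the channel correlations did not
```

Shuffling spatial positions changes the content term but leaves the Gram matrix untouched, which is one way to see why the two quantities can be optimised against different images and then recombined.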

Our results provide new insights into the deep image representations learned by Convolutional Neural Networks and demonstrate their potential for high level image synthesis and manipulation.

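In plain terms: the results give new insights into the deep image representations learned by Convolutional Neural Networks and show their potential for high-level image synthesis and manipulation.
That is, the experiments in the paper indicate that these representations could support image synthesis and manipulation at a high, semantic level.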

2. Introduction(Section1)
Section 2 goes through the Introduction (Section 1 of the paper), one paragraph at a time.

The first paragraph frames Style Transfer as a "problem of texture transfer". In texture transfer, a texture is synthesized from a source image while the semantic content of the input (target) image is preserved. The paragraph also notes that most earlier texture transfer algorithms relied on non-parametric methods, and it lists several examples.
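
To give a feel for what a non-parametric method guided only by low-level features looks like, here is a deliberately crude toy: it rebuilds a target image out of patches resampled from a texture, choosing each patch by mean brightness alone. It is an assumption-laden illustration, not any of the algorithms cited in the paper, and `toy_texture_transfer` is a hypothetical name.

```python
import numpy as np

def toy_texture_transfer(target, texture, patch=4):
    """Rebuild `target` from patches of `texture`, matched by mean brightness only.

    target, texture: 2-D grayscale arrays whose sides are multiples of `patch`.
    """
    # Collect every non-overlapping texture patch and its mean brightness.
    th, tw = texture.shape
    patches = [texture[i:i + patch, j:j + patch]
               for i in range(0, th, patch)
               for j in range(0, tw, patch)]
    means = np.array([p.mean() for p in patches])

    out = np.empty_like(target, dtype=float)
    h, w = target.shape
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            block_mean = target[i:i + patch, j:j + patch].mean()
            # Nearest neighbour on a low-level feature (brightness); nothing is learned.
            best = int(np.argmin(np.abs(means - block_mean)))
            out[i:i + patch, j:j + patch] = patches[best]
    return out

rng = np.random.default_rng(0)
texture = rng.random((16, 16))   # stand-in for a source texture
target = np.zeros((8, 8))
target[:, 4:] = 1.0              # dark left half, bright right half
result = toy_texture_transfer(target, texture)
print(result.shape)  # (8, 8)
```

The output keeps the coarse dark/bright layout of the target while every pixel comes from the texture. Because the matching sees only brightness, it has no notion of what objects the target contains.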

The second paragraph says that earlier work based on non-parametric methods achieved remarkable results but shares a fundamental limitation: it can use only low-level features of the target image. Ideally, by contrast, a style transfer algorithm should be able to extract the semantic image content from the target image. A fundamental prerequisite is therefore a high-level image representation.

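The third paragraph notes that separating content from style in natural images such as photographs remains an extremely difficult problem, but that deep learning (CNNs) has made it possible to build powerful computer vision systems. It expresses the expectation that deep learning can help with this problem.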

The fourth paragraph describes the approach taken in the paper. Section 2, "Deep image representations", covers it in detail, so it is omitted here.

3. Summary
#1 went through the Abstract and Introduction of Image Style Transfer (Image Style Transfer Using Convolutional Neural Networks), one of the early studies on Style Transfer, to get an overview of the paper.
#2 will cover the key points from Section 2, "Deep image representations", onward.