How to use deep learning to add local (e.g. repairing) transformations to images?
I want to train a neural network that removes scratches from pictures. I chose a GAN architecture with a generator (G) and a discriminator (D) and two sets of images, scratchy and non-scratchy, with similar motifs. G in my setting uses mainly convolutional and deconvolutional layers with ReLU activations and takes the scratchy images as input. D discriminates between the output of G and the non-scratchy images.
For example, the generator would perform the following transformations:

(128, 128, 3) -> (128, 128, 128) -> (128, 128, 3)

where the tuples are (width, height, channels). Input and output need to have the same format.
But in order to get outputs that have the same global structure as the inputs (i.e. streets, houses, etc.), it seems I have to use filter sizes and strides of 1 and basically pass the complete pictures through the network.

However, the features I am looking at are rather small and local. It should be sufficient to use filters of up to 20 pixels in size for the convolutions and then apply the network to all parts of the input picture; the sketch below gives a rough feel for the depth that requires.
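As a rough sanity check on the required depth, here is a small receptive-field calculation (the layer specs are illustrative examples, not the layers from the code further down):

    def receptive_field(layers):
        """layers: list of (filter_size, stride) pairs, from input to output."""
        rf, jump = 1, 1
        for f, s in layers:
            rf += (f - 1) * jump  # each layer widens the field by (f - 1) * jump
            jump *= s             # stride scales the step between output pixels
        return rf

    print(receptive_field([(3, 1)] * 3))   # 7  -- three 3x3 stride-1 convs see a 7x7 window
    print(receptive_field([(3, 1)] * 10))  # 21 -- ~ten such layers cover a 20-pixel defect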
What would be a good generator architecture for such a task? Would you agree with the structure of the generator or would you expect different designs to perform better on local changes?
Here is the code for the generator that I use, implemented in TensorFlow.
def generator(x, batch_size, reuse=False):
    with tf.variable_scope('generator') as scope:
        if reuse:
            tf.get_variable_scope().reuse_variables()

        s = 1  # stride; s > 1 shrinks the spatial dimensions per layer
        f = 1  # filter size
        assert ((WIDTH + f - 1) / s) % 1 == 0  # spatial size must divide evenly
        keep_prob = 0.5

        # Encoder: three conv -> batch-norm -> ReLU -> dropout stages
        n_ch1 = 32
        w = init_weights('g_wc1', [f, f, CHANNELS, n_ch1])
        b = init_bias('g_bc1', [n_ch1])
        h = conv2d(x, w, s, b)
        h = bn(h, 'g_bn1')
        h = tf.nn.relu(h)
        h = tf.nn.dropout(h, keep_prob)
        h1 = h  # kept for the skip connection

        n_ch2 = 128
        w = init_weights('g_wc2', [f, f, n_ch1, n_ch2])
        b = init_bias('g_bc2', [n_ch2])
        h = conv2d(h, w, s, b)
        h = bn(h, 'g_bn2')
        h = tf.nn.relu(h)
        h = tf.nn.dropout(h, keep_prob)
        h2 = h  # kept for the skip connection

        n_ch3 = 256
        w = init_weights('g_wc3', [f, f, n_ch2, n_ch3])
        b = init_bias('g_bc3', [n_ch3])
        h = conv2d(h, w, s, b)
        h = bn(h, 'g_bn3')
        h = tf.nn.relu(h)
        h = tf.nn.dropout(h, keep_prob)

        # Decoder: mirror the encoder with deconvolutions and skip connections
        output_shape = [batch_size, HEIGHT // s // s, WIDTH // s // s, n_ch2]
        w = init_weights('g_wdc3', [f, f, n_ch2, n_ch3])
        b = init_bias('g_bdc3', [n_ch2])
        h = deconv2d(h, w, s, b, output_shape)
        h = bn(h, 'g_bnd3')
        h = tf.nn.relu(h)
        h = h + h2  # skip connection from the second encoder stage

        output_shape = [batch_size, HEIGHT // s, WIDTH // s, n_ch1]
        w = init_weights('g_wdc2', [f, f, n_ch1, n_ch2])
        b = init_bias('g_bdc2', [n_ch1])
        h = deconv2d(h, w, s, b, output_shape)
        h = bn(h, 'g_bnd2')
        h = tf.nn.relu(h)
        h = h + h1  # skip connection from the first encoder stage

        output_shape = [batch_size, HEIGHT, WIDTH, CHANNELS]
        w = init_weights('g_wdc1', [f, f, CHANNELS, n_ch1])
        b = init_bias('g_bdc1', [CHANNELS])
        h = deconv2d(h, w, s, b, output_shape)

        return tf.nn.sigmoid(h + x)  # global residual: output = input + correction
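The snippet assumes a few helpers (init_weights, init_bias, conv2d, bn, deconv2d) and the globals WIDTH, HEIGHT, CHANNELS that are not shown. A minimal TF1-style sketch of what they might look like; the initializers and batch-norm settings are my assumptions, not the originals:

    import tensorflow as tf

    HEIGHT, WIDTH, CHANNELS = 128, 128, 3

    def init_weights(name, shape):
        return tf.get_variable(name, shape,
                               initializer=tf.truncated_normal_initializer(stddev=0.02))

    def init_bias(name, shape):
        return tf.get_variable(name, shape, initializer=tf.zeros_initializer())

    def conv2d(x, w, s, b):
        # 'SAME' padding keeps the spatial size at HEIGHT/s x WIDTH/s
        return tf.nn.conv2d(x, w, strides=[1, s, s, 1], padding='SAME') + b

    def deconv2d(x, w, s, b, output_shape):
        # transpose-conv filter shape is [f, f, out_channels, in_channels]
        return tf.nn.conv2d_transpose(x, w, output_shape,
                                      strides=[1, s, s, 1], padding='SAME') + b

    def bn(x, name):
        return tf.layers.batch_normalization(x, name=name)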
When you use a stride s > 1 you get an hourglass architecture, where the layers become spatially smaller but deeper. The depth is controlled independently by the n_ch variables.
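For instance, with s = 2 and a 128x128 RGB input, the shapes would trace out (assuming 'SAME' padding):

(128, 128, 3) -> (64, 64, 32) -> (32, 32, 128) -> (16, 16, 256) -> (32, 32, 128) -> (64, 64, 32) -> (128, 128, 3)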
By the way, I am running this on a Google Colab notebook. It's amazing to have such an engine for free and be able to experiment with deep learning. Fantastic!
machine-learning neural-network deep-learning gan
asked Feb 13 '18 at 18:17 by Sören, edited Jul 13 '18 at 21:21
This problem is called descreening or, more generally, image restoration/inpainting. Scratch removal is a thing too, but yours don't look like scratches. – Emre, Apr 16 '18 at 22:00
2 Answers
Have you looked into ResNet, which modifies the image on a pixel level rather than holistically modifying the image content, thus preserving the global structure and annotations?

Also, your application sounds a lot like deep image prior.
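A minimal sketch of the residual idea, reusing the helper functions assumed in the question (the names and sizes here are illustrative, not from a specific ResNet implementation): each block only has to learn a small additive correction, and the identity path passes the rest of the image through untouched.

    def residual_block(x, name, n_ch=64, f=3):
        # two stride-1 convolutions learn a local correction
        w1 = init_weights(name + '_w1', [f, f, n_ch, n_ch])
        b1 = init_bias(name + '_b1', [n_ch])
        h = tf.nn.relu(conv2d(x, w1, 1, b1))
        w2 = init_weights(name + '_w2', [f, f, n_ch, n_ch])
        b2 = init_bias(name + '_b2', [n_ch])
        h = conv2d(h, w2, 1, b2)
        # identity skip: the block learns only the residual (the defect),
        # so undamaged regions pass through essentially unchanged
        return tf.nn.relu(h + x)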
answered Feb 14 '18 at 4:35 by Ricky Han
Yes, this comes close to what I have in mind. I tried that hourglass architecture, but I always get artifacts in the output from (I guess) the upsampling. – Sören, Feb 15 '18 at 11:24
There is a paper called Spatial Transformer Networks by Max Jaderberg et al. What it does is try to find the canonical shape of its input by reducing transformations like translation and rotation, or even diminishing the distortion of the inputs. It introduces a module which helps a convolutional network to be spatially invariant. One of the significant achievements of this module is that it tries to enhance distorted inputs. Take a look here.
I am editing the answer at the request of one of our friends. First I quote a passage from the popular book by Prof. Gonzalez (I hope nothing goes wrong with copyright); then I give my recommendation.
Figure 2.40, the one that I've attached, shows an example of the steps in Fig. 2.39. In this case, the transform used was the Fourier transform, which we mention briefly later in this section and discuss in detail in Chapter 4. Figure 2.40(a) is an image corrupted by sinusoidal interference and Fig. 2.40(b) is the magnitude of its Fourier transform, which is the output of the first stage in Fig. 2.39. As you will learn in Chapter 4, sinusoidal interference in the spatial domain appears as bright bursts of intensity in the transform domain. In this case, the bursts are in a circular pattern that can be seen in Fig. 2.40(b). Figure 2.40(c) shows a mask image (called a filter) with white and black representing 1 and 0, respectively. For this example, the operation in the second box of Fig. 2.39 is to multiply the mask by the transform, thus eliminating the bursts responsible for the interference. Figure 2.40(d) shows the final result, obtained by computing the inverse of the modified transform. The interference is no longer visible, and important detail is quite clear. In fact, you can even see the fiducial marks (faint crosses) that are used for image alignment.
Okay! After that quote, here is my suggestion. In the ST network the authors claim that their differentiable module can learn kinds of transformations beyond affine ones. The point is that we usually apply two kinds of transformations. One is applied to the intensities of the image, which is popular in the form of filters. The other is called image warping, where we change the positions of intensities; the intensity values themselves do not change unless they land between the discrete grid points of the image entries, also called pixels (picture elements). Spatial transformers are good for the second task, but they can also be used for the first. There are studies on using CNNs to evaluate images in the frequency domain rather than the spatial domain; you could employ these differentiable modules in those nets.
Finally, about your specific task: from what I'm seeing in your pictures, your noise has the same behaviour everywhere. I guess that if this is true for all cases, your task does not require learning and can be solved using conventional image processing techniques.
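As a concrete illustration of the frequency-domain idea in the quote, here is a small NumPy sketch; the circular ring mask and the cutoff radii are arbitrary stand-ins for the notch filter in the figure, and a grayscale input is assumed:

    import numpy as np

    def remove_periodic_noise(img, r_inner=30, r_outer=40):
        """Zero out a ring of frequencies (where periodic interference
        shows up as bright bursts), then transform back."""
        F = np.fft.fftshift(np.fft.fft2(img))        # centered spectrum
        h, w = img.shape
        yy, xx = np.ogrid[:h, :w]
        r = np.sqrt((yy - h / 2.0) ** 2 + (xx - w / 2.0) ** 2)
        mask = ~((r > r_inner) & (r < r_outer))      # keep everything but the ring
        return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

For an RGB image you would apply this per channel.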
answered Feb 14 '18 at 5:02 by Vaalizaadeh, edited Jun 12 '18 at 20:20
Hi, this seems to do something different though. – Sören, Apr 19 '18 at 0:45

@Sören did you look at the video in the noisy situation? – Vaalizaadeh, Apr 21 '18 at 14:58

I did, though in my case I don't have canonical structures like integer numbers that could be recognized, right? My task is not object recognition; I would rather put it in the style transfer category. I actually want to conserve the information in the picture, just removing the scratches. The images are arbitrary street scenes. – Sören, Apr 24 '18 at 17:10

@Sören You mean you have a noisy environment and you want to enhance the inputs? – Vaalizaadeh, Apr 26 '18 at 13:08

Yes, you can treat it as noise. I have a set of noisy images and a set of clean images. I want a neural network that learns to recognize the noise and remove it. But I don't have noisy and clean versions of the same image with the same content. Therefore it needs to be something like a style transfer network, I think: "draw the noisy picture in the style of the clean images". The architecture that I have (GAN and dual GAN) works somewhat, but not quite as I want it. – Sören, May 22 '18 at 16:12