How to pad real-valued sequences

I have several sequences of univariate, real-valued time-series data. The sequences have different lengths, so at the moment I cannot batch them and feed them to a network.
What is the correct procedure for padding these sequences? Is it even possible in this case, given that I can't reserve any number as a special padding symbol?



UPDATE 1



I'm working with arbitrary univariate time-series data (not tied to one specific domain, unbounded range). As an example of one such series, consider a standardized stock dataset (only the first 10 elements are shown):



d = array([-0.37807043, 0.14321786, -0.37807043, 0.13478392, 0.18733381,
           1.19576774, 0.25675156, 0.26064414, 0.30930144, 0.38650436])






tensorflow sequence-to-sequence






asked Mar 21 '18 at 5:30 by Aechlys (edited Mar 21 '18 at 6:15)

  • Welcome to the site! I think your question is open-ended. Can you give an example or a sample sequence so we can understand it better? Then we can make better suggestions. Thank you!
    – Toros91
    Mar 21 '18 at 5:39
  • Updated my question. However, the time series I'm working with are arbitrary.
    – Aechlys
    Mar 21 '18 at 6:15
  • I don't think combining such data will give you good insights. When you want to combine different time series, you need to check the trend of the data: if the trends are similar, it makes sense to combine them; otherwise it is a mistake.
    – Toros91
    Mar 21 '18 at 6:19
  • My aim is to implement a time-series autoencoder presented in a conference paper and later use these sequence embeddings to improve classification/regression performance.
    – Aechlys
    Mar 21 '18 at 6:28
  • Hmm, I understand. I'm working on something similar: since I don't have the future values, I forecast them and use the forecasts to classify the target outcome. But combining data without enough supporting evidence is the wrong approach, in my opinion.
    – Toros91
    Mar 21 '18 at 6:32

1 Answer


How you pad the sequences (and even whether you pad at all) depends on what you expect of the data. Padding imposes boundary conditions on the data, and these induce artifacts in any transform you apply afterwards. How severe the effect is depends on how well your data suits a particular padding method.



Padding methods include zero padding and periodic extension (a periodic boundary).
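
As a rough sketch of these two options (not part of the original answer), the snippet below pads a batch of variable-length sequences to a common length with NumPy. `pad_batch` is a hypothetical helper name. Because the values are unbounded and no number can serve as a sentinel, the helper also returns a boolean mask recording which positions hold real data; downstream code (for instance a masked loss) would consult the mask rather than the pad value.

```python
import numpy as np


def pad_batch(seqs, mode="constant"):
    """Pad 1-D sequences on the right to the length of the longest one.

    mode="constant" appends zeros; mode="wrap" repeats each sequence
    periodically. Returns (batch, mask), where mask is True on real data.
    `pad_batch` is an illustrative helper, not a library function.
    """
    max_len = max(len(s) for s in seqs)
    batch = np.zeros((len(seqs), max_len), dtype=float)
    mask = np.zeros((len(seqs), max_len), dtype=bool)
    for i, s in enumerate(seqs):
        s = np.asarray(s, dtype=float)
        batch[i] = np.pad(s, (0, max_len - len(s)), mode=mode)
        mask[i, :len(s)] = True
    return batch, mask


seqs = [[-0.378, 0.143, -0.378], [0.135, 0.187, 1.196, 0.257, 0.261]]
zero_padded, mask = pad_batch(seqs)          # zeros after the data
periodic, _ = pad_batch(seqs, mode="wrap")   # data repeated cyclically

# Masked mean: padded positions never enter the statistic.
masked_mean = (zero_padded * mask).sum(axis=1) / mask.sum(axis=1)
```

The mask is what lets zero padding coexist with real zeros in the data: a genuine 0.0 inside a sequence has mask value True, while a padded 0.0 has mask value False.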



Padding doesn't have to be done in the time domain. For example, interpolating in the frequency domain and transforming back allows you to extrapolate.
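
One possible reading of this remark (an assumption, since the answer does not spell out a method) is Fourier interpolation: zero-pad the spectrum and transform back, which resamples a short series onto a longer grid. Every sequence can then be brought to a common length without inserting filler values. `fourier_resample` is a hypothetical helper; the approach implicitly treats each series as periodic and rescales its time axis, which may or may not suit the data.

```python
import numpy as np


def fourier_resample(x, target_len):
    """Upsample a real 1-D series to target_len by zero-padding its spectrum.

    Illustrative helper (not a library function). Assumes the series can be
    treated as one period of a periodic signal.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if target_len < n:
        raise ValueError("this sketch only handles upsampling")
    spectrum = np.fft.rfft(x)
    if n % 2 == 0 and target_len > n:
        # For even n the last bin is the Nyquist term; halve it so that it is
        # shared correctly between positive and negative frequencies.
        spectrum[-1] *= 0.5
    # irfft zero-pads the spectrum up to the bins needed for target_len samples.
    y = np.fft.irfft(spectrum, n=target_len)
    # irfft normalises by target_len; rescale to keep the original amplitude.
    return y * (target_len / n)


d = np.array([-0.37807043, 0.14321786, -0.37807043, 0.13478392, 0.18733381,
              1.19576774, 0.25675156, 0.26064414, 0.30930144, 0.38650436])
d_long = fourier_resample(d, 20)  # 10 samples stretched onto a 20-point grid
```

When the target length is an integer multiple of the original length, the resampled series passes through the original samples (here `d_long[::2]` reproduces `d` up to floating-point error).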



If your analysis has a finite history (e.g. FIR filters), you can isolate the time regions where padding is unnecessary and draw comparisons from those.
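
To illustrate with an assumed example (a 3-tap moving average standing in for whatever finite-history analysis is actually used), NumPy's `np.convolve` with `mode="valid"` keeps only the outputs computed entirely from real samples, so no padding is involved. By contrast, `mode="same"` implicitly zero-pads the edges.

```python
import numpy as np

d = np.array([-0.37807043, 0.14321786, -0.37807043, 0.13478392, 0.18733381,
              1.19576774, 0.25675156, 0.26064414, 0.30930144, 0.38650436])

# 3-tap moving average: a simple FIR filter chosen purely for illustration.
taps = np.ones(3) / 3.0

# "same" returns one output per input sample by treating values beyond the
# edges as zero, so the first and last outputs are affected by padding.
same = np.convolve(d, taps, mode="same")

# "valid" returns only outputs where the filter fully overlaps the data.
valid = np.convolve(d, taps, mode="valid")

# The interior of "same" agrees with "valid"; only the two edge outputs differ.
interior_matches = np.allclose(same[1:-1], valid)
```

A sequence of length n filtered with k taps yields n - k + 1 valid outputs, so sequences of different lengths still produce outputs of different lengths; the valid region removes the padding artifacts rather than the length mismatch.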







answered Mar 21 '18 at 7:31 by Paul Childs