Finding abnormal behavior over time


I want to detect abnormal behaviour in an oil pipe in which oil flows at a roughly constant pressure. A sensor monitors the pressure over time and pushes the readings to my cloud server.

I have the dataset. My requirement is to analyse it in Python and find abnormal patterns over time, and then to report the abnormal behaviour present in the dataset. For example: from 1 pm to 3:30 pm today the pressure rose and fell, possibly due to a leak in the pipe.

Can this be done with a simple statistical model, or is machine learning required?

Could you please suggest the most suitable machine learning algorithm for this scenario?

Please also include links to relevant resources.

Thanks







machine-learning dataset time-series statistics probability
















asked Feb 13 '18 at 6:29
JavaUser







Visit (sections of) the pipeline. Talk to the people who maintain it. There is only so much that you can see from data. You never know if someone is digging near the pipe that might cause fluctuations in the data which are not caused by leakage.
– phiver Feb 13 '18 at 14:12










1 Answer
Your problem definition

You have time series data of pressure readings from your sensor, and you wish to identify when those readings are abnormal. This is an anomaly detection problem, and there are many ways to approach it.

I would use a sliding window over the recordings as the feature space and estimate the distribution of the recordings within it. The window length $m$ is the first hyper-parameter you will need to tune.



Using a statistical model

Treat the retained time series $X \in \mathbb{R}^m$ as a first-in, first-out queue: when a new recording arrives, discard the oldest data point. For the current set $X$ of samples, compute the mean $\mu$ and the standard deviation $\sigma$. If the new point deviates from the mean by more than a multiple of $\sigma$, flag a state change. The multiple is usually set to $3\sigma$, but it depends on the expected variation of your sensor.
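The windowed check described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function name `flag_state_changes`, the default window length `m=20`, the multiplier `k=3.0`, and the synthetic `pressure` trace are all hypothetical, not taken from the question's dataset.

```python
from collections import deque
from statistics import mean, stdev


def flag_state_changes(readings, m=20, k=3.0):
    """Return indices of readings that lie more than k standard deviations
    from the mean of the previous m readings (a FIFO window)."""
    window = deque(maxlen=m)  # appending to a full deque drops the oldest point
    flagged = []
    for i, x in enumerate(readings):
        if len(window) == m:  # only test once the window is full
            mu = mean(window)
            sigma = stdev(window)
            # A perfectly flat window has sigma == 0; skip it to avoid
            # flagging every tiny change.
            if sigma > 0 and abs(x - mu) > k * sigma:
                flagged.append(i)
        window.append(x)
    return flagged


# Hypothetical pressure trace: steady around 100 with a small ripple and one spike.
pressure = [100.0 + 0.1 * ((i % 5) - 2) for i in range(60)]
pressure[45] = 103.0
print(flag_state_changes(pressure))
```

Note that a flagged point still enters the window, so a large outlier inflates $\sigma$ for the next $m$ readings. Whether to exclude flagged points from the window is a design choice that depends on how long real anomalies last in your data.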



You can extend this with the generalized likelihood ratio test (GLRT) to determine when new samples fall significantly outside the distribution assumed under the null hypothesis. A rejection indicates a state change: the new samples came from a different distribution than the normal flow through the pipe.
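As one simple instance of a GLRT, assume the readings are Gaussian with a variance that does not change, and test only for a shift in the mean. With reference mean $\mu_0$ and variance $\sigma_0^2$ estimated from a window of normal operation, and $n$ recent samples with mean $\bar{x}$, twice the log likelihood ratio reduces to $n(\bar{x} - \mu_0)^2 / \sigma_0^2$, which is approximately chi-squared with one degree of freedom under the null hypothesis. The sketch below uses that statistic; the function name, the data, and the threshold of 10.83 (roughly the 0.1% level for one degree of freedom) are assumptions chosen for illustration.

```python
from statistics import mean, variance


def glrt_mean_shift(reference, recent):
    """Twice the log generalized likelihood ratio for a shift in the mean,
    assuming Gaussian readings whose variance is estimated from `reference`
    and treated as fixed."""
    mu0 = mean(reference)
    var0 = variance(reference)  # sample variance of normal operation
    n = len(recent)
    return n * (mean(recent) - mu0) ** 2 / var0


# Hypothetical data: a reference window of normal flow and two recent windows.
reference = [100.0 + 0.1 * ((i % 5) - 2) for i in range(40)]
recent_nominal = [100.0 + 0.1 * ((i % 5) - 2) for i in range(10)]
recent_shifted = [x - 0.3 for x in recent_nominal]  # small sustained drop

THRESHOLD = 10.83  # assumed: about the 0.1% level of chi-squared(1)

stat_nominal = glrt_mean_shift(reference, recent_nominal)
stat_shifted = glrt_mean_shift(reference, recent_shifted)
print(stat_nominal > THRESHOLD, stat_shifted > THRESHOLD)
```

Because the test pools several recent samples, it can pick up a small sustained shift that no single reading would reveal under a per-point $3\sigma$ rule.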



Using machine learning

You can collect multiple instances of your time series and annotate them, based on your experience, as nominal or anomalous. You then have a supervised two-class classification problem. First attempt some feature extraction or dimensionality reduction using PCA, LDA or similar techniques. Then you can try the algorithms available through scikit-learn (SVM, Random Forests, K-NN, etc.).
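A minimal sketch of that supervised route with scikit-learn might look as follows. The data here is synthetic and purely hypothetical (nominal windows are noise around a constant pressure, anomalous windows contain a gradual drop), and the window length, the number of PCA components, and the choice of an RBF-kernel SVM are assumptions rather than recommendations.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
m = 30  # window length (hypothetical)

# Synthetic stand-in for annotated windows, one window per row.
nominal = 100.0 + rng.normal(0.0, 0.2, size=(200, m))
anomalous = 100.0 + rng.normal(0.0, 0.2, size=(200, m)) - np.linspace(0.0, 3.0, m)

X = np.vstack([nominal, anomalous])
y = np.array([0] * 200 + [1] * 200)  # 0 = nominal, 1 = anomalous

# Hold out a quarter of the windows to estimate generalisation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale each time step, reduce to a few principal components, then classify.
clf = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The classes in this toy set are balanced by construction, which real pipeline data usually is not.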



If there are significantly more nominal instances than anomalous ones, the class imbalance will bias your model towards the nominal class. Anomaly detection algorithms are better suited to this type of problem. They learn the distribution to which your nominal set belongs, and then evaluate the probability that a novel instance is contained in the learned distribution. If the probability is small, the algorithm flags the instance as anomalous.
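The idea of learning the nominal distribution and scoring new instances against it can be sketched with a very simple density model. The example below fits an independent Gaussian to two hand-picked window features and flags windows whose log-density falls below the lowest value seen on the nominal training set. The features, the independence and Gaussian assumptions, the threshold rule, and the synthetic data are all illustrative choices; dedicated anomaly detection algorithms make more careful versions of each.

```python
import math
import random
from statistics import mean, stdev


def window_features(window):
    """Two hand-picked features per window: its mean and its peak-to-peak range."""
    return (mean(window), max(window) - min(window))


def fit_gaussian(feature_rows):
    """Fit an independent Gaussian (mean, std) to each feature column."""
    columns = list(zip(*feature_rows))
    return [(mean(col), stdev(col)) for col in columns]


def log_density(params, features):
    """Log probability density of one feature vector under the fitted model."""
    total = 0.0
    for (mu, sigma), x in zip(params, features):
        total += -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return total


random.seed(0)
m = 30  # window length (hypothetical)

# Learn the nominal distribution from windows of normal operation only.
nominal_windows = [[random.gauss(100.0, 0.2) for _ in range(m)] for _ in range(200)]
train_features = [window_features(w) for w in nominal_windows]
params = fit_gaussian(train_features)

# Assumed threshold: the lowest log-density among the nominal training windows.
threshold = min(log_density(params, f) for f in train_features)

# A hypothetical leak: pressure falls steadily across the window.
leak = [100.0 - 0.1 * i + random.gauss(0.0, 0.2) for i in range(m)]
leak_score = log_density(params, window_features(leak))
is_anomalous = leak_score < threshold
print(is_anomalous)
```

No anomalous examples are needed for training here, which is the main practical advantage when leaks are rare.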



For more information on anomaly detection for time series, refer to:

Using time series data from a sensor for ML

How to train model to predict events 30 minutes prior, from multi-dimensionnal timeseries

















answered Feb 13 '18 at 7:11
JahKnows


























