Find abnormal behavior over time


I want to detect abnormal behaviour in an oil pipe in which the oil flows at a constant pressure. A sensor monitors the pressure over time and pushes the readings to my cloud server.

I have the dataset. My requirement is to analyse it in Python, find the abnormal patterns over time, and report them. For example: from 1 pm to 3:30 pm today the pressure rose and fell, possibly due to a leak in the pipe.

Can this be done with a simple statistical model, or is machine learning required?

Can you please suggest the most suitable machine learning algorithm for this scenario?

Please also include links to relevant resources.

Thanks







machine-learning dataset time-series statistics probability






asked Feb 13 '18 at 6:29 by JavaUser

Visit (sections of) the pipeline. Talk to the people who maintain it. There is only so much that you can see from data. You never know if someone is digging near the pipe that might cause fluctuations in the data which are not caused by leakage.
– phiver, Feb 13 '18 at 14:12




1 Answer
Your problem definition

You have time series data of pressure measurements from your sensor, and you want to identify when the recordings are abnormal. This problem is best solved with anomaly detection algorithms, but there are many ways to approach it.

I would use a sliding window over the series and treat the readings in each window as the feature space from which you estimate the distribution of your recordings. The window length $m$ is the first hyper-parameter you will need to tune.
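As a rough illustration of the sliding window idea, here is a minimal sketch in plain Python. The function names, the example `pressure` values and the choice of summary features (mean, standard deviation, minimum, maximum) are my own assumptions, not something your dataset dictates.

```python
from statistics import mean, stdev


def sliding_windows(series, m):
    """Return every run of m consecutive readings (stride 1)."""
    return [series[i:i + m] for i in range(len(series) - m + 1)]


def window_features(window):
    """Summarise one window; the feature choice is an assumption."""
    return (mean(window), stdev(window), min(window), max(window))


# Hypothetical pressure readings, for illustration only.
pressure = [50.0, 50.2, 49.9, 50.1, 50.0, 47.5, 46.8, 50.1]
features = [window_features(w) for w in sliding_windows(pressure, m=4)]
```

Each row of `features` describes one window, so a series of length $n$ yields $n - m + 1$ rows. A larger $m$ smooths out noise but reacts more slowly to a sudden change.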



Using a statistical model

Treat the retained time series $X \in \mathbb{R}^m$ as a first-in, first-out queue: when a new recording arrives, discard the oldest data point. For the current set $X$ of samples, compute the mean $\mu$ and the standard deviation $\sigma$. If a new point deviates from the mean by more than a multiple of $\sigma$, flag a state change. The multiple is usually set to $3\sigma$, but it depends on the expected variation of your sensor.
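The queue-and-threshold rule can be sketched with the standard library alone. The function name `rolling_sigma_flags`, the defaults `m=20` and `k=3.0`, and the sample readings are assumptions for illustration.

```python
from collections import deque
from statistics import mean, stdev


def rolling_sigma_flags(readings, m=20, k=3.0):
    """Flag readings more than k standard deviations away from the
    mean of the previous m readings."""
    window = deque(maxlen=m)  # FIFO: appending drops the oldest reading
    flags = []
    for x in readings:
        if len(window) == m:
            mu = mean(window)
            sigma = stdev(window)
            flags.append(sigma > 0 and abs(x - mu) > k * sigma)
        else:
            flags.append(False)  # not enough history yet
        window.append(x)
    return flags


# Hypothetical readings: steady pressure followed by a sudden drop.
readings = [50.0, 50.1, 49.9, 50.2, 49.8] * 4 + [45.0]
flags = rolling_sigma_flags(readings, m=20, k=3.0)
```

Note that this sketch adds every reading to the window, including flagged ones. During a long anomaly the window statistics therefore drift towards the abnormal level; you may prefer to keep flagged points out of the window.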



You can extend this with the generalized likelihood ratio test (GLRT) to determine when a new sample falls significantly outside the distribution assumed under the null hypothesis. Rejecting the null hypothesis indicates a state change: the new point came from a different distribution than the normal flow through the pipe.
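One common form of the GLRT is a test for a shift in the mean of Gaussian readings. The sketch below assumes the nominal mean `mu0` and standard deviation `sigma` are known (for example, estimated from a clean stretch of data). The null hypothesis is that every reading in the window has mean `mu0`; the alternative is that the mean changes to an unknown value at an unknown position. The function name and the example numbers are hypothetical.

```python
def glr_mean_shift(window, mu0, sigma):
    """Return (statistic, start) for the most likely mean shift.

    For each candidate start, the log-likelihood ratio maximised over
    the unknown post-change mean is S^2 / (2 * sigma^2 * L), where S is
    the sum of (x - mu0) over window[start:] and L is its length.
    """
    best_stat, best_start = 0.0, None
    tail_sum = 0.0
    n = len(window)
    # Walk backwards so tail_sum always covers window[start:].
    for start in range(n - 1, -1, -1):
        tail_sum += window[start] - mu0
        length = n - start
        stat = tail_sum ** 2 / (2.0 * sigma ** 2 * length)
        if stat > best_stat:
            best_stat, best_start = stat, start
    return best_stat, best_start


# Hypothetical window: the pressure drops by about 1.0 from index 4 on.
window = [50.0, 50.1, 49.9, 50.0, 49.0, 48.9, 49.1, 49.0]
stat, start = glr_mean_shift(window, mu0=50.0, sigma=0.1)
threshold = 10.0  # assumed value; tune it to your false-alarm tolerance
change_detected = stat > threshold
```

Besides the decision itself, `start` gives an estimate of when the change began, which helps when you need to report an interval such as "from 1 pm to 3:30 pm".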



Using machine learning

You can collect multiple instances of your time series and annotate them, based on your experience, as nominal or anomalous. This gives you a supervised two-class classification problem. First attempt some feature extraction using PCA, LDA or similar techniques. Then you can try the algorithms available through scikit-learn (SVM, random forests, k-NN, etc.).
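A minimal sketch of such a supervised pipeline with scikit-learn is shown below. The data is synthetic and stands in for your annotated windows; the window length, the size of the pressure drop and the choice of PCA with 5 components followed by an RBF SVM are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
m = 30  # assumed window length

# Synthetic stand-in for annotated windows (not real pipe data).
nominal = 50.0 + 0.1 * rng.standard_normal((200, m))
anomalous = 50.0 + 0.1 * rng.standard_normal((200, m))
anomalous[:, m // 2:] -= 2.0  # simulated pressure drop half-way through

X = np.vstack([nominal, anomalous])
y = np.array([0] * 200 + [1] * 200)  # 0 = nominal, 1 = anomalous

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, reduce dimensionality with PCA, then classify with an SVM.
model = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf"))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

On real data the classes will be far less separable than in this toy example, so evaluate with a held-out set and look at precision and recall for the anomalous class rather than accuracy alone.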



If there are significantly more nominal instances than anomalous ones, the class imbalance will bias your model. Anomaly detection algorithms are better suited to this type of problem. These algorithms learn the distribution to which your nominal set should belong. For a novel instance, they then evaluate the probability that it belongs to the learned distribution. If the probability is small, the algorithm flags the instance as anomalous.
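The simplest instance of this idea is to fit a single Gaussian to nominal readings only and flag new readings that are improbable under it. The sketch below uses `statistics.NormalDist` from the standard library; the readings, the helper name `is_anomalous` and the cut-off `alpha` are assumptions. scikit-learn also ships dedicated estimators such as `IsolationForest` and `OneClassSVM` if a single Gaussian is too crude for your data.

```python
from statistics import NormalDist

# Hypothetical readings recorded while the pipe was known to be healthy.
nominal_readings = [50.0, 50.1, 49.9, 50.2, 49.8, 50.0, 50.1, 49.9]

# Learn the nominal distribution (mean and standard deviation).
model = NormalDist.from_samples(nominal_readings)


def is_anomalous(x, model, alpha=0.001):
    """Flag x if its two-sided tail probability under the fitted
    nominal distribution is below alpha (an assumed cut-off)."""
    cdf = model.cdf(x)
    tail = 2.0 * min(cdf, 1.0 - cdf)
    return tail < alpha
```

For example, `is_anomalous(48.0, model)` flags a reading far below the nominal level, while a reading of 50.05 is accepted. No anomalous examples are needed for training, which is the main advantage when leaks are rare.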



For more information on anomaly detection for time series, refer to:

Using time series data from a sensor for ML

How to train model to predict events 30 minutes prior, from multi-dimensionnal timeseries







answered Feb 13 '18 at 7:11 by JahKnows


























