Binary classification for time series data
I am new to data science and would really appreciate any feedback on this problem.
I have a dataset with 450 subjects. A binary response (yes/no) is measured every week on each subject for a total of 104 weeks, and the average proportion of events is 30%. In addition, about 2000 features are measured each week on each subject to predict the weekly response. My questions are:
1- What is the best model to fit this data?
My concern about using machine learning models such as Random Forest or XGBoost is that they rely on the assumption that the data are independent and identically distributed (i.i.d.), which is not the case here because of the temporal correlation.
2- How do I split my data into train/test sets? For example, do I use 350 subjects with all 104 weeks of data for training and the other 100 subjects with all 104 weeks as the test set? Or do I use 80 weeks of data from all 450 subjects for training and the other 24 weeks from all 450 subjects as the test set?
Thanks
classification time-series
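To make the two options in question 2 concrete, here is a minimal sketch of both splits. It assumes the data is in long format with one row per subject-week, and the `subject` and `week` arrays are hypothetical stand-ins for the real index columns. It also assumes the 80 training weeks are the first 80, which the question does not actually state.

```python
import numpy as np

N_SUBJECTS = 450
N_WEEKS = 104

# Hypothetical long-format index: one row per (subject, week) pair.
subject = np.repeat(np.arange(N_SUBJECTS), N_WEEKS)
week = np.tile(np.arange(N_WEEKS), N_SUBJECTS)


def split_by_subject(subject, n_train_subjects=350, seed=0):
    """Option A: hold out whole subjects, so all weeks of a subject stay together."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.unique(subject))
    train_mask = np.isin(subject, ids[:n_train_subjects])
    return train_mask, ~train_mask


def split_by_time(week, n_train_weeks=80):
    """Option B: train on the earlier weeks of every subject, test on the later ones.

    Assumption: the training weeks are the first `n_train_weeks`.
    """
    train_mask = week < n_train_weeks
    return train_mask, ~train_mask


subj_train, subj_test = split_by_subject(subject)
time_train, time_test = split_by_time(week)

print(subj_train.sum(), subj_test.sum())  # 36400 10400 rows
print(time_train.sum(), time_test.sum())  # 36000 10800 rows
```

The boolean masks can be used to index a feature matrix with the same row order. Option A tests on unseen subjects, while option B tests on future weeks of subjects already seen in training.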
asked 9 mins ago
AbuSalim