Which Format is Faster, Matlab (.mat) or NumPy (.npy)?
I am working on a deep learning problem to detect cancer in images of size 250 x 250. I have hardware limitations and I have been running out of memory.
I decided to convert my images to MATLAB-formatted files (".mat"), which brought some improvement; however, I still run out of memory. I have explored some resources that highly recommend using NumPy files (".npy").
It would be costly to convert my images to NumPy files, so I would like to make sure that converting will make a difference. I am not asking for memory enhancement algorithms (e.g. batching), just the memory difference between ".mat" and ".npy" files.
Tags: python, deep-learning, matlab, numpy
edited Nov 16 '18 at 18:07 by from keras import michael
asked Jun 28 '18 at 10:53 by Andrew Naguib
Can you post your loading code? Can you post benchmarks, including memory usage?
(comment by Brian Spiering, Nov 10 '18 at 18:07)
3 Answers
From my research:
- Both can store data in a binary format.
- Both store the data type of the data.
- I am unsure about compression ratios and load times, which seem to be the subtext of your question.
One thing you don't seem to address is what you are loading the data into, or whether you are considering moving from a MATLAB environment to a Python environment or vice versa.
That said, I found this post useful, and it may help you if you have not seen it already. https://stackoverflow.com/a/10997335/3259054 Perhaps you could write a small script to sample some files and see the difference.
Have you considered the HDF5 format? If you are looking to make a change, you might as well test other options too, and HDF5 has a lot of momentum towards becoming the de facto standard for scientific computing.
Finally, purely out of a desire to learn, why are you concerned about the file format if you have memory constraints?
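The "small script" suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the sample is random data standing in for real images, the file names `sample.npy` and `sample.mat` and the variable name `images` are made up, and the `.mat` half relies on SciPy's `scipy.io.savemat` and `scipy.io.loadmat` being installed. It reports on-disk size and load time only; it says nothing about which format will be faster on your data.

```python
import os
import tempfile
import time

import numpy as np

try:
    from scipy.io import loadmat, savemat  # only needed for the .mat half
except ImportError:  # SciPy missing: the sketch then reports .npy only
    loadmat = savemat = None


def compare_formats(images, workdir):
    """Save `images` in each format; return file size and load time per format."""
    results = {}

    npy_path = os.path.join(workdir, "sample.npy")  # hypothetical file name
    np.save(npy_path, images)
    start = time.perf_counter()
    loaded = np.load(npy_path)
    results["npy"] = {
        "bytes": os.path.getsize(npy_path),
        "load_s": time.perf_counter() - start,
        "roundtrip_ok": bool(np.array_equal(loaded, images)),
    }

    if savemat is not None:
        mat_path = os.path.join(workdir, "sample.mat")  # hypothetical file name
        savemat(mat_path, {"images": images})  # "images" is an arbitrary variable name
        start = time.perf_counter()
        loaded = loadmat(mat_path)["images"]
        results["mat"] = {
            "bytes": os.path.getsize(mat_path),
            "load_s": time.perf_counter() - start,
            "roundtrip_ok": bool(np.array_equal(loaded, images)),
        }
    return results


# Hypothetical sample: 100 random 8-bit grayscale images of size 250 x 250.
rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(100, 250, 250), dtype=np.uint8)

with tempfile.TemporaryDirectory() as workdir:
    results = compare_formats(sample, workdir)

for name, stats in results.items():
    print(name, stats)
```

For a meaningful comparison, replace the random sample with a subset of the real images and repeat the timing several times, since a single load is dominated by disk caching effects.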
answered Nov 9 '18 at 14:42 by Skiddles
MATLAB is known for its memory consumption. So even if you process the same data in Python, overall system memory utilization will be lower in Python.
In my experience so far, using Python has helped me handle more data with better performance.
In addition, Python has many libraries and frameworks available out of the box to further improve overall performance and to support machine learning and deep learning (I am not sure whether similar libraries and frameworks are also available in MATLAB).
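One factor that can be checked directly, in either environment, is the element type: a dense array occupies its element count times its element size, whichever file it was loaded from. The sketch below is a rough illustration, assuming a hypothetical dataset of 1,000 single-channel 250 x 250 images; `array_megabytes` is a made-up helper, and `float64` corresponds to `double`, MATLAB's default numeric class.

```python
import numpy as np


def array_megabytes(shape, dtype):
    """Memory needed for a dense array of `shape` and `dtype`, in MB (10**6 bytes)."""
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize / 1e6


# Hypothetical dataset: 1,000 single-channel images of size 250 x 250.
shape = (1000, 250, 250)

for dtype in (np.uint8, np.float32, np.float64):
    print(np.dtype(dtype).name, array_megabytes(shape, dtype), "MB")
```

If the images are currently held as 64-bit floats, storing them as 8-bit integers and converting one batch at a time would shrink the resident data by a factor of eight, independent of the on-disk format.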
edited Nov 13 '18 at 3:11 by Stephen Rauch♦
answered Nov 13 '18 at 2:27 by mannu
One way to reduce in-memory bottlenecks is to handle data processing more efficiently (regardless of the on-disk format).
There are software frameworks designed to improve the training process, especially for loading images. Dask is one such framework for scaling existing Python workflows, so it will most likely reduce the memory bottleneck for .npy files relative to .mat files (the only way to be sure is to benchmark).
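As a minimal illustration of handling data more efficiently, the sketch below reads a .npy file in slices instead of loading it whole. Note that it uses plain NumPy memory mapping (`np.load` with `mmap_mode="r"`), not Dask; the file name `images.npy`, the helper `iter_batches`, and the array sizes are assumptions made up for the example.

```python
import os
import tempfile

import numpy as np


def iter_batches(npy_path, batch_size):
    """Yield batches from a .npy file, copying only one batch into memory at a time."""
    data = np.load(npy_path, mmap_mode="r")  # memory-mapped: array data stays on disk
    for start in range(0, data.shape[0], batch_size):
        # np.array(...) copies just this slice from the mapped file into RAM.
        yield np.array(data[start:start + batch_size])


# Hypothetical dataset: 64 blank 8-bit images of size 250 x 250.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "images.npy")  # hypothetical file name
    np.save(path, np.zeros((64, 250, 250), dtype=np.uint8))

    batch_shapes = [batch.shape for batch in iter_batches(path, batch_size=16)]

print(batch_shapes)
```

Memory mapping applies to uncompressed .npy files holding a plain numeric array; whether it helps in practice depends on the access pattern and storage speed, so it is worth benchmarking alongside the other options.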
edited Nov 10 '18 at 18:03
answered Nov 10 '18 at 17:51 by Brian Spiering