Why is the result of a simulation study not accepted as evidence of a limitation of a method?
$begingroup$
I am working with mixture models and have developed a new estimation method based on the EM algorithm. I simulated data from a mixture model and applied my method to it; the results were very satisfying. For comparison, I also fitted a non-mixture model to the same data, and it gave inaccurate results, as expected. I used this as evidence that the non-mixture model (in a specific application area) cannot handle mixture dependency. Someone pointed out that this is not surprising, because the data were generated from a mixture. I already knew that; my aim was to make the reader aware of why the mixture model matters and how the non-mixture model fails in such cases. He then asked me to apply both the non-mixture and the mixture model to real data and compare the results. The data set I used is a general one: I only wanted to test the models on it, and I have no information about the experiment behind it. I have read that a comparison on real data is only fair if we understand the data or have a strong background on it. For example, suppose I fit models to data that I do not know well. Suppose further that model A (non-mixture) is fitted with several different distributions (say, arbitrary Gaussian models), while model B (mixture) is fitted with only one specific Gaussian mixture. Then model A may well outperform model B. If, however, we know the data well and fit the most appropriate mixture model, then model B is likely to fit better than model A.
My question is: why is a simulation study not trusted to illustrate the problem when we are not interested in a specific data set, or only have data with no experimental background? In other words, if I only need to illustrate one point, why are simulated data not enough?
Edit:
Is it fair to compare model A with model B on data about which I do not have enough information or knowledge? Poor knowledge of the data may make model A fit better than model B. I think a fair comparison is only possible if we know the data well and therefore fit the most appropriate model before comparing. That is, to compare two models on real data, I should know enough about the data. Otherwise, if I fit the wrong model, even a mixture model, to real mixture data, the non-mixture model may fit better than the mixture model simply because I chose the wrong mixture model. Is that correct? In that case, even though the non-mixture model shows a better fit than the mixture model, it still gives the wrong fit, because the data are a mixture. Hence, in this case, my simulated data are a good way to illustrate the limitation of the non-mixture model.
mixed-model simulation fitting
$endgroup$
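To make the setup in the question concrete, here is a minimal sketch in Python (standard library only). It is not the asker's method: the two-component 1-D Gaussian mixture, the mixing weight 0.4, the component means -2 and 3, the sample size 500, and all function names are assumptions chosen for illustration. It simulates data from the mixture, fits a single Gaussian (model A) and a two-component mixture by plain EM (model B), and compares the two by BIC.

```python
import math
import random


def norm_logpdf(x, mu, sd):
    """Log-density of a normal distribution with mean mu and std sd."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2


def fit_single(xs):
    """Model A (assumed): maximum-likelihood fit of one Gaussian."""
    n = len(xs)
    mu = sum(xs) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    loglik = sum(norm_logpdf(x, mu, sd) for x in xs)
    return {"mu": mu, "sd": sd, "loglik": loglik, "n_params": 2}


def fit_mixture_em(xs, iters=200):
    """Model B (assumed): plain EM for a two-component 1-D Gaussian mixture."""
    s, n = sorted(xs), len(xs)
    m0 = sum(xs) / n
    sd0 = math.sqrt(sum((x - m0) ** 2 for x in xs) / n)
    # Crude start: means at the quartiles, both components with the overall std.
    w, mu, sd = [0.5, 0.5], [s[n // 4], s[3 * n // 4]], [sd0, sd0]

    def comp_logs(x):
        return [math.log(w[k]) + norm_logpdf(x, mu[k], sd[k]) for k in range(2)]

    def logpdf(x):
        a, b = comp_logs(x)
        m = max(a, b)  # log-sum-exp for numerical stability
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        r0 = [min(1.0, math.exp(comp_logs(x)[0] - logpdf(x))) for x in xs]
        resp = [r0, [1.0 - r for r in r0]]
        # M-step: weighted updates of weights, means and standard deviations.
        for k in range(2):
            nk = sum(resp[k])
            w[k] = nk / n
            mu[k] = sum(r * x for r, x in zip(resp[k], xs)) / nk
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp[k], xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)  # floor guards against collapse

    loglik = sum(logpdf(x) for x in xs)
    return {"w": w, "mu": mu, "sd": sd, "loglik": loglik, "n_params": 5}


def bic(fit, n):
    """Bayesian information criterion; lower is better."""
    return fit["n_params"] * math.log(n) - 2.0 * fit["loglik"]


# Simulated data: the generating model is the mixture itself (assumed settings).
rng = random.Random(0)
data = [rng.gauss(-2, 1) if rng.random() < 0.4 else rng.gauss(3, 1) for _ in range(500)]

fit_a = fit_single(data)
fit_b = fit_mixture_em(data)
print("BIC, single Gaussian (A):", round(bic(fit_a, len(data)), 1))
print("BIC, 2-comp. mixture (B):", round(bic(fit_b, len(data)), 1))
print("Estimated component means:", [round(m, 2) for m in fit_b["mu"]])
```

With components this well separated, model B should come out with the lower BIC. That is the unsurprising result the reviewer objected to, since the data were generated from model B's own family.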
asked 1 hour ago by Maryam (edited 35 mins ago)
1 Answer
$begingroup$
Simulation studies showing that a method works great when the data-generating model and the analysis model are the same are very common. What people really want to see is more general:
- The model performing well when the data-generating mechanism has all the complexity of real life. There is a lot of judgement here, but some aspects of the data-generating mechanism may have a much bigger impact than others. Simulations are actually great for exploring that, but are too often poorly done.
- Don't just knock down a strawman; compare against all the reasonable / frequently used methods. E.g. adjustment for covariates might make omitting a random effect less important.
- The differences in performance need to be striking enough that they truly matter in practice. A good example can also help here to illustrate that one can get strikingly different conclusions.
$endgroup$
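The first bullet can be explored with a small simulation that varies the data-generating mechanism instead of fixing it to the analysis model. The sketch below is only an illustration under assumed settings (three hypothetical scenarios, n = 400, a plain two-component EM, standard-library Python; none of this comes from the answer). It reports the mean held-out log-likelihood gain of a two-component Gaussian mixture (B) over a single Gaussian (A) when the data come from a mixture, from a single Gaussian, and from a skewed exponential distribution that matches neither model.

```python
import math
import random


def norm_logpdf(x, mu, sd):
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2


def fit_gauss(xs):
    """Model A (assumed): single Gaussian; returns its log-density function."""
    mu = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))
    return lambda x: norm_logpdf(x, mu, sd)


def fit_mix2(xs, iters=200):
    """Model B (assumed): two-component Gaussian mixture via plain EM."""
    s, n = sorted(xs), len(xs)
    m0 = sum(xs) / n
    sd0 = math.sqrt(sum((x - m0) ** 2 for x in xs) / n)
    w, mu, sd = [0.5, 0.5], [s[n // 4], s[3 * n // 4]], [sd0, sd0]

    def comp_logs(x):
        return [math.log(w[k]) + norm_logpdf(x, mu[k], sd[k]) for k in range(2)]

    def logpdf(x):
        a, b = comp_logs(x)
        m = max(a, b)  # log-sum-exp for numerical stability
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    for _ in range(iters):
        # E-step: responsibilities; M-step: weighted parameter updates.
        r0 = [min(1.0, math.exp(comp_logs(x)[0] - logpdf(x))) for x in xs]
        resp = [r0, [1.0 - r for r in r0]]
        for k in range(2):
            nk = sum(resp[k])
            w[k] = nk / n
            mu[k] = sum(r * x for r, x in zip(resp[k], xs)) / nk
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp[k], xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
    return logpdf  # closure over the final w, mu, sd


def simulate(kind, n, rng):
    """Hypothetical data-generating mechanisms, chosen for illustration only."""
    if kind == "mixture":   # matches model B
        return [rng.gauss(-2, 1) if rng.random() < 0.4 else rng.gauss(3, 1) for _ in range(n)]
    if kind == "gaussian":  # matches model A
        return [rng.gauss(0, 1) for _ in range(n)]
    if kind == "skewed":    # matches neither model
        return [rng.expovariate(1.0) for _ in range(n)]
    raise ValueError(kind)


def heldout_gain(kind, n=400, seed=1):
    """Mean held-out log-likelihood of B minus that of A (positive favours B)."""
    rng = random.Random(seed)
    train, test = simulate(kind, n, rng), simulate(kind, n, rng)
    lp_a, lp_b = fit_gauss(train), fit_mix2(train)
    return sum(lp_b(x) - lp_a(x) for x in test) / n


gains = {kind: heldout_gain(kind) for kind in ("mixture", "gaussian", "skewed")}
for kind, g in gains.items():
    print(f"{kind:9s} mean held-out log-lik gain of B over A: {g:+.3f}")
```

A fuller study would repeat each scenario over many seeds and add the other commonly used competitors mentioned in the second bullet; a single seed per scenario is only enough to show the shape of the comparison.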
$begingroup$
Thank you so much for your answer. I appreciate it. I have edited my question.
$endgroup$
– Maryam
31 mins ago
1
$begingroup$
I guess part of the problem is: why should anyone care that omitting a random effect from a model matters when there is one in the data-generating process, if they have no idea whether this can realistically occur in practice? Also, does the form of the random effect matter, etc.?
$endgroup$
– Björn
30 mins ago
answered 46 mins ago by Björn