Why do we not accept the result of our simulation study as evidence of a limitation of one method?

I am working on mixture models and have developed a new estimation method based on the EM algorithm. I simulated data from a mixture model and applied my new method to it; the results were very satisfying. For comparison, I also fitted a non-mixture model, which, as expected, gave inaccurate results. I used this as evidence that the non-mixture model (in a specific area) cannot deal with mixture dependency. Someone told me that this is not surprising, since the data are mixture data. I already knew that; my aim was to make the reader aware of the importance of the mixture model and of how the non-mixture model fails in such cases. He then asked me to apply both the non-mixture and the mixture model to real data and compare the results. The data I have used are generic (I simply wanted to test the model on them and have no experimental information about them). I have read that, for real data, we should understand the data or have a strong background in them, otherwise the comparison is not fair. For example, suppose I fit models to data that I do not know well. Suppose further that the first model (model A, non-mixture) is allowed to use various distributions (say, arbitrary Gaussian models), while the mixture model (model B) is restricted to one specific Gaussian mixture. Then it is possible that model A outperforms model B. However, if we have good knowledge of our data and fit the most appropriate mixture model, the probability that model B fits the data better than model A is high.
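To make the setup concrete, here is a minimal sketch of the kind of comparison described above. It is not my actual method: the data-generating mixture, its parameters, and the use of scikit-learn's GaussianMixture as the EM-fitted mixture model are all illustrative assumptions.

```python
# Minimal sketch of the simulation study described above:
# simulate from a two-component Gaussian mixture, then fit
# (a) a single Gaussian ("non-mixture" model A) and
# (b) a two-component Gaussian mixture via EM ("mixture" model B).
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# True data-generating process: 0.4 * N(-2, 1) + 0.6 * N(3, 0.5^2)
n = 1000
z = rng.random(n) < 0.4
x = np.where(z, rng.normal(-2.0, 1.0, n), rng.normal(3.0, 0.5, n))
X = x.reshape(-1, 1)

# Model A: single Gaussian, maximum-likelihood fit
mu_hat, sigma_hat = x.mean(), x.std(ddof=0)
loglik_A = norm.logpdf(x, mu_hat, sigma_hat).sum()

# Model B: two-component Gaussian mixture fitted by EM
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
loglik_B = gmm.score(X) * n  # score() returns mean log-likelihood per point

print(f"log-likelihood, single Gaussian : {loglik_A:.1f}")
print(f"log-likelihood, 2-comp mixture  : {loglik_B:.1f}")
# Because the data really are a mixture, model B is essentially guaranteed
# to fit better here -- which is exactly the objection raised: this design
# can only confirm what was built into the simulation.
```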



My question is: why do we not trust a simulation study to illustrate the problem when we are not interested in a specific data set, or when we only have data with no experimental knowledge attached? In other words, since I only need to illustrate one point, why are simulated data not enough?



New edit



In other words,



My point is: is it fair to compare model A with model B when I do not have enough information about, or knowledge of, the data at hand? That lack of knowledge may make model A fit the data better than model B. I think that, in this case, a fair comparison can only be made if we know the data well and therefore fit the most appropriate model before comparing. That is, to compare two models on real data, I should have enough knowledge about the data. Otherwise, if I fit the wrong model, even a wrong mixture model, to real mixture data, the non-mixture model may fit the data better than the mixture model simply because my mixture model is misspecified. Is that correct? In that case, even though the non-mixture model shows a better fit than the mixture model, it still gives me the wrong fit (because the data are a mixture). Hence my simulated data are well suited to illustrating the limitation of the non-mixture model.
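The misspecification concern can be illustrated with a small sketch. The data below really are a mixture, but a "wrong" mixture (two spherical components forced onto strongly correlated component data) can lose to a flexible non-mixture model on held-out likelihood. The data, the covariance structures, and the use of scikit-learn are illustrative assumptions, not a statement about any particular real data set.

```python
# Illustrative sketch: the data are a mixture, but if we fit the *wrong*
# mixture (model B: two spherical components), a flexible non-mixture model
# (model A: one full-covariance Gaussian) can still fit held-out data better.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# True DGP: two strongly correlated (thin, elongated) Gaussian components.
cov = np.array([[1.0, 0.95], [0.95, 1.0]])
n = 2000
z = rng.random(n) < 0.5
comp0 = rng.multivariate_normal([-1.0, -1.0], cov, size=n)
comp1 = rng.multivariate_normal([1.0, 1.0], cov, size=n)
X = np.where(z[:, None], comp0, comp1)
X_train, X_test = X[:1000], X[1000:]

# Model A (non-mixture): single Gaussian with full covariance, ML fit.
mu = X_train.mean(axis=0)
S = np.cov(X_train, rowvar=False)
loglik_A = multivariate_normal.logpdf(X_test, mu, S).mean()

# Model B (misspecified mixture): two *spherical* components, fitted by EM.
gmm_bad = GaussianMixture(n_components=2, covariance_type="spherical",
                          random_state=0).fit(X_train)
loglik_B = gmm_bad.score(X_test)  # mean held-out log-likelihood per point

# Model B' (well-specified mixture): two full-covariance components.
gmm_good = GaussianMixture(n_components=2, covariance_type="full",
                           random_state=0).fit(X_train)
loglik_B2 = gmm_good.score(X_test)

print(f"model A  (single full-cov Gaussian) : {loglik_A:.3f}")
print(f"model B  (wrong mixture, spherical) : {loglik_B:.3f}")
print(f"model B' (right mixture, full cov)  : {loglik_B2:.3f}")
# Typically A beats the misspecified mixture B, while the well-specified
# mixture B' beats both -- i.e. the real-data comparison hinges on
# specifying the mixture sensibly, which is the point made above.
```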

Tags: mixed-model simulation fitting

asked 1 hour ago by Maryam (edited 35 mins ago)

1 Answer

Simulation studies showing that a method does well when the data-generating model and the analysis model coincide are very common. What people really want to see is more general:




1. The model performing well when the data-generating mechanism has all the complexity of real life. There is a lot of judgement here, and some aspects of the data-generating mechanism may have a much bigger impact than others. Simulations are actually great for exploring that, but are too often done poorly.

2. Don't just knock down a straw man; compare against all the reasonable / frequently used methods. E.g., adjustment for covariates might make omitting a random effect less important.

3. The differences in performance need to be striking enough that they truly matter in practice. A good example can also help here to illustrate that one can get strikingly different conclusions. (A sketch of a small simulation grid along these lines follows below.)
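As one way to make points 1 and 3 concrete, here is a minimal sketch of a simulation grid that varies how well separated the mixture components are and records how often, and by how much, the mixture model is preferred over a single Gaussian by BIC. The grid values, sample size, and use of scikit-learn's GaussianMixture are illustrative assumptions only, not part of the original answer.

```python
# Minimal sketch of a simulation grid: vary how "mixture-like" the data are
# and record how often, and by how much, the mixture model is preferred.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
n, n_reps = 500, 100

for sep in (0.5, 1.0, 2.0, 4.0):        # separation between component means
    wins, deltas = 0, []
    for _ in range(n_reps):
        z = rng.random(n) < 0.5
        x = np.where(z, rng.normal(-sep / 2, 1.0, n),
                        rng.normal(sep / 2, 1.0, n))
        X = x.reshape(-1, 1)
        bic_single = GaussianMixture(n_components=1, random_state=0).fit(X).bic(X)
        bic_mix = GaussianMixture(n_components=2, random_state=0).fit(X).bic(X)
        deltas.append(bic_single - bic_mix)  # > 0 means mixture preferred
        wins += bic_single > bic_mix
    print(f"separation {sep:3.1f}: mixture preferred in {wins}/{n_reps} runs, "
          f"mean BIC advantage {np.mean(deltas):7.1f}")
# With well-separated components the mixture wins clearly; with heavy
# overlap the advantage can be negligible -- which is the kind of
# "does it matter in practice" evidence asked for in point 3.
```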






answered 46 mins ago by Björn

• Thank you so much for your answer. I appreciate it. I have edited my question. – Maryam, 31 mins ago






• I guess part of the problem is: why should anyone care that omitting a random effect from a model matters, if there is one in the data-generating process, when they have no idea whether this can realistically occur in practice? Also, does the form of the random effect matter, etc.? – Björn, 30 mins ago