Posts from this period: 2006-02 (4)


I met with Marianne, a postdoc in the School of Nursing at UNC-CH,
to discuss some remaining problems in her publication. Those
problems confused me as well, and my supervisor Mark didn't
have time to join our meeting last Friday. So I stopped
by Mark's office again today (Monday). The meeting with Mark
led to some useful conclusions.



The first question is whether the Likelihood Ratio Test (LRT) is
suitable for comparing two models that have the same independent
variables when some of them are coded differently. For example,
one variable is continuous in one model; what if the same model
uses a categorical version of it instead? That's the key point of
this question. The confusing part is that the LRT applies only to
nested models, and I was not sure whether this situation counts
as nested. Mark told me it does: we can regard the model with the
continuous variable as the reduced model and the model with the
categorical variable as the full model, because the categorical
coding estimates more parameters. Under this assumption, we can
use the LRT as usual.
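
The comparison Mark described can be sketched in a few lines. This is only a minimal illustration, not the SAS code used for Marianne's analysis: the log-likelihoods, the parameter counts, and the function names are hypothetical. It also assumes the categorical levels are exactly the distinct values of the continuous variable, so that the linear coding is a constrained special case of the categorical one.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def likelihood_ratio_test(loglik_reduced, loglik_full, n_params_reduced, n_params_full):
    """Return (statistic, df, p_value) for a reduced model nested in a full model."""
    df = n_params_full - n_params_reduced
    if df <= 0:
        raise ValueError("the full model must have more parameters than the reduced model")
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, df, chi2_sf(stat, df)

# Hypothetical values: one linear term (reduced, 3 parameters in total) versus
# a 4-level categorical coding with 3 dummy variables (full, 5 parameters in total).
stat, df, p = likelihood_ratio_test(-120.4, -118.9, n_params_reduced=3, n_params_full=5)
print(f"LRT = {stat:.2f}, df = {df}, p = {p:.3f}")
```

A large p-value here would mean the categorical coding does not fit clearly better than the linear term.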



The second question is more complicated. Marianne had already
finished the model selection and just needed an LRT to confirm
that her final models are the best ones. For that, the LRT result
should be non-significant, with a large p-value, so that we do
not reject the null hypothesis, which is the reduced (final)
model. However, the result contradicted this: it was significant.
I checked the original SAS code and found no problem there. Mark
then pointed out that, based on Marianne's study design, she
needs to keep two important variables in the model whether or not
they are significant. After including those two variables in the
model, the conflict disappeared. Still, I wondered whether the
two variables overlap heavily, because both are geographic
variables and are very similar. I dropped the less important one
and fit the model again, and the result looked better. The lesson
from this problem is that we need to know the variables well
before fitting models, which avoids this kind of confusion.
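
One quick way to check this kind of overlap before fitting is to look at the correlation between the two variables. The sketch below assumes both variables are numeric; the variable names and values are hypothetical and are not Marianne's data. With exactly two predictors, the variance inflation factor of each one is 1 / (1 - r^2).

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """Variance inflation factor shared by both predictors when there are only two."""
    r = pearson_r(x, y)
    if r * r >= 1.0:
        return math.inf  # perfectly collinear
    return 1.0 / (1.0 - r * r)

# Hypothetical geographic variables that measure nearly the same thing.
distance_km = [2.0, 5.0, 9.0, 14.0, 20.0, 27.0]
travel_min = [6.0, 11.0, 20.0, 27.0, 41.0, 52.0]

r = pearson_r(distance_km, travel_min)
print(f"r = {r:.3f}, VIF = {vif_two_predictors(distance_km, travel_min):.1f}")
```

A correlation close to 1 suggests the two variables carry almost the same information, which is a reason to consider keeping only one of them.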



The two solutions have already been emailed to Marianne. I hope
she finds them useful.












Posted by cchien on PIXNET. Comments (2)

June Cho is a Korean woman who is a postdoc in the School of Nursing at UNC-CH. I worked on her dissertation from December 2004 to May 2005, and she graduated smoothly in July 2005. Her husband is a professor in the School of Pharmacy. I guess they have become U.S. citizens. After she graduated, she stayed here as her advisor's postdoc and keeps doing further research that builds on her dissertation.



She wants to run a two-way ANOVA to compare simple main effects in her current study. It's very easy, but she just needs my confirmation. I wrote a macro for her, so she can simply call it to fit all of her models (18 models). However, simple main effects are normally examined only when the interaction term is significant. I ran only one model, and its interaction term is significant, but I expect that not all of the models will have significant interaction terms. Yet simple main effects are the sole purpose of her current research. How can we examine them when the interaction is non-significant?



As usual, I asked my supervisor, Mark. He said that even when the interaction term is not significant, we can still keep it in the GLM model. Therefore, we can report the simple main effects from all 18 models in a consistent way, because all of them include the interaction term. This could be a more suitable conclusion for the discussion section.
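
To make the idea concrete, here is a minimal sketch of simple main effects computed from a model that always keeps the interaction. It is not the SAS macro written for June; the factor names and data are hypothetical. The error term is the pooled within-cell mean square, which is what the full model (both main effects plus the interaction) leaves as residual.

```python
from collections import defaultdict

def simple_main_effects(records):
    """records: iterable of (a_level, b_level, y).
    Returns {b_level: (F, df_effect, df_error)} for the effect of factor A
    within each level of factor B."""
    cells = defaultdict(list)
    for a, b, y in records:
        cells[(a, b)].append(y)

    # Pooled within-cell error from the full model with the interaction term.
    sse, n_total = 0.0, 0
    for ys in cells.values():
        mean = sum(ys) / len(ys)
        sse += sum((y - mean) ** 2 for y in ys)
        n_total += len(ys)
    df_error = n_total - len(cells)
    mse = sse / df_error

    results = {}
    for b in sorted({b_level for _, b_level in cells}):
        groups = [ys for (_, b_level), ys in cells.items() if b_level == b]
        all_y = [y for ys in groups for y in ys]
        grand = sum(all_y) / len(all_y)
        # Between-group sum of squares for factor A, restricted to this level of B.
        ss_a = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in groups)
        df_a = len(groups) - 1
        results[b] = (ss_a / df_a / mse, df_a, df_error)
    return results

# Hypothetical data: factor A = group, factor B = sex.
data = (
    [("ctrl", "F", y) for y in (10, 12, 11)]
    + [("trt", "F", y) for y in (15, 17, 16)]
    + [("ctrl", "M", y) for y in (9, 11, 10)]
    + [("trt", "M", y) for y in (10, 12, 11)]
)
res = simple_main_effects(data)
for level, (f_stat, df1, df2) in res.items():
    print(f"A within B={level}: F({df1}, {df2}) = {f_stat:.2f}")
```

Because the same error term is used whether or not the interaction is significant, the 18 models can be reported in the same way.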














-----


In some statistical analyses, we'd like to test the assumption of
normality before analyzing the data. In the univariate case, we
all know that a Q-Q plot and the K-S statistic can be used to
assess normality. But what about the multivariate normal
distribution?



Mardia's statistic is a test for multivariate normality. It is
based on functions of skewness and kurtosis, and Mardia's PK
should be less than 3 for the assumption of multivariate
normality to be considered met. However, neither SAS nor SPSS
offers a simple statement in any procedure that performs it.
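
For reference, the multivariate skewness and kurtosis that Mardia's test is built on can be computed directly. The sketch below uses only the Python standard library; the function names and the sample data are hypothetical. It returns the two Mardia statistics with their usual large-sample test statistics, and does not reproduce how a particular macro turns them into the PK value mentioned above.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mardia(data):
    """data: list of n observations, each a list of p values.
    Returns (b1p, b2p, skew_chi2, skew_df, kurt_z)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Maximum-likelihood covariance (divisor n), as in Mardia's definition.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(p)] for i in range(p)]
    inv = invert(cov)
    # proj[i] = S^{-1} (x_i - mean)
    proj = [[sum(inv[j][k] * c[k] for k in range(p)) for j in range(p)] for c in centered]

    def d(i, j):
        return sum(centered[i][k] * proj[j][k] for k in range(p))

    b1p = sum(d(i, j) ** 3 for i in range(n) for j in range(n)) / n ** 2
    b2p = sum(d(i, i) ** 2 for i in range(n)) / n
    skew_chi2 = n * b1p / 6.0                 # approx. chi-square with skew_df df
    skew_df = p * (p + 1) * (p + 2) // 6
    kurt_z = (b2p - p * (p + 2)) / math.sqrt(8.0 * p * (p + 2) / n)  # approx. N(0, 1)
    return b1p, b2p, skew_chi2, skew_df, kurt_z

# Hypothetical bivariate sample.
sample = [[2.1, 3.4], [1.8, 2.9], [3.0, 4.1], [2.5, 3.8],
          [1.2, 2.2], [2.9, 3.5], [3.6, 4.9], [2.2, 3.1]]
b1p, b2p, skew_chi2, skew_df, kurt_z = mardia(sample)
print(f"skewness b1p = {b1p:.3f}, kurtosis b2p = {b2p:.3f}, z = {kurt_z:.3f}")
```

Under multivariate normality the kurtosis is expected to be close to p(p + 2), which is 8 for two variables. The large-sample approximations are rough for a sample as small as this one.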



In SAS, we need to use a macro to calculate Mardia's PK
statistic. SAS Institute Inc. released the code on its official
website. Please check the following link:



http://support.sas.com/ctx/samples/index.jsp?sid=480



Similarly, in SPSS, we need to use a macro to examine
bivariate/multivariate normality. Check it:



http://www.columbia.edu/~ld208/














-----


Lindsey Austin is a master's student (I guess) who works for a professor in some role (I am not sure whether she is a TA). Her professor asked her to analyze some records to assess students' study ability. However, she is not good at statistics, so she sent the data set to me.



The question is very simple: how to calculate the correlation between individual scores and GPA in reading, math, science, and fundamentals for some courses. The individual score variable is on a continuous scale (0 to 100), but the GPA is ordinal (A+, A, A-, ..., F).



In correlation analysis, there are three correlation coefficients we often use: Pearson, Kendall's tau, and Spearman. However, none of them is designed specifically for the "scale vs. ordinal" case.



I wondered whether there is some special correlation coefficient that I don't know of. I checked the SAS manual for "PROC CORR", but found no special correlation there. My supervisor, Mark, even pulled out his old handouts (he also graduated from the biostatistics department at UNC-CH) to look for one, but found nothing either.



Finally, we concluded that we can rank the individual score variable and use the Spearman correlation.
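
The approach we settled on can be sketched as follows. This is a minimal illustration, not the SAS analysis itself: the full grade scale in GRADE_ORDER is an assumption (the data set only tells us the scale runs from A+ down to F), and the sample scores and grades are hypothetical. Both variables are converted to ranks, with ties given their average rank, and the Pearson correlation of the ranks is the Spearman coefficient.

```python
import math

# Assumed ordering of letter grades from lowest to highest (hypothetical scale).
GRADE_ORDER = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_score_vs_grade(scores, grades):
    """Spearman correlation between numeric scores and ordinal letter grades."""
    grade_codes = [GRADE_ORDER.index(g) for g in grades]
    return pearson_r(average_ranks(scores), average_ranks(grade_codes))

# Hypothetical records: reading score (0 to 100) and course grade.
scores = [95, 88, 72, 60, 81, 88, 55]
grades = ["A", "B+", "C", "C", "B", "A-", "D"]
print(f"Spearman rho = {spearman_score_vs_grade(scores, grades):.3f}")
```

Only the order of the grades matters here, so the spacing between letter grades never has to be assumed.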



This is a pretty special case. I think there should be a specific correlation for this situation, but we haven't figured it out. If we do, I will post it here.











