I would like to group the samples into the following please:
Malignant cell samples are: AOCS1, G33, G164
Fibroblasts are: G342, G351, G369
After grouping them into the two sample categories, Malignant and Fibroblast, I would like to run a t-test for each row (gene),
For example,
I am using RStudio.
I am new to this kind of analysis so any help will be greatly appreciated.
Many Thanks,
Ishack
For a problem like this you want to look into the apply function in R. This function will let you perform a function row-wise or column-wise on a dataframe or matrix.
From the help menu:
apply(X, MARGIN, FUN, ...)
where X is your dataframe/matrix, MARGIN is either 1 for row-wise or 2 for column-wise, and FUN is the function that you want to perform. Depending on what you want to do, you can have a base FUN such as median or sum, or you can define your own function(x), where x is each row (or column) in your dataframe.
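As a quick illustration of the MARGIN and FUN arguments described above, here is a minimal sketch on a tiny made-up matrix. The object names (m, row_sums, col_medians, row_ranges) are only for illustration and do not come from the question.

```r
# Small made-up matrix: 2 rows, 3 columns
m <- matrix(1:6, nrow = 2, byrow = TRUE)

# MARGIN = 1: apply a base function to each row
row_sums <- apply(m, 1, sum)

# MARGIN = 2: apply a base function to each column
col_medians <- apply(m, 2, median)

# A user-defined function(x), where x is one row at a time
row_ranges <- apply(m, 1, function(x) max(x) - min(x))
```

Each call returns one value per row (or per column), which is the same pattern used for the per-gene t-test.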
So for the example of dataframe df where columns 2-4 are Malignant and 5-7 are fibroblast you can run:
pValues <- apply(df, 1, function(x) t.test(x[2:4],x[5:7])$p.value)
This will take df and for each row (indicated by the 1, as opposed to each column) it will perform function(x), whereby a t-test is performed on the elements 2-4 compared to 5-7, and the p-value is reported (hence the $p.value). This will perform that function for each row, and store the p-values in the vector pValues.
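The same idea can be sketched with columns picked by sample name instead of by position, which avoids having to count columns. This is only a sketch under assumptions: the expression values below are random numbers made up for illustration, and the gene names (gene1 to gene5) and the object name expr are hypothetical. Only the sample names come from the question. Note that apply() converts a data frame to a matrix first, so a non-numeric column (for instance gene names) would turn every value into text. The sketch therefore keeps gene names as row names of a numeric matrix.

```r
set.seed(1)

# Sample names taken from the question
malignant  <- c("AOCS1", "G33", "G164")
fibroblast <- c("G342", "G351", "G369")

# Hypothetical expression values: 5 genes x 6 samples (random, illustration only)
expr <- matrix(rnorm(30, mean = 10), nrow = 5,
               dimnames = list(paste0("gene", 1:5), c(malignant, fibroblast)))

# One t-test per row (gene); each row x is a named vector,
# so the two groups can be selected by sample name
pValues <- apply(expr, 1, function(x) t.test(x[malignant], x[fibroblast])$p.value)
```

The result is a named numeric vector with one p-value per gene.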
Hi, I applied this line: pValues <- apply(df, 1, function(x) t.test(x[2:4],x[5:7])$p.value)
But I got the following error: Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") : missing value where TRUE/FALSE needed
Can you shed some light on this? Many thanks, Chris
Here is my data: https://drive.google.com/open?id=1LiJD7T6oR5MtABwYqkhUrJFfo7XRxJ_z
Hi Chris, can you try the following please?
pValues <- apply(df, 1, function(x) t.test(x[1:3],x[4:6])$p.value)
I think you are referring to the wrong column numbers. Your data has 6 columns: columns 1-3 are group 1 and columns 4-6 are group 2.
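If it is unclear which rows are causing trouble, one option is to check the layout first and then wrap the test so that a failing row gives NA instead of stopping the whole run. The following is a rough sketch, not a diagnosis of the linked file: the matrix is made up, safe_p is a hypothetical helper name, and the 1:3 / 4:6 column layout is taken from the reply above.

```r
# Made-up numeric matrix: 3 genes x 6 samples
# (columns 1-3 = group 1, columns 4-6 = group 2)
expr <- rbind(
  geneA = c(5.1, 4.8, 5.5, 7.9, 8.3, 8.1),
  geneB = c(2, 2, 2, 2, 2, 2),          # constant row: t.test() stops with an error
  geneC = c(1.0, 1.4, 0.9, 1.2, 1.1, 1.3)
)

# Confirm the data are numeric and have the expected number of columns
stopifnot(is.numeric(expr), ncol(expr) == 6)

# safe_p (hypothetical helper): returns NA instead of aborting when t.test() fails
safe_p <- function(x, g1, g2) {
  tryCatch(t.test(x[g1], x[g2])$p.value, error = function(e) NA_real_)
}

# Extra arguments after FUN are passed on to safe_p for every row
pValues <- apply(expr, 1, safe_p, g1 = 1:3, g2 = 4:6)

# Rows where the test could not be computed, worth inspecting by hand
failed <- names(pValues)[is.na(pValues)]
```

Running dim(df) and str(df) on the real data before choosing the column indices is a cheap way to confirm how many columns there are and whether they are all numeric.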