Below is the outline of a mock coursework assignment I need to complete at university. If you accept the assignment, I will send a few more notes to help, plus a dataset.
This is the scenario:
Whilst you are working on a foot ulcer trial application as a medical statistician, a colleague asks you to
provide assistance in designing a surgical trial in colorectal cancer. A surgeon has
approached your department for help with an application for a trial looking at
changes in patient weight before and after a surgical procedure.
In the consultancy session, the discussion turned to some datasets from a trial of surgical
techniques for colorectal cancer that you have previously worked on, as these data could help inform the design of your colleague’s trial. The discussion concluded that the key parameter is the COEFFICIENT OF VARIATION (CV) of the patients’ Body Mass Index (BMI). To complicate matters, your colleague needs you to provide not only the CV for your data, but also a
plausible range of values that the CV could take.
The CV is defined as the standard deviation divided by the mean. An expression for
the distribution of a CV is not straightforward. While SAS can provide a standard
error and confidence interval for the mean of observations, a variance or standard
error for the CV is not available by default.
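For reference, the usual sample estimate of the CV can be written as below. The notation is an assumption and not part of the brief: x_i is the BMI of patient i and n is the number of patients.

```latex
\[
\widehat{\mathrm{CV}} = \frac{s}{\bar{x}}, \qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
\]
```

Note that the CV statistic in PROC MEANS is reported as a percentage, i.e. 100 times this ratio.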
In situations where you need to know the distribution of a parameter, and cannot easily obtain this analytically, you can use computationally-intensive methods to simulate a possible distribution based on the underlying data. One such approach is the non-parametric bootstrap using PROC SURVEYSELECT. A call to PROC SURVEYSELECT would look something like this:
proc surveyselect data=<DATASET> out=<DATASET> <OPTIONS>;
run;
You need to implement a non-parametric bootstrap to estimate
the CV of BMI for patients in your workshop dataset.
You need to choose the PROC SURVEYSELECT options so that the procedure correctly performs the non-parametric bootstrap, and then analyse the results in each sample.
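As a rough illustration of what such a call might look like, here is a minimal sketch. The dataset names (baseline, boot), the toy values, the number of replicates and the seed are all hypothetical choices, not taken from the brief; the options shown are one common way to resample with replacement, not necessarily the intended answer.

```sas
/* Hypothetical toy data standing in for the real baseline dataset */
data baseline;
    input patid bmi;
    datalines;
1 22.4
2 27.9
3 31.2
4 24.6
5 29.3
;
run;

/* Non-parametric bootstrap: resample patients with replacement */
proc surveyselect data=baseline out=boot
    method=urs      /* unrestricted random sampling, i.e. with replacement */
    samprate=1      /* each replicate is the same size as the input data   */
    outhits         /* write one row per selection, not per distinct unit  */
    reps=1000       /* number of bootstrap replicates (assumed value)      */
    seed=20240101;  /* fixed seed so that the output can be reproduced     */
run;
```

The output dataset contains a variable named Replicate that identifies each bootstrap sample, so later steps can process the samples with a BY statement.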
1) Estimate the Coefficient of Variation of the Body Mass Index of the patients in
the baseline dataset. You will need to first derive the BMI for the patients in your baseline dataset;
2) Using SAS, implement the non-parametric bootstrap using PROC
SURVEYSELECT as described above to draw a suitable number of bootstrap
samples from the baseline dataset;
3) Include one or more PROC steps to summarise the results of your
simulations, so that your colleague can investigate the impact of the true value
lying anywhere within a plausible range of values.
4) Provide your SAS code and the full SAS Output to show what was produced.
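The steps above could be sketched end to end roughly as follows. This is only an outline under stated assumptions: the dataset names, the variable names (weight_kg, height_cm), the units, the toy values, the 1000 replicates and the seed are all hypothetical, and a percentile interval is just one possible way to express a plausible range.

```sas
/* Hypothetical baseline data; names, units and values are assumptions */
data baseline;
    input patid weight_kg height_cm;
    datalines;
1 68.0 172
2 85.5 168
3 92.3 181
4 59.8 160
5 77.1 175
6 101.4 179
;
run;

/* Step 1a: derive BMI = weight (kg) / height (m) squared */
data baseline_bmi;
    set baseline;
    bmi = weight_kg / (height_cm / 100)**2;
run;

/* Step 1b: point estimate; PROC MEANS reports the CV as a percentage */
proc means data=baseline_bmi n mean std cv maxdec=3;
    var bmi;
run;

/* Step 2: draw bootstrap samples with replacement, same size as the data */
proc surveyselect data=baseline_bmi out=boot noprint
    method=urs samprate=1 outhits reps=1000 seed=20240101;
run;

/* Step 3a: compute the CV (%) of BMI within each bootstrap replicate */
proc means data=boot noprint;
    by replicate;
    var bmi;
    output out=boot_cv cv=cv_pct;
run;

/* Step 3b: summarise the bootstrap distribution of the CV,         */
/* including the 2.5th and 97.5th percentiles as a plausible range  */
proc univariate data=boot_cv noprint;
    var cv_pct;
    output out=cv_range mean=boot_mean std=boot_se
           pctlpts=2.5 97.5 pctlpre=p_;
run;

/* Print the summary so it appears in the SAS output */
proc print data=cv_range noobs;
run;
```

With PCTLPRE=p_, the two percentiles are written to variables named p_2_5 and p_97_5. Fixing the seed is what makes the output reproducible from the code.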
Length: 1-2 sides of A4, including comments. Each step must have a comment explaining what you have done.
Submit your SAS code as a SAS code file, or paste the code into Word. You
should also save your plain text output OR your HTML output and include that.
Use size 10pt Courier typeface, and single line spacing, as the text
appears in SAS.
Readable code is important, as is my being able to reproduce your output from the code provided.
Must be done by Friday 16:00 GMT! Please only bid if you have access to SAS and can code it. Thanks :)