4 Week 2 Exercises
4.1 In-class exercises
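The scripts in this section each take a file path as their first command-line argument. As a rough sketch of that pattern, the snippet below uses a hypothetical function, describe_input (an illustration only, not part of the course scripts), to show how Bash exposes the first argument as $1. A script file receives its arguments the same way.

```shell
#!/bin/bash
# describe_input is a hypothetical stand-in for a script such as sam_count.sh;
# it is an illustration only, not code from the course repository.
describe_input() {
  # $1 holds the first argument; quoting it keeps paths with spaces intact
  local input_file="$1"
  echo "Input file: ${input_file}"
}

# Call it the way a script would be called: name first, then the file path
describe_input ./data/my_file.bam
```

Whatever path follows the name on the command line is what the script body sees as $1.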
- We can run the script on a file (assuming we are in bash_for_bio/):
./scripts/week2/sam_count.sh ./data/my_file.bam
Try out the above script on ./data/CALU1_combined_final.sam. Make sure you are in the top folder (bash_for_bio) when you do it.
- Run run_bwa.sh using:
./scripts/week2/run_bwa.sh ./data/CALU1_combined_final.fasta
- Run process_data.R:
module load fhR
Rscript process_data.R input_file="LUSC_clinical.csv"
module purge
- Run process_data.py:
module load fhPython
python3 process_file.py lusc_file.csv
module purge
4.2 Homework Exercises
All exercises are required for the badge. Where possible, please paste your code in the grey boxes and the output below it. (If the output is long, just the first few lines are fine.) Make sure to answer the questions.
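When a command prints many lines, piping it through head keeps only the first few for pasting. A minimal sketch, where seq is just a stand-in for any command with long output (it is not one of the course scripts):

```shell
#!/bin/bash
# seq 1 100 prints 100 lines; it stands in for any long-running output
# head -n 3 keeps only the first three lines of whatever is piped into it
first_lines=$(seq 1 100 | head -n 3)

# Show the trimmed output that would be pasted into an answer
echo "$first_lines"
```

Changing the number after -n controls how many lines are kept.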
- Copy the below script into a file called samtools_count.sh. What does the script do?
#!/bin/bash
module load SAMtools
samtools view -c $1
How would we modify the above script to redirect the output to ${1}.counts.txt?
- Make sure you can get scripts/week2/run_bwa.sh to run on rhino. Run it on data/MOLM13_combined_final.fastq.
When it’s successful, run head on the MOLM13_combined_final.fastq output, and paste your command and the output below.
- Modify scripts/week2/run_bwa.sh to:
  - Take an additional argument, a folder path
  - Save the SAM file to this folder path
Hint 1: I recommend that you copy run_bwa.sh into a new script and work from there.
Put your code into the codeblock below:
Write an example command that runs your new version of the script, then run it:
For question 4, pick one language to answer.
4R. (R) Modify the below R script and save it to a file called scripts/week2/r_csv_script.R. It should also take an argument called $FILEPATH for read.csv():
library(tidyverse)
csv_file <- read_csv("myfile.csv")
summary(csv_file)
How would you run this on the command line?
module load fhR
-----
module purge
How would you redirect the output of your script to a file?
4Py. (py) Modify the below Python script to be runnable and save it to scripts/week2/py_csv_script.py. Your new version should also take the first positional argument (a file path) and process the file:
import pandas as pd
csv_file = pd.read_csv("my_file.csv")
csv_file.describe()
How would you run this on the command line?
module load fhPython
-----
module purge
How would you redirect the output of your script to a file?