Biostar Beginner Study Notes (1): Some basic but useful code

The commands below were learned from the Software Carpentry shell lesson at the following link.

http://swcarpentry.github.io/shell-novice/

cd ../..
cd ~ # go to the user's home directory, e.g. /home/user_name/
cd / # go to the root directory "/"

">" redirect the command's output to a file
Be careful when using ">": it overwrites the file with the new contents, and the previous contents are lost.

touch length.txt
wc -lwc *.pdb > length.txt
sort -n length.txt > sorted.length.txt
head -n 6 sorted.length.txt

Difference between ">" and ">>"

">" redirect the command's output to a file
">>" append the command's output to a file

echo hello > testfile01.txt
echo hello > testfile01.txt
echo hello >> testfile02.txt
echo hello >> testfile02.txt
cat testfile01.txt
cat testfile02.txt
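
To make the difference concrete, the sketch below repeats the same two experiments and counts the lines in each file. The scratch directory from `mktemp -d` and the variable names are assumptions added for illustration; they are not part of the original lesson.

```shell
# Work in a throwaway directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)

# ">" truncates the file on every write, so only the last "hello" survives.
echo hello > "$workdir/testfile01.txt"
echo hello > "$workdir/testfile01.txt"

# ">>" appends, so both writes survive.
echo hello >> "$workdir/testfile02.txt"
echo hello >> "$workdir/testfile02.txt"

# Reading from stdin makes wc print only the number, without the file name.
overwrite_lines=$(wc -l < "$workdir/testfile01.txt")
append_lines=$(wc -l < "$workdir/testfile02.txt")

echo "testfile01.txt lines: $overwrite_lines"
echo "testfile02.txt lines: $append_lines"
```

testfile01.txt should end up with one line and testfile02.txt with two.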

"|" between two commands is called a pipe.
It uses the output of the command on the left as the input to the command on the right.

wc -l *.pdb | sort -n
wc -l *.pdb | sort # alphabetical order
wc -l *.pdb | sort -n | head -n 5
wc -l *.pdb | sort -n | tail -n 5
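
As a rough check that a pipe does the same job as an intermediate file, the following sketch runs both versions on three small files and compares the results. The file names (one.pdb, two.pdb, three.pdb, lengths.txt) and the scratch directory are made up for illustration.

```shell
# Illustrative setup: three hypothetical .pdb files with 1, 2 and 3 lines.
workdir=$(mktemp -d)
printf 'a\n' > "$workdir/one.pdb"
printf 'a\nb\n' > "$workdir/two.pdb"
printf 'a\nb\nc\n' > "$workdir/three.pdb"

# Without a pipe: save the counts in a file, then sort that file.
via_file=$(cd "$workdir" && wc -l *.pdb > lengths.txt && sort -n lengths.txt | head -n 1)

# With pipes: the counts flow straight from wc into sort and then into head.
via_pipe=$(cd "$workdir" && wc -l *.pdb | sort -n | head -n 1)

echo "via file: $via_file"
echo "via pipe: $via_pipe"
```

Both versions should report one.pdb as the shortest file; the piped version simply avoids creating lengths.txt.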

*[AB].txt

"*" matches any number of characters (including none); the expression "[AB]" matches either an ‘A’ or a ‘B’. So *[AB].txt matches every file name that ends in A.txt or B.txt.
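
A small sketch of how the shell expands this pattern, assuming a scratch directory with four hypothetical file names created only for the demonstration:

```shell
# Illustrative setup: hypothetical file names in a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/NENE01729A.txt" "$workdir/NENE01729B.txt" \
      "$workdir/NENE01729Z.txt" "$workdir/notes.md"

# The shell expands the pattern before echo runs.
# Only names ending in A.txt or B.txt are matched.
matched=$(cd "$workdir" && echo *[AB].txt)
echo "$matched"
# prints: NENE01729A.txt NENE01729B.txt
```

NENE01729Z.txt is skipped because "Z" is not in the bracket set, and notes.md is skipped because it does not end in .txt.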

"uniq" only removes adjacent duplicates

The file data-shell/data/salmon.txt contains:

cat salmon.txt
# output:
coho
coho
steelhead
coho
steelhead
steelhead

uniq salmon.txt
# output:
coho
steelhead
coho
steelhead

sort salmon.txt | uniq
# output:
coho
steelhead

# show the total count of each kind of animal in the file
cut -d, -f 2 animals.txt | sort | uniq -c
# output:
1 bear
2 deer
1 fox
3 rabbit
1 raccoon

Appending data

# ">" creates animalsUpd.txt if it does not exist, so no "touch" is needed first
head -n 3 animals.txt > animalsUpd.txt
tail -n 2 animals.txt >> animalsUpd.txt

  • The best way to use the shell is to use pipes to combine simple single-purpose programs (filters).

Loops

for filename in *.dat
do
    # quote the variable so file names containing spaces still work
    echo "$filename"
    head -n 100 "$filename" | tail -n 20
done

Copy multiple files under new names

$ for filename in *.dat
> do
>     cp "$filename" "original-$filename"
> done

# use "echo" for debugging: print each command instead of running it
$ for filename in *.dat
> do
>     echo cp "$filename" "original-$filename"
> done
# output:
cp basilisk.dat original-basilisk.dat
cp unicorn.dat original-unicorn.dat

# list the data files whose names end in A.txt or B.txt
$ cd north-pacific-gyre/2012-07-03
$ for datafile in NENE*[AB].txt
> do
>     echo "$datafile"
> done

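History Commands

$ history
# lists the commands you have run, each with a number

Ctrl-R
# searches backwards through the history as you type

!$
# expands to the last word of the previous command

less !$
# opens the last argument of the previous command in less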

Nested loops

for species in cubane ethane methane
do
    for temperature in 25 30 37 40
    do
        mkdir $species-$temperature
    done
done

Inside a shell script, $1 means “the first filename (or other argument) on the command line”, and $2 means the second filename or argument.
("$@" is equivalent to "$1" "$2" …)
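
The quotes around "$@" matter once an argument contains a space. The sketch below, meant for bash, forwards the same two arguments with and without quotes and counts what arrives. The function names and the file names are hypothetical and exist only for this demonstration; positional parameters behave the same way inside a function as inside a script.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

# Hypothetical wrappers that pass their arguments on in two different ways.
forward_quoted() {
    count_args "$@"   # each original argument stays a single argument
}
forward_unquoted() {
    count_args $@     # arguments are split again on whitespace
}

quoted_count=$(forward_quoted "red fox.csv" "deer.csv")
unquoted_count=$(forward_unquoted "red fox.csv" "deer.csv")

echo "quoted: $quoted_count, unquoted: $unquoted_count"
# prints: quoted: 2, unquoted: 3
```

Without the quotes, "red fox.csv" is split into "red" and "fox.csv", so a script looping over $@ would look for two files that do not exist.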
Create middle.sh and enter the following code:

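# Select lines from the middle of a file
# Usage: bash middle.sh filename end_line num_lines

# Version 1: fixed range, prints lines 11 to 15 of the file given as $1
head -n 15 "$1" | tail -n 5

# Version 2: prints the last num_lines lines of the first end_line lines
head -n "$2" "$1" | tail -n "$3"

# run version 1
$ bash middle.sh octane.pdb
# run version 2 (prints lines 6 to 10)
$ bash middle.sh octane.pdb 10 5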
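
To check which lines the head/tail combination really selects, here is a minimal sketch that applies the same pipeline to a file whose line N contains the number N. The function name `middle`, the file numbers.txt and the scratch directory are assumptions made for the example.

```shell
# Illustrative setup: a 30-line file where each line holds its own line number.
workdir=$(mktemp -d)
seq 1 30 > "$workdir/numbers.txt"

# Same pipeline as in middle.sh, wrapped in a hypothetical function:
# $1 = file name, $2 = end line, $3 = number of lines to keep.
middle() {
    head -n "$2" "$1" | tail -n "$3"
}

# Take the first 20 lines, then keep the last 10 of them,
# and join the result onto a single line for easy reading.
selected=$(middle "$workdir/numbers.txt" 20 10 | paste -sd' ' -)
echo "$selected"
# prints: 11 12 13 14 15 16 17 18 19 20
```

In general the selected range runs from line end_line - num_lines + 1 to line end_line.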

Write a shell script called species.sh that takes any number of filenames as command-line arguments, and uses cut, sort, and uniq to print a list of the unique species appearing in each of those files separately.

# Script to find unique species in csv files where species is the second data field
# This script accepts any number of file names as command line arguments

# Loop over all files
for file in "$@"
do
    echo "Unique species in $file:"
    # Extract species names
    cut -d , -f 2 "$file" | sort | uniq
done

find

find .
# find all directories
find . -type d
# find all files
find . -type f
# find files by name (quote the name so the shell does not expand wildcards in it)
find . -name filename.extension
find . -name 'filename.extension'

command1 $(command2)

When the shell executes this command, it first runs whatever is inside the $(). It then replaces the $() expression with that command's output. This is called command substitution.

$ wc -l $(find . -name '*.txt')
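
A minimal sketch of the same command on a known set of files, so the result can be checked by hand. The directory layout and file names below are invented for the example.

```shell
# Illustrative setup: two .txt files (one in a subdirectory) and one other file.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
printf 'one\ntwo\n' > "$workdir/a.txt"
printf 'three\n' > "$workdir/sub/b.txt"
printf 'ignored\n' > "$workdir/c.dat"

# find runs first; its output (the two .txt paths) becomes the argument list of wc.
# tail keeps only the summary line that wc prints after the per-file counts.
total_line=$(cd "$workdir" && wc -l $(find . -name '*.txt') | tail -n 1)
echo "$total_line"
```

The summary line should report 3 lines in total, since c.dat is not matched. Because the output of $() is split on whitespace, this form is best kept for file names without spaces.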
