Unix
Getting started
We made a video to remind people about how to get comfortable with UNIX commands:
This is the easiest book for learning this stuff; it is short and gets right to the point:
https://go.oreilly.com/purdue-university/library/view/-/0596002610
You just log in and you can see it all; we suggest Chapters 1, 3, 4, 5, and 7 (you can basically skip Chapters 2 and 6 the first time through).
It is a very short read (maybe, say, 2 or 3 hours altogether?), just a thin book that gets right to the details.
Zoe Yang asked us about the difference between these 5 words: bash/Linux/terminal/shell/UNIX. Here you go:
UNIX (Unix) and Linux are operating systems, just like Mac OS X and Windows 10 are operating systems. There are many variants. Within Linux, the main difference is the kernel (the main piece of code that makes things work) and sometimes the default configurations, like the GUI (i.e., the way stuff looks when you log in and see your desktop and interact with the windows and folders and files). UNIX dates back to the 1970s and came from AT&T Bell Labs. People then decided to make lots and lots of variants of it, and hence the many flavors of Linux.
The terminal is an application that runs in UNIX or Linux. It is the program you open to type commands and see their output.
It is hard to tell the difference between the terminal and the shell. The shell is the way that you interact with UNIX/Linux directly (without pointing and clicking). You can tell the shell directly what you want to do with the files on the computer, for instance. You might think that the terminal and the shell are the same thing, but they are not quite. There are lots of different types of shells that can run in the terminal. To see which one you are using, you can type:
echo $SHELL
By default, it will say:
/bin/bash
There are other shells in your /bin
directory. bash
(Bourne Again SHell) is the default one. Many people consider this to be the "best" shell, or at least, the one that people know the most. Others are Bourne (sh
), Korn (ksh
), Z shell (zsh
), C shell (csh
), TENEX C shell (tcsh
), and dozens more. Any of these shells would run in the terminal, just like bash
does, and you might not even realize at the start which shell you are using, unless you type the command mentioned above:
echo $SHELL
They each have differences, but some of the differences are small. Again, bash
is still the default on most Linux operating systems. A big recent change is that macOS Catalina started using zsh
instead of bash
as the default shell, but that is just because of a licensing issue. Dr Ward thinks that Mac users who open the terminal and use the shell are very likely to switch from zsh
back to bash
. That's what Dr Ward did immediately when Apple made this change to zsh
, i.e., he switched back to bash
.
Wow, sorry for the long-winded answer.
Standard utilities
man
man
stands for manual and is a command that presents the information you need in order to use a command. To use man
simply execute man <command>
where command is the command for which you want to read the manual.
You can scroll up by typing "k" or the up arrow. You can scroll down by typing "j" or the down arrow. To exit the man pages, type "q" (for quit).
How do I show the man pages for the wc
utility?
Click here for solution
man wc
cat
cat
stands for concatenate and print files. It is an extremely useful tool that prints the entire contents of a file by default. This is especially useful when we want to quickly check to see what is inside of a file. It can be used as a tool to output the contents of a file and immediately pipe the contents to another tool for some sort of analysis if the other tool doesn't natively support reading the contents from the file.
A similar, but alternative UNIX command that incrementally shows the contents of the file is called less
. less
starts at the top of the file and scrolls through the rest of the file as the user pages down.
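As a small sketch of the piping use case for cat: tr is one tool that reads only from standard input, so cat can feed it a file. The file name sample.txt and its contents are made up for this illustration.

```shell
# Hypothetical sample file, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/sample.txt"

# tr reads only from standard input, so cat supplies the file through a pipe.
upper=$(cat "$tmpdir/sample.txt" | tr 'a-z' 'A-Z')
printf '%s\n' "$upper"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

You could also use input redirection (tr 'a-z' 'A-Z' < sample.txt); the cat form simply reads left to right, which some people find easier to follow.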
head
head
is a simple utility that displays the first n lines of a file, or input.
How do I show the first 5 lines of a file called input.txt
?
Click here for solution
head -n5 input.txt
Alternatively:
cat input.txt | head -n5
tail
tail
is a similar utility to head
, that displays the last n lines of a file, or input.
How do I show the last 5 lines of a file called input.txt
?
Click here for solution
tail -n5 input.txt
Alternatively:
cat input.txt | tail -n5
ls
ls
is a utility that lists files and folders. By default, ls
will list the files and folders in your current working directory. To list files in a certain directory, simply provide the directory to ls
as the first argument.
How do I list the files in my $HOME
directory?
Click here for solution
ls $HOME
# or
ls ~
How do I list the files in the directory /home/$USER/projects
?
Click here for solution
ls /home/$USER/projects
How do I list all files and folders in /home/$USER/projects
in a list format, including information like permissions, filesize, etc?
Click here for solution
ls -l /home/$USER/projects
du
du
is a tool used to get file space usage.
Examples
How do I get the size of a file called ./metadata.csv
in bytes?
Click here for solution
du -b ./metadata.csv
How do I get the size of a file called ./metadata.csv
in kilobytes?
Click here for solution
du -k ./metadata.csv
## 1792 ./metadata.csv
Why is the result of du -b ./metadata.csv
divided by 1024 not the result of du -k ./metadata.csv
?
Click here for solution
du
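reports disk usage by default, which is not necessarily the actual size of the file. File systems typically divide a disk into blocks. When a program tells the file system it wants, say, 3 bytes of space and the block size is 1024 bytes, the file system may allocate 1024 bytes to store the 3 bytes of data. With GNU du, the -b option implies --apparent-size, so it reports the actual byte count, while -k on its own reports the allocated disk usage. To see the apparent size, do this: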
du -b ./metadata.csv
du -k --apparent-size ./metadata.csv
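Here is a minimal sketch of that difference, using a made-up 3-byte file (tiny.txt is just an illustrative name). wc -c counts the bytes actually stored in the file, while du -k reports the space the file system set aside for it. The du number depends on your file system, so no particular value is assumed here.

```shell
tmpdir=$(mktemp -d)
printf 'abc' > "$tmpdir/tiny.txt"

# Apparent size: the number of bytes in the file.
bytes=$(( $(wc -c < "$tmpdir/tiny.txt") ))

# Disk usage in kilobytes: whatever the file system allocated for the file.
allocated_kb=$(du -k "$tmpdir/tiny.txt" | cut -f1)

echo "apparent size: $bytes bytes"
echo "disk usage:    $allocated_kb KB"

rm -r "$tmpdir"
```

On many file systems the second number will be larger than you would expect from 3 bytes, which is exactly the block allocation described above.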
cp
cp
is a utility used for copying files and folders from one location to another.
How do I copy /home/$USER/some_file.txt
to /home/$USER/projects/same_file.txt
?
Click here for solution
cp /home/$USER/some_file.txt /home/$USER/projects/same_file.txt
# If currently in /home/$USER
cd $HOME
cp some_file.txt projects/same_file.txt
# If currently in /home/$USER/projects
cd $HOME/projects
cp ../some_file.txt same_file.txt
mv
mv
is very similar to cp
, but rather than copy a file, mv
moves the file. Moving a file removes it from its old location and places it in the new location.
How do I move /home/$USER/some_file.txt
to /home/$USER/projects/same_file.txt
?
Click here for solution
mv /home/$USER/some_file.txt /home/$USER/projects/same_file.txt
# If currently in /home/$USER
cd $HOME
mv some_file.txt projects/same_file.txt
# If currently in /home/$USER/projects
cd $HOME/projects
mv ../some_file.txt same_file.txt
touch
touch
is a command used to update the access and modification times of a file to the current time. More commonly, it is used to create an empty file that you can add contents to later on. To use this command, type touch
followed by the file name (with the intended file path added when necessary).
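A short sketch of both uses, run inside a temporary directory so nothing of yours is touched (new_file.txt is a hypothetical name):

```shell
tmpdir=$(mktemp -d)

# Create an empty file; new_file.txt is a hypothetical name.
touch "$tmpdir/new_file.txt"

# -e tests that the file exists; -s tests that its size is greater than zero.
if [ -e "$tmpdir/new_file.txt" ] && [ ! -s "$tmpdir/new_file.txt" ]; then
    touch_result="exists and is empty"
else
    touch_result="unexpected"
fi
echo "$touch_result"

# Running touch again leaves the contents alone and only refreshes the timestamps.
touch "$tmpdir/new_file.txt"

rm -r "$tmpdir"
```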
mkdir
mkdir
is the command to create a directory. It is simple to use: just type mkdir
followed by a path to the new directory.
Examples
How do I create a new directory called my_directory
in the current directory?
Click here for solution
mkdir my_directory
How do I create a new directory called my_directory
in the parent directory?
Click here for solution
mkdir ../my_directory
How do I create a set of two new nested directories in the current directory?
Click here for solution
# You can either make the directories one at a time like this:
mkdir first_dir
cd first_dir
mkdir second_dir
# Or, you can use the -p option:
mkdir -p first_dir/second_dir
rm
rm
is the command to remove files or directories. You can find the available options by checking its manual page.
Examples
How do I remove a folder called my_folder
and all of its contents recursively? Assume my_folder
is in /home/user/projects
.
Click here for solution
rm -r /home/user/projects/my_folder
How do I remove all files in a folder ending in .txt
? Assume we are looking at files in /home/user/projects
.
Click here for solution
rm /home/user/projects/*.txt
rmdir
rmdir
is a tool to remove empty directories. Simply type rmdir
followed by the path to the empty directory you'd like to remove. Note that this command only removes empty directories. For this reason, rm
is better suited to remove directories with content.
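To see the difference in practice, here is a small sketch using two made-up directories (empty_dir and full_dir) inside a temporary directory:

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/empty_dir" "$tmpdir/full_dir"
touch "$tmpdir/full_dir/notes.txt"

# rmdir succeeds on the empty directory.
rmdir "$tmpdir/empty_dir"

# rmdir refuses to remove a directory that still has content.
if rmdir "$tmpdir/full_dir" 2>/dev/null; then
    rmdir_status="removed"
else
    rmdir_status="refused"
fi

# rm -r removes the directory together with its content.
rm -r "$tmpdir/full_dir"
if [ -d "$tmpdir/full_dir" ]; then
    rm_status="still there"
else
    rm_status="removed"
fi

echo "rmdir on full_dir: $rmdir_status"
echo "rm -r on full_dir: $rm_status"

rmdir "$tmpdir"
```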
pwd
pwd
stands for print working directory and it does just that -- it prints the current working directory to standard output.
type
type
is a useful command to find the location of some command, or whether the command is an alias, function, or something else.
Where is the file that is executed when I type ls
?
Click here for solution
type ls
## ls is /bin/ls
uniq
uniq
reads the lines of a specified input file, compares adjacent lines, and collapses consecutive duplicates into a single line. Repeated lines in the input will not be detected if they are not adjacent. This means you must sort the input before using uniq
if you want to ensure you have no duplicates.
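A quick sketch with made-up input shows why the sort matters. The four lines below contain only two distinct values, but none of the duplicates are adjacent:

```shell
# Four input lines with two distinct values, none of the duplicates adjacent.
input=$(printf 'b\na\nb\na')

# uniq alone only collapses adjacent duplicates, so nothing is removed here.
unsorted_count=$(( $(printf '%s\n' "$input" | uniq | wc -l) ))

# Sorting first puts the duplicates next to each other.
sorted_unique=$(printf '%s\n' "$input" | sort | uniq)
sorted_count=$(( $(printf '%s\n' "$sorted_unique" | wc -l) ))

echo "uniq alone:  $unsorted_count lines"
echo "sort | uniq: $sorted_count lines"
```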
wc
You can think of wc
as standing for "word count". wc
displays the number of lines, words, and bytes from the input file.
How do I count the number of lines of an input file called input.txt
?
Click here for solution
wc -l input.txt
How do I count the number of characters of an input file called input.txt
?
Click here for solution
wc -m input.txt
How do I count the number of words of an input file called input.txt
?
Click here for solution
wc -w input.txt
ssh
mosh
scp
cut
cut
is a tool to cut out parts of a line based on position/character/delimiter/etc and write the output to stdout. It is particularly useful for getting a certain column of data.
How do I get the first column of a csv file called office.csv?
Click here for solution
cut -d, -f1 office.csv
How do I get the first and third column of a csv file called office.csv?
Click here for solution
cut -d, -f1,3 office.csv
How do I get the first and third column of a file with columns separated by the "|" character?
Click here for solution
cut -d '|' -f1,3 office.csv
sed
grep
It is very simple to get started searching for patterns in files using grep
.
How do I search for lines with the word "Exact" in the file located /home/john/report.txt
?
Click here for solution
grep Exact /home/john/report.txt
# or
grep 'Exact' '/home/john/report.txt'
How do I search for lines with the word "Exact" or "exact" in the file located /home/john/report.txt
?
Click here for solution
# The -i option means that the text we are searching for is
# not case-sensitive. So the following lines will match
# lines that contain "Exact" or "exact" or "ExAcT".
grep -i Exact /home/john/report.txt
# or
grep -i 'Exact' '/home/john/report.txt'
How do I search for lines with a string containing multiple words, like "how do I"?
Click here for solution
# The -i option means that the text we are searching for is
# not case-sensitive. So the following line will match
# lines that contain "how do i" or "How do I" or "HOW DO I".
# By adding quotes, we are able to search for the entire
# string "how do i". Without the quotes, grep would search
# for "how" and treat "do" and "i" as file names.
grep -i 'how do i' /home/john/report.txt
How do I search for lines with the word "Exact" or "exact" in the files in the folder and all sub-folders located /home/john/
?
Click here for solution
# The -R option means to search recursively in the folder
# /home/john. A recursive search means that it will search
# all folders and sub-folders starting with /home/john.
grep -Ri Exact /home/john
How do I search for the lines that don't contain the words "Exact" or "exact" in the folder and all sub-folders located /home/john/
?
Click here for solution
# The -v option means to search for an inverted match.
# In this case it means search for all lines of text
# where the word "exact" is not found.
grep -Rvi Exact /home/john
How do I search for lines where one or more of the words "first" or "second" appears in the current folder and all sub-folders?
Click here for solution
# The "|" character in grep is the logical OR operator.
# If we do not escape the "|" character with a preceding
# "\" grep searches for the literal string "first|second"
# instead of "first" OR "second".
grep -Ri 'first\|second' .
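An alternative sketch, assuming GNU grep: the -E option switches to extended regular expressions, where "|" means OR without the backslash. The file notes.txt and its lines are invented for this illustration.

```shell
tmpdir=$(mktemp -d)
printf 'first line\nsecond line\nthird line\n' > "$tmpdir/notes.txt"

# Basic syntax: the "|" must be escaped to act as OR (a GNU grep extension).
basic=$(grep -i 'first\|second' "$tmpdir/notes.txt")

# Extended syntax (-E): "|" is the OR operator as written.
extended=$(grep -Ei 'first|second' "$tmpdir/notes.txt")

printf '%s\n' "$extended"

rm -r "$tmpdir"
```

Both commands select the same two lines; which form you prefer is mostly a matter of taste.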
How do I search for lines that begin with the word "Exact" (case insensitive) in the folder and all sub-folders located in the current directory?
Click here for solution
The "^" is called an anchor and indicates the start of a line.
grep -Ri '^Exact' .
How do I search for lines that end with the word "Exact" (case insensitive) in the files in the current folder and all sub-folders?
Click here for solution
The "$" is called an anchor and indicates the end of a line.
grep -Ri 'Exact$' .
How do I search for lines that contain only the word "Exact" (case insensitive) in the files in the current folder and all sub-folders?
Click here for solution
grep -Ri '^Exact$' .
How do I search for strings or sub-strings where the first character could be anything, but the next two characters are "at"? For example: "cat", "bat", "hat", "rat", "pat", "mat", etc.
Click here for solution
The "." is a wildcard, meaning it matches any character (including spaces).
grep -Ri '.at' .
How do I search for zero or one of, zero or more of, one or more of, exactly n of a certain character using grep and regular expressions?
Click here for solution
"*" stands for 0+ of the previous character. "+" stands for 1+ of the previous character. "?" stands for 0 or 1 of the previous character. "{n}" stands for exactly n of the previous character. In grep's default (basic) syntax, "+", "?", and "{n}" are treated as literal characters unless they are escaped (with GNU grep: "\+", "\?", "\{n\}"), so the examples below that use them add the -E option, which turns on extended regular expressions.
# Matches any lines with text like "cat", "bat", "hat", "rat", "pat", "mat", etc.
# Does NOT match "at", but does match " at". The "." indicates a single character.
grep -Ri '.at' .
# Matches any lines with text like "cat", "bat", "hat", "rat", "pat", "mat", etc.
# Matches "at" as well as " at". The "." followed by the "?" means
# 0 or 1 of any character. The -E option is needed for "?" to work this way.
grep -RiE '.?at' .
# Matches any lines with any amount of text followed by "at".
grep -Ri '.*at' .
# Only matches lines that end in "at": "bat", "cat", "spat", "at". Does not match a line ending in "spatula".
grep -Ri '.*at$' .
# Matches lines that contain two consecutive "e"'s. The -E option is needed for "{2}".
grep -RiE '.*e{2}.*' .
# Matches any line. 0+ of the previous character, which in this case is the wildcard "."
# that represents any character. So 0+ of any character.
grep -Ri '.*' .
Resources
https://regex101.com/ is an excellent tool that helps you quickly test and better understand writing regular expressions. It allows you to test four different "flavors" of regular expressions: PCRE (PHP), ECMAScript (JavaScript), Python, and Golang. regex101 also provides a library of useful, pre-made regular expressions.
This is an excellent resource to better understand positive and negative lookahead and lookbehind operations using grep
.
An excellent quick reference for regular expressions. Examples using grep
in R.
ripgrep
ripgrep
is a "line-oriented search tool that recursively searches your current directory for a regex pattern." You can read about why you may want to use ripgrep
here. ripgrep
is frequently faster than grep
. If you are working with code, it has sane defaults (for example, it respects .gitignore), and you can easily search for specific types of files.
How do I exclude a filetype when searching for foo
in my_directory
?
Click here for solution
# exclude javascript (.js) files
rg -Tjs foo my_directory
# exclude r (.r) files
rg -Tr foo my_directory
# exclude Python (.py) files
rg -Tpy foo my_directory
How do I search for a particular filetype when searching for foo
in my_directory
?
Click here for solution
# search javascript (.js) files
rg -tjs foo my_directory
# search r (.r) files
rg -tr foo my_directory
# search Python (.py) files
rg -tpy foo my_directory
How do I search for a specific word, where the word isn't part of another word?
Click here for solution
# this is roughly equivalent to putting \b before and after all search patterns in grep
rg -w foo my_directory
How do I replace every match foo
in my_directory
with the text given, bar
, when printing results?
Click here for solution
rg foo my_directory -r bar
How do I trim whitespace from the beginning and ending of each printed line?
Click here for solution
rg foo my_directory --trim
How do I follow symbolic links when searching a directory, my_directory
?
Click here for solution
rg -L foo my_directory
find
find
is an aptly named tool that traverses directories and searches for files.
Examples
How do I find a file named foo.txt
in the current working directory or subdirectories?
Click here for solution
find . -name foo.txt
How do I find a file named foo.txt
or Foo.txt
or FoO.txt
(i.e. ignoring case) in the current working directory or subdirectories?
Click here for solution
find . -iname foo.txt
How do I find a directory named foo
in the current working directory or subdirectories?
Click here for solution
find . -type d -name foo
How do I find all of the Python files in the current working directory or subdirectories?
Click here for solution
find . -name "*.py"
How do I find files over 1gb in size in the current working directory or subdirectories?
Click here for solution
find . -size +1G
How do I find files under 10mb in size in the current working directory or subdirectories?
Click here for solution
find . -size -10M
less
less
is a utility that opens a page of text from a file and allows the user to scroll forward or backward in the file using "j" and "k" keys or down and up arrows. less
does not read the entire file into memory at once, and is therefore faster when loading large files.
How do I display the contents of a file, foo.txt
?
Click here for solution
less foo.txt
How do I scroll up and down in less
?
Click here for solution
To scroll down use "j" or the down arrow. To scroll up use "k" or the up arrow.
How do I exit less
?
Click here for solution
Press the "q" key on your keyboard.
sort
sort
is a utility that sorts lines of text.
Examples
How do I sort a csv, flights_sample.csv
alphabetically by the 18th column?
Click here for solution
# -t, sets the field separator to a comma; -k18,18 sorts on the 18th field only (ascending by default)
sort -t, -k18,18 flights_sample.csv
## 1990,10,18,7,729,730,847,849,PS,1451,NA,78,79,NA,-2,-1,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,19,1,749,730,922,849,PS,1451,NA,93,79,NA,33,19,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,21,3,728,730,848,849,PS,1451,NA,80,79,NA,-1,-2,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,22,4,728,730,852,849,PS,1451,NA,84,79,NA,3,-2,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,23,5,731,730,902,849,PS,1451,NA,91,79,NA,13,1,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,24,6,744,730,908,849,PS,1451,NA,84,79,NA,19,14,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
## 1987,10,14,3,741,730,912,849,PS,1451,NA,91,79,NA,23,11,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1990,10,15,4,729,730,903,849,PS,1451,NA,94,79,NA,14,-1,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1990,10,17,6,741,730,918,849,PS,1451,NA,97,79,NA,29,11,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
How do I sort a csv, flights_sample.csv
alphabetically by the 18th column, and then in descending order by the 4th column?
Click here for solution
sort -t, -k18,18 -k4,4r flights_sample.csv
## 1990,10,18,7,729,730,847,849,PS,1451,NA,78,79,NA,-2,-1,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,24,6,744,730,908,849,PS,1451,NA,84,79,NA,19,14,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,23,5,731,730,902,849,PS,1451,NA,91,79,NA,13,1,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,22,4,728,730,852,849,PS,1451,NA,84,79,NA,3,-2,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,21,3,728,730,848,849,PS,1451,NA,80,79,NA,-1,-2,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1991,10,19,1,749,730,922,849,PS,1451,NA,93,79,NA,33,19,SAN,ABC,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
## 1990,10,17,6,741,730,918,849,PS,1451,NA,97,79,NA,29,11,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1990,10,15,4,729,730,903,849,PS,1451,NA,94,79,NA,14,-1,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
## 1987,10,14,3,741,730,912,849,PS,1451,NA,91,79,NA,23,11,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
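Notice that in both outputs above the header row (Year,Month,...) is sorted like any other line and lands in the middle. One possible way to keep the header on top is to print it first and sort only the remaining lines. The sketch below uses a small made-up file, sample.csv, rather than flights_sample.csv, and sets LC_ALL=C so the ordering does not depend on your locale.

```shell
tmpdir=$(mktemp -d)
csv="$tmpdir/sample.csv"
printf 'Year,Dest\n1991,SFO\n1990,ABC\n' > "$csv"

# Print the header line as-is, then sort everything after it by the 2nd column.
sorted=$( { head -n 1 "$csv"; tail -n +2 "$csv" | LC_ALL=C sort -t, -k2,2; } )
printf '%s\n' "$sorted"

rm -r "$tmpdir"
```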
git
See here.
awk
awk
is a powerful programming language that specializes in processing and manipulating text data.
In awk, a command looks something like this:
awk -F, 'BEGIN{ } { } END{ }'
The delimiter is specified with the -F
option (in this case our delimiter is a comma). The BEGIN chunk is run only once at the start of execution. The middle chunk is run once per line of the file. The END chunk is run only once, at the end of execution.
The BEGIN and END portions are always optional.
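To make the three chunks concrete, here is a small sketch on made-up input (three comma-separated lines fed in with printf, so no real file is assumed):

```shell
# BEGIN runs once before any input, the middle chunk runs once per line,
# and END runs once after the last line.
result=$(printf 'a,1\nb,2\nc,3\n' | awk -F, '
    BEGIN { print "start" }
          { total += $2 }
    END   { print "total: " total }')
printf '%s\n' "$result"
```

This prints "start" once, then "total: 6" after all three lines have been read.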
The variables: $1
, $2
, $3
, etc., refer to the 1st, 2nd, and 3rd fields in a line of data. For example, the following would print the 4th field of every row in a csv file:
awk -F, '{print $4}'
$0
represents the entire row.
awk
is very powerful. We can achieve the same effect as using cut
:
head 5000_products.csv | cut -d, -f3
# or
head 5000_products.csv | awk -F, '{print $3}'
Built in variables
awk
has some special built in variables that can be very useful. See here.
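As a quick sketch, two of the most commonly used built-in variables are NR (the number of the current record, i.e., line) and NF (the number of fields on the current line). The input here is made up.

```shell
# Print the line number (NR) and the field count (NF) for each input line.
fields=$(printf 'a,b,c\nd,e\n' | awk -F, '{ print NR, NF }')
printf '%s\n' "$fields"
```

The first line has three fields and the second has two, so this prints "1 3" and then "2 2".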
Examples
How do I print only rows where the DAYOFWEEK
is 5
?
Click here for solution
head metadata.csv | awk -F, '{if ($3 == 5) {print $0}}'
## 01/01/2015,,5,0,0,1,2015,CHRISTMAS PEAK,0,5,nyd,1,,,,0,0,CHRISTMAS PEAK,73.02,59.81,66.41,,0,,0,,0,,0,,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,17:42,1,1,0,0,18,19,17,0,0,0,0,0,0,0,1,13,17,15,0,0,0,0,1,0,14,16,14,0,1,0,0,0,0,11,15,12,8:00,25:00,17,7:00,25:00,8:00,26:00,18,8:00,25:00,17,8:00,21:00,13,8:00,21:00,8:00,25:00,17,8:00,21:00,13,8:00,22:00,14,8:00,22:00,8:00,24:00,16,8:00,22:00,14,8:00,19:00,11,8:00,19:00,8:00,22:00,14,8:00,20:00,12,1,1,0,0,NONE,53.375714286,70.3,50.2,0.12,616246,367265,296273,236654,53904354,34718635,26907827,20971646,1600,1000,2,12:00,15:30,Disney Festival of Fantasy Parade,1,22:15,,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,3,18:30,20:00,Fantasmic!,1,0,,,,,0,,,
## 01/08/2015,,5,7,1,1,2015,CHRISTMAS,8,0,,0,,marwk,,0,1,CHRISTMAS,59.44,38.7,49.07,,0,,0,,0,,0,,88%,94%,99%,78%,97%,83%,69%,94%,100%,100%,100%,76%,100%,100%,93%,100%,100%,100%,100%,100%,100%,63%,93%,17:47,1,0,0,0,13,12,12,0,0,0,0,0,0,0,1,12,12,14,0,0,0,0,0,0,10,10,10,0,1,0,0,0,0,8,9,9,9:00,21:00,12,8:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,19:00,10,9:00,19:00,9:00,19:00,10,9:00,19:00,10,9:00,17:00,8,9:00,17:00,9:00,17:00,8,9:00,18:00,9,1,1,0,0,NONE,48.372142857,70.3,49.4,0.08,615046,367265,296273,236654,53894754,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,2,19:00,21:00,Main Street Electrical Parade,1,20:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,1,19:00,,Fantasmic!,1,0,,,,,0,,,
How do I print the first, fourth, and fifth columns of rows where the DAYOFWEEK
is 5
?
Click here for solution
head metadata.csv | awk -F, '{if ($3 == 5) {print $1, $4, $5}}'
## 01/01/2015 0 0
## 01/08/2015 7 1
How do I print only rows where DAYOFWEEK
is 5
OR YEAR
is 2015
?
Click here for solution
head metadata.csv | awk -F, '{if ($3 == 5 || $7 == 2015) {print $0}}'
## 01/01/2015,,5,0,0,1,2015,CHRISTMAS PEAK,0,5,nyd,1,,,,0,0,CHRISTMAS PEAK,73.02,59.81,66.41,,0,,0,,0,,0,,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,17:42,1,1,0,0,18,19,17,0,0,0,0,0,0,0,1,13,17,15,0,0,0,0,1,0,14,16,14,0,1,0,0,0,0,11,15,12,8:00,25:00,17,7:00,25:00,8:00,26:00,18,8:00,25:00,17,8:00,21:00,13,8:00,21:00,8:00,25:00,17,8:00,21:00,13,8:00,22:00,14,8:00,22:00,8:00,24:00,16,8:00,22:00,14,8:00,19:00,11,8:00,19:00,8:00,22:00,14,8:00,20:00,12,1,1,0,0,NONE,53.375714286,70.3,50.2,0.12,616246,367265,296273,236654,53904354,34718635,26907827,20971646,1600,1000,2,12:00,15:30,Disney Festival of Fantasy Parade,1,22:15,,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,3,18:30,20:00,Fantasmic!,1,0,,,,,0,,,
## 01/02/2015,,6,1,0,1,2015,CHRISTMAS,2,5,,0,,,,0,0,CHRISTMAS,78,60.72,69.36,,0,,0,,0,,0,,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,17:43,0,1,0,0,17,18,16,0,0,0,0,0,1,0,0,15,13,12,0,0,1,0,0,0,14,14,14,0,0,0,0,0,0,12,11,11,8:00,25:00,17,8:00,25:00,8:00,25:00,17,9:00,25:00,16,8:00,21:00,13,8:00,23:00,8:00,21:00,13,9:00,21:00,12,8:00,22:00,14,8:00,22:00,8:00,22:00,14,9:00,22:00,13,8:00,20:00,12,8:00,20:00,8:00,19:00,11,8:00,19:00,11,1,1,0,0,NONE,53.750714286,70.3,50,0.12,616246,367265,296273,236654,53904354,34718635,26907827,20971646,1600,1000,2,12:00,15:30,Disney Festival of Fantasy Parade,1,22:15,,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,3,18:30,20:00,Fantasmic!,1,0,,,,,0,,,
## 01/03/2015,,7,2,0,1,2015,CHRISTMAS,3,0,,0,,,,0,0,CHRISTMAS,83.12,67.31,75.22,,0,,0,,0,,0,,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,17:44,0,0,0,0,16,17,15,0,0,0,0,0,0,1,0,12,15,12,1,0,0,0,0,0,14,14,11,0,0,1,0,0,0,11,12,12,9:00,25:00,16,9:00,25:00,8:00,25:00,17,9:00,24:00,15,9:00,21:00,12,9:00,21:00,8:00,21:00,13,9:00,21:00,12,9:00,22:00,13,8:00,22:00,8:00,22:00,14,9:00,20:00,11,8:00,19:00,11,8:00,19:00,8:00,20:00,12,9:00,20:00,11,1,1,0,0,NONE,49.212857143,70.3,49.9,0.07,616246,367265,296273,236654,53904354,34718635,26907827,20971646,1600,1000,2,12:00,15:30,Disney Festival of Fantasy Parade,1,22:15,,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,2,18:30,20:00,Fantasmic!,1,0,,,,,0,,,
## 01/04/2015,,1,3,1,1,2015,CHRISTMAS,4,0,,0,,,,0,0,CHRISTMAS,83.93,67.97,75.95,,0,,0,,0,,0,,67%,74%,77%,74%,74%,70%,66%,94%,68%,57%,56%,70%,79%,43%,93%,100%,100%,100%,100%,100%,48%,63%,84%,17:44,0,0,0,0,15,16,14,0,0,0,0,0,0,0,0,12,12,12,0,1,0,0,0,1,11,14,13,1,0,0,0,0,0,12,11,8,9:00,24:00,15,9:00,24:00,9:00,25:00,16,9:00,23:00,14,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,20:00,11,9:00,20:00,9:00,22:00,13,9:00,20:00,11,9:00,20:00,11,8:00,20:00,8:00,19:00,11,9:00,17:00,8,1,1,0,0,NONE,48.270714286,70.3,49.8,0.12,616246,367265,296273,236654,53904354,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,2,20:00,22:00,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,2,19:00,20:30,Fantasmic!,1,0,,,,,0,,,
## 01/05/2015,,2,4,1,1,2015,CHRISTMAS,5,0,,0,,,,0,0,CHRISTMAS,72.3,56.89,64.6,,0,,0,,0,,0,,67%,74%,77%,74%,74%,70%,66%,94%,68%,57%,56%,70%,79%,43%,93%,100%,100%,100%,100%,100%,48%,63%,84%,17:45,0,0,0,0,14,15,12,0,0,0,0,1,0,0,0,12,12,13,0,0,0,1,0,0,13,11,10,0,1,0,0,0,0,8,12,8,9:00,23:00,14,9:00,23:00,9:00,24:00,15,9:00,21:00,12,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,20:00,11,9:00,22:00,9:00,20:00,11,9:00,19:00,10,9:00,17:00,8,9:00,17:00,9:00,20:00,11,9:00,17:00,8,1,1,0,0,NONE,48.971538462,70.3,49.6,0.12,616246,367265,306272,236654,53904354,34718635,27897728,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,2,20:00,22:00,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,2,19:00,20:30,Fantasmic!,1,0,,,,,0,,,
## 01/06/2015,,3,5,1,1,2015,CHRISTMAS,6,0,,0,,,,0,0,CHRISTMAS,77.67,54.88,66.28,,0,,0,,0,,0,,86%,92%,98%,77%,96%,82%,69%,94%,100%,98%,98%,76%,100%,96%,93%,100%,100%,83%,100%,100%,92%,63%,93%,17:46,0,0,0,0,12,14,12,0,0,1,0,0,0,0,0,13,12,12,0,0,0,0,1,0,10,13,10,0,0,1,0,0,0,8,8,9,9:00,21:00,12,9:00,21:00,9:00,23:00,14,9:00,21:00,12,9:00,21:00,12,8:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,19:00,10,9:00,19:00,9:00,20:00,11,9:00,19:00,10,9:00,17:00,8,9:00,17:00,9:00,17:00,8,9:00,17:00,8,1,1,0,0,NONE,50.093571429,70.2,49.5,0.12,615046,367265,296273,236654,53894754,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,0,,,,1,20:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,1,19:00,,Fantasmic!,1,0,,,,,0,,,
## 01/07/2015,,4,6,1,1,2015,CHRISTMAS,7,0,,0,,marwk,,0,1,CHRISTMAS,67.24,48.56,57.9,,0,,0,,0,,0,,88%,94%,99%,78%,97%,83%,69%,94%,100%,100%,100%,76%,100%,100%,93%,100%,100%,100%,100%,100%,100%,63%,93%,17:47,0,0,1,0,12,12,13,0,0,0,1,0,0,0,0,12,13,12,0,0,0,0,0,0,10,10,10,1,0,0,0,0,0,9,8,8,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,19:00,10,9:00,19:00,9:00,19:00,10,9:00,19:00,10,9:00,17:00,8,8:00,17:00,9:00,17:00,8,9:00,17:00,8,1,1,0,0,NONE,47.188571429,70.3,49.5,0.12,615046,367265,296273,236654,53894754,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,0,,,,1,20:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,1,19:00,,Fantasmic!,1,0,,,,,0,,,
## 01/08/2015,,5,7,1,1,2015,CHRISTMAS,8,0,,0,,marwk,,0,1,CHRISTMAS,59.44,38.7,49.07,,0,,0,,0,,0,,88%,94%,99%,78%,97%,83%,69%,94%,100%,100%,100%,76%,100%,100%,93%,100%,100%,100%,100%,100%,100%,63%,93%,17:47,1,0,0,0,13,12,12,0,0,0,0,0,0,0,1,12,12,14,0,0,0,0,0,0,10,10,10,0,1,0,0,0,0,8,9,9,9:00,21:00,12,8:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,19:00,10,9:00,19:00,9:00,19:00,10,9:00,19:00,10,9:00,17:00,8,9:00,17:00,9:00,17:00,8,9:00,18:00,9,1,1,0,0,NONE,48.372142857,70.3,49.4,0.08,615046,367265,296273,236654,53894754,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,2,19:00,21:00,Main Street Electrical Parade,1,20:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,1,19:00,,Fantasmic!,1,0,,,,,0,,,
## 01/09/2015,,6,8,1,1,2015,CHRISTMAS,9,0,,0,,marwk,,0,1,CHRISTMAS,54.89,45.37,50.13,,0,,0,,0,,0,,88%,94%,99%,78%,97%,83%,69%,94%,100%,100%,100%,76%,100%,100%,93%,100%,100%,100%,100%,100%,100%,63%,93%,17:48,0,1,0,0,12,13,14,0,1,0,0,0,1,0,0,14,12,12,0,0,1,0,0,0,10,10,12,0,0,0,0,0,0,9,8,11,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,23:00,14,9:00,21:00,12,9:00,23:00,9:00,21:00,12,9:00,21:00,12,9:00,19:00,10,9:00,19:00,9:00,19:00,10,9:00,20:00,11,9:00,18:00,9,9:00,18:00,9:00,17:00,8,9:00,20:00,11,1,1,0,0,NONE,51.094285714,70.3,49.3,0.11,615046,367265,296273,236654,53894754,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,1,19:00,,Main Street Electrical Parade,1,20:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,1,19:00,,Fantasmic!,1,0,,,,,0,,,
How do I print only rows where DAYOFWEEK
is 5
AND YEAR
is 2015
?
Click here for solution
head metadata.csv | awk -F, '{if ($3 == 5 && $7 == 2015) {print $0}}'
## 01/01/2015,,5,0,0,1,2015,CHRISTMAS PEAK,0,5,nyd,1,,,,0,0,CHRISTMAS PEAK,73.02,59.81,66.41,,0,,0,,0,,0,,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,0%,17:42,1,1,0,0,18,19,17,0,0,0,0,0,0,0,1,13,17,15,0,0,0,0,1,0,14,16,14,0,1,0,0,0,0,11,15,12,8:00,25:00,17,7:00,25:00,8:00,26:00,18,8:00,25:00,17,8:00,21:00,13,8:00,21:00,8:00,25:00,17,8:00,21:00,13,8:00,22:00,14,8:00,22:00,8:00,24:00,16,8:00,22:00,14,8:00,19:00,11,8:00,19:00,8:00,22:00,14,8:00,20:00,12,1,1,0,0,NONE,53.375714286,70.3,50.2,0.12,616246,367265,296273,236654,53904354,34718635,26907827,20971646,1600,1000,2,12:00,15:30,Disney Festival of Fantasy Parade,1,22:15,,Main Street Electrical Parade,1,21:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,3,18:30,20:00,Fantasmic!,1,0,,,,,0,,,
## 01/08/2015,,5,7,1,1,2015,CHRISTMAS,8,0,,0,,marwk,,0,1,CHRISTMAS,59.44,38.7,49.07,,0,,0,,0,,0,,88%,94%,99%,78%,97%,83%,69%,94%,100%,100%,100%,76%,100%,100%,93%,100%,100%,100%,100%,100%,100%,63%,93%,17:47,1,0,0,0,13,12,12,0,0,0,0,0,0,0,1,12,12,14,0,0,0,0,0,0,10,10,10,0,1,0,0,0,0,8,9,9,9:00,21:00,12,8:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,21:00,12,9:00,21:00,9:00,21:00,12,9:00,21:00,12,9:00,19:00,10,9:00,19:00,9:00,19:00,10,9:00,19:00,10,9:00,17:00,8,9:00,17:00,9:00,17:00,8,9:00,18:00,9,1,1,0,0,NONE,48.372142857,70.3,49.4,0.08,615046,367265,296273,236654,53894754,34718635,26907827,20971646,1600,1000,1,15:00,,Disney Festival of Fantasy Parade,2,19:00,21:00,Main Street Electrical Parade,1,20:00,,Wishes Nighttime Spectacular,1,21:00,,IllumiNations: Reflections of Earth,0,,,0,,,,1,19:00,,Fantasmic!,1,0,,,,,0,,,
How do I get the average of values in a column containing the max temperature, WDWMAXTEMP
?
Click here for solution
# NR is the number of records (lines) awk has read; in END it is the total, including the header line
head metadata.csv | awk -F, '{sum = sum + $19}END{print "Average max temp: " sum/NR}'
# Or alternatively we could track the number of rows as we go
head metadata.csv | awk -F, '{sum = sum + $19; count++}END{print "Average max temp: " sum/count}'
## Average max temp: 64.961
## Average max temp: 64.961
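Note that both commands above count the header line as a row, and its non-numeric value (the text WDWMAXTEMP) contributes 0 to the sum. A minimal sketch of a header-aware version follows. It uses a small made-up two-column sample rather than metadata.csv, so the sample values and the column position ($2) are assumptions for illustration only:

```shell
# Hypothetical sample data: a header line plus three data rows.
sample='DATE,WDWMAXTEMP
01/01/2015,70
01/02/2015,60
01/03/2015,50'

# NR > 1 skips the header line; count tracks only the rows that were summed.
avg=$(printf '%s\n' "$sample" | awk -F, 'NR > 1 {sum += $2; count++} END {if (count > 0) print sum / count}')
echo "Average max temp: $avg"
```

With the real file, the same pattern would use $19 instead of $2.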
How do I get counts of each unique value in a column, SEASON
?
Click here for solution
When executing the first block of the command below ({seasons[$8]++}), awk builds an associative array called seasons, whose elements are indexed by the unique values in the 8th column, SEASON. For each line, awk adds 1 to the element indexed by that line's SEASON value (this is what ++ does). Thus, we get the count for each unique value.
In the END block, we print the contents of seasons by looping over its elements. The season in for (season in seasons) refers to the index (name) of each element. To access the stored count, we use seasons[season].
Note that the header line is counted as well, which is why SEASON appears with a count of 1 in the output.
This is just one example of arrays in awk. You can find more details here: https://www.gnu.org/software/gawk/manual/html_node/Arrays.html
cat metadata.csv | awk -F, '{seasons[$8]++}END{for (season in seasons) {print season, seasons[season]}}'
## SUMMER BREAK 236
## CHRISTMAS 245
## JERSEY WEEK 50
## SEPTEMBER LOW 140
## PRESIDENTS WEEK 55
## FALL 212
## HALLOWEEN 26
## MEMORIAL DAY 20
## CHRISTMAS PEAK 176
## SEASON 1
## COLUMBUS DAY 20
## SPRING 490
## THANKSGIVING 60
## EASTER 95
## MARTIN LUTHER KING JUNIOR DAY 45
## MARDI GRAS 15
## JULY 4TH 25
## WINTER 222
How do I get counts of each unique value in a column, SEASON
, but only print the values for FALL
, WINTER
, SUMMER
, and SPRING
?
Click here for solution
cat metadata.csv | awk -F, '{seasons[$8]++}END{for (season in seasons) {if (season == "FALL" || season == "SUMMER" || season == "WINTER" || season == "SPRING") print season, seasons[season]}}'
## FALL 212
## SPRING 490
## WINTER 222
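A more compact solution uses the ~
(regular expression match) operator. Note that the pattern matches anywhere in the value, so SUMMER BREAK is also printed: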
cat metadata.csv | awk -F, '{seasons[$8]++}END{for (season in seasons) {if (season ~ /WINTER|SPRING|SUMMER|FALL/) print season, seasons[season]}}'
## SUMMER BREAK 236
## FALL 212
## SPRING 490
## WINTER 222
If you want to exclude "SUMMER BREAK", use the $
regular expression anchor. SUMMER$ only matches values that end in "SUMMER", so "SUMMER BREAK", which ends in " BREAK", is excluded:
cat metadata.csv | awk -F, '{seasons[$8]++}END{for (season in seasons) {if (season ~ /WINTER|SPRING|SUMMER$|FALL/) print season, seasons[season]}}'
## FALL 212
## SPRING 490
## WINTER 222
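The pattern above anchors only SUMMER; the other three names still match anywhere in the value. A sketch that requires an exact match by anchoring the whole alternation with ^ and $ is shown below. The sample values are made up for illustration, and the final sort is only there to make the output order predictable:

```shell
# Hypothetical sample of SEASON values, one per line.
seasons_sample='SUMMER BREAK
FALL
WINTER
FALL
SPRING
CHRISTMAS'

# ^( ... )$ requires the entire value to be one of the four names.
exact=$(printf '%s\n' "$seasons_sample" \
  | awk '{seasons[$0]++} END {for (s in seasons) if (s ~ /^(WINTER|SPRING|SUMMER|FALL)$/) print s, seasons[s]}' \
  | sort)
echo "$exact"
```

Here SUMMER BREAK and CHRISTMAS are dropped because neither is exactly one of the four names.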
~ & . & ..
~
represents the location stored in the environment variable $HOME
. If you change $HOME
, ~
also changes. As you are navigating directories, to jump back to the previously visited directory, you can run cd ~-
. For example, if you navigate to /home/$USER/projects/project1/output
, then to /home/$USER
, and you'd like to jump directly back to /home/$USER/projects/project1/output
, simply run cd ~-
. ~-
expands to the location stored in $OLDPWD
.
.
represents the current working directory. For example, if you are in your home directory /home/$USER
, .
means "in this directory", and ./some_file.txt
would represent a file named some_file.txt
which is in your home directory /home/$USER
.
..
represents the parent directory. For example, /home
is the parent directory of /home/$USER
. If you are currently in /home/$USER/projects
and you want to access some file in the home directory, you could do ../some_file.txt
. ../some_file.txt
is called a relative path as it is relative to your current location. If we accessed ../some_file.txt
from the home directory, this would be different than accessing ../some_file.txt
from a different directory. /home/$USER/some_file.txt
is an absolute or full path of a file some_file.txt
.
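A short sketch of these shortcuts in action, assuming bash (where ~- is available). It works inside a temporary directory created with mktemp, so the directory names are stand-ins and nothing in your real home directory is touched:

```shell
# Build a throwaway directory tree to navigate.
base=$(mktemp -d)
mkdir -p "$base/projects/project1/output"

cd "$base/projects/project1/output"
cd "$base"

# ~- expands to $OLDPWD, the directory we were in before the last cd.
cd ~-
back=$(pwd)

# .. is the parent directory, so this moves up to .../projects/project1.
cd ..
parent=$(pwd)

# ./ is the current directory; both paths below name the same file.
touch ./some_file.txt
ls "$parent/some_file.txt"
```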
Examples
If I am in the directory /home/kamstut/projects
, what is the relative path to /home/mdw/
?
Click here for solution
../../mdw
If I am in the directory /home/kamstut/projects/project1
, what is the absolute path to the file ../../scripts/runthis.sh
?
Click here for solution
/home/kamstut/scripts/runthis.sh
How can I navigate to my $HOME
directory?
Click here for solution
cd
cd ~
cd $HOME
cd /home/$USER
Piping & Redirection
Redirection is the act of changing where a command reads its standard input (stdin) from, or where it writes its standard output (stdout) or standard error (stderr). stdin, stdout, and stderr are identified by the file descriptor numbers 0, 1, and 2, respectively.
Piping is a form of redirection, but rather than connecting a command to a file, we connect the stdout of one command to the stdin of another command for further processing.
Redirection
Examples
For the following examples we use the example file redirection.txt
. The contents of which are:
cat redirection.txt
## This is a simple file with some text.
## It has a couple of lines of text.
## Here is some more.
How do I redirect text from a command like ls
to a file like redirection.txt
, completely overwriting any text already within redirection.txt
?
Click here for solution
# Save the stdout from the ls command to redirection.txt
ls > redirection.txt
# The new contents of redirection.txt
head redirection.txt
## 01-scholar.Rmd
## 02-data-formats.Rmd
## 03-unix.Rmd
## 04-sql.Rmd
## 05-r.Rmd
## 06-python.Rmd
## 07-tools.Rmd
## 08-faqs.Rmd
## 09-projects.Rmd
## 10-fall-2020-projects.Rmd
How do I redirect text from a command like ls
to a file like redirection.txt
, without overwriting any text, but rather appending the text to the end of the file?
Click here for solution
# Append the stdout from the ls command to the end of redirection.txt
ls >> redirection.txt
head redirection.txt
## This is a simple file with some text.
## It has a couple of lines of text.
## Here is some more.
## 01-scholar.Rmd
## 02-data-formats.Rmd
## 03-unix.Rmd
## 04-sql.Rmd
## 05-r.Rmd
## 06-python.Rmd
## 07-tools.Rmd
How can I redirect text from a file to be used as stdin for another program or command?
Click here for solution
# Let's count the number of words in redirection.txt
wc -w < redirection.txt
## 20
How can I use multiple redirects in a single line?
Click here for solution
# Here we count the number of words in redirection.txt and then
# save that value to value.txt.
wc -w < redirection.txt > value.txt
head value.txt
## 20
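The examples above only redirect stdin and stdout. A minimal sketch of using the numeric descriptors explicitly is shown below, where 2> redirects stderr and 2>&1 sends stderr to wherever stdout currently points. The file names (exists.txt, out.txt, err.txt, both.txt) and the missing file are made up for illustration:

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch exists.txt

# ls prints the existing name to stdout (1) and an error about the missing
# name to stderr (2). ls exits with a non-zero status here, which is expected.
ls exists.txt no_such_file.txt > out.txt 2> err.txt || true

# Order matters: first point stdout at both.txt, then point stderr at the same place.
ls exists.txt no_such_file.txt > both.txt 2>&1 || true

cat out.txt
```

After running this, out.txt holds only the normal output, err.txt holds only the error message, and both.txt holds both.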
Piping
Piping is the act of taking the output of one or more commands and making the output the input of another command. This is accomplished using the "|" character.
Examples
For the following examples we use the example file piping.txt
. The contents of which are:
cat piping.txt
## apples, oranges, grapes
## pears, apples, peaches,
## celery, carrots, peanuts
## fruits, vegetables, ok
How can I use the output from a grep
command as the input to another command?
Click here for solution
grep -i "p\{2\}" piping.txt | wc -w
## 6
How can I chain multiple commands together?
Click here for solution
# Get the third column of piping.txt,
# keep the lines that end in "s", sort
# them in reverse order, and write the
# result to a file called food.txt (overwriting it).
cut -d, -f3 piping.txt | grep -i ".*s$" | sort -r > food.txt
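To see what each stage of a pipeline like this contributes, here is a sketch that recreates piping.txt in a temporary directory (the temporary location is an assumption) and prints the final result. The grep pattern is simplified to "s$", which selects the same lines for this data:

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'apples, oranges, grapes' 'pears, apples, peaches,' \
  'celery, carrots, peanuts' 'fruits, vegetables, ok' > piping.txt

# Stage 1: cut keeps the third comma-separated field of each line.
# Stage 2: grep keeps only the fields that end in "s".
# Stage 3: sort -r orders what is left in reverse alphabetical order.
cut -d, -f3 piping.txt | grep "s$" | sort -r > food.txt
cat food.txt
```

Each remaining field keeps its leading space, because the space after each comma is part of the field.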
Resources
A quick introduction to stdin, stdout, stderr, redirection, and piping.
Cron
Cron is a Unix application used to schedule commands or tasks to run at a specific time or at a specific time interval. For example, let's say you have a program called generate_report.py
that reads some data from the system and generates a report to email to your superiors. Cron would be perfectly suited to do this once a month, without you needing to do a single manual task. To do so, do the following:
Open the crontab. The crontab is the text document containing your cron jobs.
# -e stands for "edit"
crontab -e
This command will open a text editor for you to write your crontab. Then, on a single line, paste the following content:
0 0 1 * * /full/path/to/generate_report.py
Once you save the file, the cron job will take effect. This cron job would run at midnight, on the first day of every month. The rough format of a cron job is:
minute hour day (of month) month day (of week) command
So, the first 0 represents minute 0. The second 0, hour 0. The 1, the first day of the month. The first *, every month. The second *, any day of the week.
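One way to check your reading of a cron line is to split it into its fields. The sketch below does that with bash's read builtin. The variable names are chosen for illustration and are not part of cron itself:

```shell
line='0 0 1 * * /full/path/to/generate_report.py'

# read splits on whitespace; the last variable receives the rest of the line (the command).
read -r minute hour day_of_month month day_of_week command <<< "$line"

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "command:      $command"
```

Quoting the variables when printing them keeps the * characters from being expanded into file names.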
If you are uncomfortable using the text editor on Scholar (nano/vim/emacs), there is an alternative way to modify your crontab.
Create a text file in RStudio by clicking File > New File > Text File
. To the first line, paste the following content:
0 0 1 * * /full/path/to/generate_report.py
Important note: You must include the newline following the line of text.
Save the file to your $HOME
directory as my_cron.txt
. Once you've saved your file, you should be able to see it in the bottom right hand corner of RStudio (you may need to click the refresh button to make it appear).
Once complete, open a terminal by clicking Code > Terminal > Open New Terminal at File Location
. If this option isn't present, it is likely you already have a terminal tab open in RStudio. Navigate to the terminal. To update your crontab to the contents of your text file, my_cron.txt
, type the following (into the terminal):
crontab $HOME/my_cron.txt
Important note: If you get an error that says "premature EOF", you forgot to add a newline (empty line) to the end of your my_cron.txt
.
If the command runs without error, your crontab has been successfully installed! You can check by running the following command in the terminal:
crontab -l
Examples
Write a cron job that runs generate_report.py
every minute.
Click here for solution
* * * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every 5 minutes.
Click here for solution
*/5 * * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every 10 minutes.
Click here for solution
*/10 * * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
on the 5th minute of every hour.
Click here for solution
5 * * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every hour.
Click here for solution
0 * * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every other hour.
Click here for solution
0 */2 * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every other minute of every other hour.
Click here for solution
*/2 */2 * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every day at 5 AM.
Click here for solution
0 5 * * * /full/path/to/generate_report.py
Write a cron job that runs generate_report.py
every day at 2:22 PM.
Click here for solution
22 14 * * * /full/path/to/generate_report.py
How do I remove a cron job when I no longer want it to run?
Click here for solution
First, open the crontab:
crontab -e
Then, delete the line containing the cron job you no longer wish to run. Save the file. Upon saving the file, the cron job you deleted will no longer run.
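If you would rather not open an editor, the same removal can be sketched as a text filter. The example below runs the filter on made-up crontab contents (other_job.sh is hypothetical) so that it does not touch your real crontab. The commented-out last line shows how it might be applied to the live crontab, assuming your crontab implementation accepts "-" to read the new contents from stdin:

```shell
# Hypothetical crontab contents, standing in for the output of `crontab -l`.
current='0 0 1 * * /full/path/to/generate_report.py
22 14 * * * /full/path/to/other_job.sh'

# grep -v drops every line that mentions the job we want to remove.
remaining=$(printf '%s\n' "$current" | grep -v 'generate_report.py')
echo "$remaining"

# Possible use on a real system (assumption: "crontab -" reads from stdin):
# crontab -l | grep -v 'generate_report.py' | crontab -
```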
Emacs
Nano
Vim
Writing scripts
bash stands for "Bourne Again Shell". There are many types of shells, including but not limited to: ksh, zsh, csh, tcsh, fish. When you open a terminal emulator, it will typically run a shell. You can write a bash script, zsh script, csh script, etc. Typically, when you have an interpreter, you can write scripts for it. For example, even though R and Python are not shells, we can write scripts for those languages. As bash is the default shell for many Linux operating systems today, we will keep referring to scripts as "bash scripts", but take note that in general the same applies for other shells too.
A bash script is more or less a series of bash commands used to perform a sequence of actions. It is similar to a .R
script, but instead of R code, we have bash commands.
A bash script starts with the "shebang" line (also called the "bang" or "hash-bang" line): #!/bin/bash
. The shebang is used to indicate which interpreter to use to execute the script. For example, if you were using zsh instead, your shebang might read #!/bin/zsh
.
Take the following bash script:
#!/bin/bash
echo "First argument: $1"
echo "Second argument: $2"
If you were to place that text inside of a file called my_script
:
echo '#!/bin/bash
echo "First argument: $1"
echo "Second argument: $2"' > $HOME/my_script
And then run it:
cd $HOME
chmod +x ./my_script
./my_script okay cool
The second line of code (chmod +x) sets the permission so that your script is executable. You would get the following result:
First argument: okay
Second argument: cool
The operating system would use the interpreter located at /bin/bash
to execute the script. This would produce the same results:
cd $HOME
/bin/bash my_script okay cool
But instead we only have to run:
cd $HOME
./my_script okay cool
Note that if you were to change the shebang to say #!/usr/bin/python
and try running the following:
cd $HOME
./my_script okay cool
You would get an error that reads:
File "./my_script", line 3
echo "First argument: $1"
^
SyntaxError: invalid syntax
The reason is that the operating system is using the Python interpreter located at /usr/bin/python
to run the bash code in our script, my_script
. Since our code is not Python code, we get this error.
Arguments
A bash script can accept arguments. This is just like many programs we've used to date (grep
, cut
, awk
, etc.). For example:
grep -i 'special'
Here, -i
and 'special'
are arguments to grep
. -i
is the first argument, and 'special'
is the second. If you run the following script:
#!/bin/bash
echo "First argument: $1"
echo "Second argument: $2"
You can see that this is indeed the case:
cd $HOME
./my_script -i 'special'
First argument: -i
Second argument: special
In a bash script, the first argument is denoted by $1
, the second by $2
, the third by $3
, etc. In fact, $0
denotes the command used to run the script:
#!/bin/bash
echo "Command: $0"
echo "First argument: $1"
echo "Second argument: $2"
cd $HOME
./my_script okay cool
Command: ./my_script
First argument: okay
Second argument: cool
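Beyond the numbered variables, bash also provides $# (the number of arguments) and "$@" (all of the arguments). These are not used in the examples above. The sketch below wraps them in a function named show_args, a made-up name, because $1, $2, $#, and "$@" behave the same way inside a function as they do in a script:

```shell
show_args() {
  echo "Number of arguments: $#"
  # "$@" expands to each argument as a separate word, preserving spaces.
  for arg in "$@"; do
    echo "Argument: $arg"
  done
}

args_output=$(show_args okay "very cool")
echo "$args_output"
```

Because "very cool" is quoted, it is passed as a single argument even though it contains a space.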
Examples
Write a script called indyflights.sh
that takes a file from this directory as its input: /class/datamine/data/flights/subset
and returns the number of flights that have IND
as the origin or destination.
Click here for solution
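#!/bin/bash
# $1 is the name of a file inside the subset directory (for example 1990.csv).
# Fields 17 and 18 are the origin and destination columns.
cut -d, -f17,18 "/class/datamine/data/flights/subset/$1" | grep IND | wc -l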
Modify your script from this problem to accept an argument containing an airport code (for example IND
). Your script should determine how many flights have origin or destination IND
(or your given airport code) altogether (across all years in all of the flights files).
Click here for solution
#!/bin/bash
sum=0
for i in {1987..2008}; do
    # Count the lines of this year's file that mention the airport code given in $1
    count=$(cut -d, -f17,18 "/class/datamine/data/flights/subset/$i.csv" | grep "$1" | wc -l)
    sum=$((sum + count))
done
echo "$sum"
or
Note: This option would work better if you need to use variable substitution in your range (from 1987 to 2008).
#!/bin/bash
sum=0
for ((i=1987; i<=2008; i++)); do
    # Count the lines of this year's file that mention the airport code given in $1
    count=$(cut -d, -f17,18 "/class/datamine/data/flights/subset/$i.csv" | grep "$1" | wc -l)
    sum=$((sum + count))
done
echo "$sum"
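A sketch of the difference the note above describes, assuming bash and using made-up variable names start and end. In bash, brace expansion is performed before variable expansion, so a range written with variables is not turned into a sequence. To stay self-contained, the loops below only count their iterations instead of reading the flights files:

```shell
start=1987
end=2008

# Brace expansion happens before $start and $end are substituted,
# so this loop runs once with the literal text {1987..2008}.
brace_iterations=0
for i in {$start..$end}; do
  brace_iterations=$((brace_iterations + 1))
done

# The C-style loop evaluates the variables arithmetically and visits every year.
cstyle_iterations=0
for ((i = start; i <= end; i++)); do
  cstyle_iterations=$((cstyle_iterations + 1))
done

echo "brace: $brace_iterations, C-style: $cstyle_iterations"
```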