Unix / Linux
Unix and Linux underpin most servers and cloud instances. TCS ILP tests this in FA: expect 5 MCQs, and it is also the minor subject in Round 2 (15-20 marks).
FA Round 2: Minor subject — 15-20 marks
Hot topics: chmod numeric, grep flags, pipes, redirection, permission math, shell scripts
What is Unix/Linux?
Why should you care? Most servers you'll work with at TCS run Linux. Most cloud instances (AWS, Azure, GCP) run Linux. Every Android phone runs a Linux kernel. When you deploy your Java web app at work, it will most likely go to a Linux server. You won't be clicking icons; you'll be typing commands in a terminal. That's Unix.
Think of it like this: Windows is a car with automatic transmission — you click buttons, drag things around, the OS handles the details. Unix/Linux is a manual transmission — you tell it exactly what to do, command by command. More effort to learn, but total control once you know it.
Unix was built at Bell Labs in 1969. Linux is a free, open-source, Unix-like system built around the kernel Linus Torvalds released in 1991. For exam purposes, treat them as the same thing.
- Everything is a file. Your text documents, your keyboard, your network connection: Unix treats them ALL as files. This is why commands like cat work on almost anything.
- Case-sensitive. Always. File.txt and file.txt are two completely different files. Forget this and you'll waste hours debugging.
- Small tools, chained together. Instead of one giant program that does everything, Unix gives you tiny tools (grep, sort, cut, awk) that you connect with pipes. Like LEGO blocks.
In Unix, Report.txt and report.txt are:
Report.txt and report.txt are two completely separate files. This is different from Windows, where they'd be the same file.
File System Hierarchy
Why does this matter? In Windows, you dump everything into C:\Users or wherever you want. Unix is organized like a well-planned office building — every department has its floor, and if you need something, you know exactly which floor to go to.
The entire Unix file system is one giant tree starting from / (root). There are no drive letters like C: or D:. Everything lives under /.
Think of it as an apartment building:
| Directory | Purpose | Apartment Building Analogy |
|---|---|---|
| / | Root: top of the entire tree | The building itself: everything is inside it |
| /home | User home directories | Residential floors: each tenant has their own flat |
| /etc | System configuration files | Building manager's office: all the rules and settings |
| /bin | Essential user binaries (ls, cp, mv) | The toolbox: basic tools everyone needs |
| /tmp | Temporary files, cleared on reboot | The lobby whiteboard: wiped clean every morning |
| /var | Variable data: logs, mail | The mailroom: always changing, logs of who came/went |
| /root | Home directory of root (admin) user | The building owner's penthouse: separate from tenants |
A system administrator wants to check the configuration of a service. Which directory should they look in?
/etc stores all system configuration files. Think "Every Thing Configured." If you need to change how a service behaves, its config file is in /etc.
Essential Commands — Quick Reference
Why learn these? In Unix, there's no "right-click > New Folder" or "drag file to trash." Everything happens through commands. Think of each command as a verb — ls = "look", cd = "go to", cp = "copy", rm = "delete". Once you learn 20 verbs, you can do anything.
Don't memorize this table. Read through it once, then use the practice questions to test what stuck. Come back to this table as a reference.
| Command | What it does | Key flags |
|---|---|---|
| ls | List directory | -l long, -a hidden, -la both, -R recursive |
| pwd | Print current directory | (none) |
| cd | Change directory | cd ~ home, cd .. parent, cd / root |
| mkdir | Make directory | -p creates parents too |
| rmdir | Remove empty directory | Only works if empty |
| rm | Remove files/dirs | -r recursive, -f force |
| cp | Copy | -r for directories |
| mv | Move or rename | mv old.txt new.txt renames |
| touch | Create empty file | (none) |
| cat | Display file contents | (none) |
| head | First N lines | -n 5 (default 10) |
| tail | Last N lines | -n 5 (default 10) |
| echo | Print text | echo $HOME |
| wc | Count words/lines/chars | -w words, -l lines, -c bytes |
| sort | Sort lines | -r reverse, -n numeric, -k 2 by field |
| cut | Extract columns | -d ':' delimiter, -f 1 field |
| grep | Search patterns | -i ignore case, -r recursive, -n line nums, -c count, -v invert |
| find | Search for files | -name, -type f/d, -size +1M |
| sed | Stream editor | sed 's/old/new/g' file |
| awk | Column processing | awk '{print $1}' file |
| ps | Show processes | ps aux all processes |
| kill | Signal process | kill -9 PID force kill |
| chmod | Change permissions | chmod 755 file |
| chown | Change owner | chown user:group file |
| tar | Archive files | -cvzf create, -xvzf extract |
| gzip | Compress | gunzip to decompress |
| du | Disk usage | -h human-readable |
| df | Disk free space | -h human-readable |
File Permissions
Why do permissions exist? Imagine you work in an office building. Not everyone should have access to every room. The intern shouldn't be able to edit the CEO's files. The accountant shouldn't be able to delete the codebase. Unix permissions are like security badges — they control who can open which doors and what they can do once inside.
Every single file and folder in Unix has a permission badge attached to it. This badge answers three questions for three groups of people:
- Owner (u) — the person who created the file. Like the author of a document.
- Group (g) — a team of users. Like "the marketing department" — everyone in the group gets the same access.
- Others (o) — everyone else on the system. Strangers.
For each group, Unix tracks three things (WHAT they can do):
| Permission | Symbol | On a File | On a Directory |
|---|---|---|---|
| Read | r | Can see the contents | Can list files inside (ls) |
| Write | w | Can modify/edit | Can create/delete files inside |
| Execute | x | Can run it as a program | Can enter the directory (cd) |
Reading Permission Strings
When you run ls -l, you see something like this:
# ls -l output:
-rwxr-xr-- 1 darshan staff 1234 Mar 18 script.sh
↑↑↑↑↑↑↑↑↑
│└──┴──┴── owner(rwx) group(r-x) others(r--)
└ file type: - = file, d = directory
Read it like a sentence: "This is a file (-). The owner (darshan) can read, write, and execute it (rwx). The group (staff) can read and execute but NOT write (r-x). Everyone else can only read (r--)."
Numeric (Octal) Permissions — The Math
Why numbers? Typing chmod 755 is faster than typing chmod u=rwx,g=rx,o=rx. Each permission has a number value, and you just add them up per group.
| Permission | Symbol | Value | Think of it as... |
|---|---|---|---|
| Read | r | 4 | 4 is the biggest — reading is the most basic access |
| Write | w | 2 | 2 is in the middle |
| Execute | x | 1 | 1 is the smallest |
| No permission | - | 0 | Zero = nothing |
How to calculate: Add the values for each group separately. Three groups = three digits.
Worked example, 754:
7 = 4+2+1 = rwx (owner can do everything)
5 = 4+0+1 = r-x (group can read and execute, not write)
4 = 4+0+0 = r-- (others can only read)
Result: 754 = rwxr-xr--
chmod 755 file.sh # rwxr-xr-x → owner:7(rwx), group:5(r-x), others:5(r-x)
chmod 644 file.txt # rw-r--r-- → owner:6(rw-), group:4(r--), others:4(r--)
chmod 777 file # rwxrwxrwx → everyone full access (DANGEROUS!)
chmod 400 secret # r-------- → owner read-only, nobody else can touch it
chmod 777 = everyone can read, write, and execute. It's like leaving all doors unlocked with a "please come in and take whatever you want" sign. You'll see it in exams, but in real servers, this is a security disaster.
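The add-per-group arithmetic can be checked mechanically. Below is a minimal sketch in POSIX shell; `perm_to_octal` is a hypothetical helper name (not a standard command), and it assumes a nine-character string such as rwxr-xr-- with the leading file-type character already removed.

```shell
# perm_to_octal: hypothetical helper that converts a 9-character
# permission string (e.g. rwxr-xr--) into its octal form (e.g. 754).
perm_to_octal() {
    perms=$1
    octal=""
    while [ -n "$perms" ]; do
        # Take the next group of three characters (owner, group, others).
        triplet=$(printf '%s' "$perms" | cut -c1-3)
        perms=$(printf '%s' "$perms" | cut -c4-)
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read    = 4
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write   = 2
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute = 1
        octal="$octal$digit"
    done
    printf '%s\n' "$octal"
}

perm_to_octal rwxr-xr--   # prints 754
perm_to_octal rw-r--r--   # prints 644
```

When starting from ls -l output, drop the first character (the file type) before passing the string in.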
Symbolic Permissions — The English Way
Instead of math, you can use letters. The format is: WHO (+/- WHAT)
chmod u+x file # u(owner) +(add) x(execute) → "give owner execute permission"
chmod g-w file # g(group) -(remove) w(write) → "take write away from group"
chmod o+r file # o(others) +(add) r(read) → "let others read"
chmod a+x file # a(all) +(add) x(execute) → "everyone can execute"
755 = rwxr-xr-x (typical script), 644 = rw-r--r-- (typical file), 777 = rwxrwxrwx (everything open). The math is always: r=4, w=2, x=1. Add per group.
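To see numeric and symbolic modes working together, here is a small sketch that can be run safely on a throwaway file. It assumes mktemp is available (it is on common Linux and macOS systems); the variable names are only illustrative.

```shell
# Work on a temporary file so nothing important is touched.
tmpfile=$(mktemp)

chmod 644 "$tmpfile"                            # numeric: rw-r--r--
mode_before=$(ls -l "$tmpfile" | cut -c1-10)    # first 10 chars = type + permissions

chmod u+x "$tmpfile"                            # symbolic: add execute for the owner only
mode_after=$(ls -l "$tmpfile" | cut -c1-10)

echo "$mode_before"   # -rw-r--r--
echo "$mode_after"    # -rwxr--r--

rm -f "$tmpfile"
```

Adding x (value 1) to the owner digit turns 644 into 744; the group and others digits are untouched.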
A file has permissions -rw-r-----. What is the octal value?
You want to make a shell script executable by the owner, but keep it read-only for everyone else. Which command?
grep — Pattern Searching
Why does grep exist? Imagine you have a 10,000-line server log and something crashed at 3 AM. Are you going to read all 10,000 lines? No. You want to search for "error" or "failed" and see only those lines. That's grep — it's Ctrl+F for the terminal, but way more powerful.
The name grep stands for "Global Regular Expression Print" — but just think of it as "search."
grep "error" log.txt # find lines containing "error"
grep -i "error" log.txt # case-insensitive (catches Error, ERROR, error)
grep -r "TODO" ./src/ # search ALL files inside src/ folder
grep -n "main" app.py # show which LINE NUMBER each match is on
grep -c "error" log.txt # just tell me HOW MANY lines matched
grep -v "error" log.txt # show lines that DON'T contain "error"
Your app's log file has 5000 lines. You want to know how many times "NullPointerException" appeared, ignoring case:
grep -ic "nullpointerexception" server.log
Flags combine: -i (ignore case) + -c (count) = case-insensitive count.
| Flag | Meaning | Memory hook |
|---|---|---|
| -i | Case-insensitive | i = Ignore case |
| -r | Recursive (search folders) | r = Recurse into directories |
| -n | Line numbers | n = Numbers |
| -c | Count matches | c = Count |
| -v | Invert match | v = inVert (show NON-matching lines) |
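The flags are easiest to remember after trying them on a tiny file. The sketch below builds a four-line sample log (made-up content, not from any real system) and runs the main flags against it.

```shell
# Build a small sample log in a temporary file.
log=$(mktemp)
printf '%s\n' \
    'INFO service started' \
    'ERROR disk full' \
    'error: retrying' \
    'INFO done' > "$log"

exact_count=$(grep -c "error" "$log")      # only the lowercase line matches
nocase_count=$(grep -ic "error" "$log")    # -i also catches ERROR
numbered=$(grep -in "error" "$log")        # prefix each match with its line number
without_errors=$(grep -vi "error" "$log")  # everything that is NOT an error line

echo "$exact_count"      # 1
echo "$nocase_count"     # 2
echo "$numbered"         # 2:ERROR disk full / 3:error: retrying
echo "$without_errors"   # INFO service started / INFO done

rm -f "$log"
```

Note that -c counts matching lines, not total occurrences: a line containing the pattern twice still counts once.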
You want to find all lines in access.log that do NOT contain "200 OK". Which command?
-v inverts the match: it shows every line that does NOT contain the pattern. This is useful for filtering out noise and finding errors (anything that isn't "200 OK" is probably a problem).
Pipes and Redirection
Why do pipes exist? Remember the Unix philosophy — small tools chained together? Pipes are the chains. Each command does one thing well, and the pipe (|) connects them like LEGO bricks snapping together.
Think of it like a factory assembly line. Worker 1 cuts the wood, passes it to Worker 2 who sands it, who passes it to Worker 3 who paints it. Each worker is a command. The conveyor belt between them is the pipe.
Pipes (|) — The Assembly Line
The pipe takes whatever a command prints to the screen and feeds it as input to the next command. The data flows left to right.
ls | wc -l # ls lists entries → wc counts the lines = number of entries (ls -l would add an extra "total" line)
cat file.txt | grep "error" # cat shows file → grep filters only "error" lines
ps aux | grep "nginx" # ps shows all processes → grep finds "nginx"
cat names.txt | sort | uniq # show → sort alphabetically → remove duplicates
cat access.log | grep "404" | wc -l
Read it like: "Show the log file, then keep only lines with '404', then count how many." Result: number of 404 errors.
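Here is the same idea as a runnable sketch. The five-line access log is invented sample data, and the trailing tr -d ' ' only strips the padding that some wc implementations add.

```shell
# Invented sample data: request method, path, status code.
access_log=$(mktemp)
printf '%s\n' \
    'GET /index.html 200' \
    'GET /missing 404' \
    'GET /about 200' \
    'GET /old-page 404' \
    'GET /index.html 200' > "$access_log"

# Filter, then count: how many requests ended in 404?
not_found=$(grep " 404" "$access_log" | wc -l | tr -d ' ')

# Extract column 2, sort it, drop duplicates, count: how many distinct paths?
unique_paths=$(cut -d ' ' -f 2 "$access_log" | sort | uniq | wc -l | tr -d ' ')

echo "$not_found"      # 2
echo "$unique_paths"   # 4

rm -f "$access_log"
```

sort has to come before uniq because uniq only collapses duplicate lines that are next to each other.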
Redirection — Saving Output to Files
Why? By default, commands print output to your screen (called "stdout"). But what if you want to save it? Redirection lets you reroute the output into a file instead of the screen. Like redirecting a river into a canal.
| Operator | What it does | Analogy | Example |
|---|---|---|---|
| > | Write to file (overwrite) | Erase the whiteboard and write new text | ls > files.txt |
| >> | Write to file (append) | Add new text at the bottom of the whiteboard | echo "line" >> log.txt |
| < | Read input from file | Read from a script instead of typing live | sort < names.txt |
| 2> | Redirect error messages | Send complaints to a separate box | cmd 2> errors.txt |
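The four operators can be tried in a scratch directory. This is a minimal sketch with made-up file names; the `|| true` after ls is only there so the deliberate failure does not stop a script that runs with set -e.

```shell
workdir=$(mktemp -d)
notes="$workdir/notes.txt"

echo "first"  > "$notes"     # creates the file
echo "second" > "$notes"     # overwrites: "first" is gone
after_overwrite=$(cat "$notes")

echo "third" >> "$notes"     # appends below "second"
after_append=$(cat "$notes")

sorted=$(sort < "$notes")    # sort reads its input from the file

# stdout goes to out.txt, the error message goes to err.txt
ls "$workdir/no-such-file" > "$workdir/out.txt" 2> "$workdir/err.txt" || true
if [ -s "$workdir/err.txt" ]; then err_captured="yes"; else err_captured="no"; fi
if [ -s "$workdir/out.txt" ]; then out_captured="yes"; else out_captured="no"; fi

echo "$after_overwrite"   # second
echo "$after_append"      # second / third
echo "$err_captured"      # yes
echo "$out_captured"      # no

rm -rf "$workdir"
```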
> DESTROYS whatever was in the file and writes fresh. >> ADDS to the end without touching existing content. If you use > on an important file by mistake, the old data is gone forever. This distinction is asked in almost every Unix exam.
What does cat file1.txt file2.txt | sort | uniq > result.txt do?
cat concatenates both files into one stream. sort alphabetizes all lines. uniq removes consecutive duplicate lines. > saves the final result to result.txt (overwriting it if it existed). This is a classic Unix pipeline: three small tools and one redirect doing one big job.
You run echo "hello" > greet.txt three times. What does greet.txt contain?
> overwrites every time. Each run erases the file and writes "hello" fresh. After three runs, you still just have one "hello". If you wanted three lines, you'd use >> (append).
Process Management
Why learn this? When your Java app hangs on a server, you can't just click the X button — there's no GUI. You need to find the frozen process and kill it from the terminal. Process management is how you become the Task Manager of a Linux server.
Think of processes like apps running on your phone. You can see which ones are running (ps), watch them in real-time (top), close them nicely (kill), or force-close a frozen one (kill -9).
Every running program is a process. Each process gets a unique number called a PID (Process ID) — like an employee ID badge. You use the PID to target a specific process.
Viewing Processes
ps # show the processes attached to your current terminal
ps aux # show ALL processes, ALL users (the one you'll use most)
top # live dashboard, refreshes every few seconds (press q to quit)
Killing Processes
Unix uses "signals" to communicate with processes. Think of it as sending a message:
kill 1234 # SIGTERM: "please shut down gracefully" (process CAN refuse)
kill -9 1234 # SIGKILL: "you're done NOW" (process CANNOT refuse)
kill PID = politely asking someone to leave. They can save their work and exit gracefully. They can also say "no."
kill -9 PID = calling security to physically remove them. No negotiation. No saving work. Instant termination. Use this only when regular kill doesn't work.
Background and Foreground
./script.sh & # run in background, terminal stays free for other commands
Ctrl+C # terminate the currently running foreground process
Ctrl+Z # suspend (pause) the foreground process: it's still alive but frozen
bg # resume a suspended process in the background
fg # bring a background process to the foreground
kill -9 = SIGKILL = force kill. This is the answer to "how do you force kill a process?" Also remember: & at the end = run in background.
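The background-then-kill cycle can be rehearsed with a harmless sleep. This sketch relies on two standard shell features: $! holds the PID of the last background job, and kill -0 sends no signal at all, it only checks whether the PID exists.

```shell
sleep 30 &            # start a long-running process in the background
pid=$!                # $! = PID of the most recent background job

if kill -0 "$pid" 2>/dev/null; then before="running"; else before="gone"; fi

kill "$pid"           # plain kill sends SIGTERM (signal 15)

# wait returns 128 + signal number when the process was ended by a signal.
exit_status=0
wait "$pid" 2>/dev/null || exit_status=$?

if kill -0 "$pid" 2>/dev/null; then after="running"; else after="gone"; fi

echo "before=$before after=$after exit_status=$exit_status"
# before=running after=gone exit_status=143
```

An exit status of 143 (128 + 15) is the usual sign that a process was stopped by SIGTERM; after kill -9 it would be 137 (128 + 9).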
A process with PID 5678 is frozen and not responding to kill 5678. What should you do?
kill -9 sends SIGKILL, which the process cannot catch, ignore, or refuse. It's the "nuclear option": the OS forcefully terminates the process. Regular kill sends SIGTERM (signal 15), which a frozen process might ignore.
tar and gzip
Why? You know how you right-click a folder in Windows and "Send to > Compressed (zipped) folder"? Unix has two separate steps for this: tar bundles files together (like putting papers in a folder), and gzip compresses the bundle (like vacuum-sealing the folder to make it smaller). Usually you do both at once.
tar = Tape ARchive. It was originally for saving files to tape drives. Today it bundles files into one archive.
tar -cvf archive.tar dir/ # Bundle dir/ into archive.tar (no compression)
tar -xvf archive.tar # Unpack archive.tar
tar -cvzf archive.tar.gz dir/ # Bundle AND compress (z = gzip)
tar -xvzf archive.tar.gz # Decompress AND unpack
gzip file.txt # compress single file → file.txt.gz (original deleted!)
gunzip file.txt.gz # decompress → file.txt
Creating: cvzf = "Create, Verbose, gZip, File" → tar -cvzf name.tar.gz folder/
Extracting: xvzf = "eXtract, Verbose, unZip, File" → tar -xvzf name.tar.gz
The only letter that changes is c (create) vs x (extract). Everything else stays the same.
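A full create-list-extract round trip looks like the sketch below. The directory and file names are made up; -t (list contents) and -C (change to a directory first) are standard tar options, and v is left out to keep the output quiet.

```shell
work=$(mktemp -d)
mkdir -p "$work/project"
echo "hello" > "$work/project/readme.txt"

# c = create, z = gzip, f = archive file name; -C runs tar from inside $work
tar -czf "$work/project.tar.gz" -C "$work" project

# t = list the archive's contents without extracting
listing=$(tar -tzf "$work/project.tar.gz")

# x = extract, here into a separate restore directory
mkdir "$work/restore"
tar -xzf "$work/project.tar.gz" -C "$work/restore"
restored=$(cat "$work/restore/project/readme.txt")

echo "$restored"   # hello

rm -rf "$work"
```

Listing with -t before extracting is a cheap way to check what an archive will unpack and where.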
sed — Stream Editor
Why does sed exist? You need to replace "http" with "https" in 500 config files. Or delete all blank lines from a log. Or change every occurrence of a username. You're NOT going to open each file and Ctrl+H manually. sed is the "Find and Replace" of the terminal — it processes text automatically, line by line, at machine speed.
Think of sed like a factory worker on an assembly line. Each line of the file rolls past on a conveyor belt. The worker applies one rule (replace this, delete that), and the modified line comes out the other end. The worker never sees the whole file at once — just one line at a time.
Substitution — The Core Operation
The most common sed command is s/old/new/ — substitute "old" with "new":
sed 's/old/new/' file.txt # replace FIRST "old" on each line
sed 's/old/new/g' file.txt # replace ALL "old" on each line (g = global)
sed -i 's/foo/bar/g' file.txt # actually MODIFY the file (-i = in-place)
Without /g: only the FIRST match on each line is replaced. If a line says "error error error", only the first "error" changes.
With /g: ALL matches on the line are replaced. All three "error"s change.
Also: without -i, sed just PRINTS the result — it does NOT modify the file. With -i, it edits the file directly.
Deleting and Printing Lines
sed '3d' file.txt # delete line 3
sed '2,5d' file.txt # delete lines 2 through 5
sed '/pattern/d' file.txt # delete lines containing "pattern"
sed -n '3p' file.txt # print ONLY line 3 (-n suppresses all other output)
sed -n '2,4p' file.txt # print only lines 2 to 4
sed 's/[0-9]//g' file.txt # remove all digits (replace each digit with nothing)
s/old/new/ = substitute. d = delete. -n with p = print only specific lines. Add /g for global (all matches). Add -i to edit the actual file.
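These rules are quick to confirm by piping a few sample lines through sed. The input strings below are invented for illustration, and nothing is edited in place, so the sketch avoids -i entirely.

```shell
sample='cat cat cat'

first_only=$(printf '%s\n' "$sample" | sed 's/cat/dog/')    # only the first match per line
every=$(printf '%s\n' "$sample" | sed 's/cat/dog/g')        # g = every match on the line

lines=$(printf '%s\n' 'INFO a' 'DEBUG b' 'INFO c')
no_debug=$(printf '%s\n' "$lines" | sed '/DEBUG/d')         # drop whole lines that match
second_line=$(printf '%s\n' "$lines" | sed -n '2p')         # -n + p = print only line 2

echo "$first_only"    # dog cat cat
echo "$every"         # dog dog dog
echo "$no_debug"      # INFO a / INFO c
echo "$second_line"   # DEBUG b
```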
What does sed 's/error/warning/' log.txt do if a line contains "error: error occurred"?
Without /g, sed only replaces the FIRST match on each line. The result would be "warning: error occurred": the second "error" is untouched. To replace both, you'd need s/error/warning/g.
Which command deletes all lines containing "DEBUG" from a file and saves the change?
You need -i (in-place) to actually modify the file. Option C removes the word "DEBUG" but keeps the rest of the line. Option D also works for filtering but doesn't modify the original file.
awk — THE Most Important FA Round 2 Topic (15 marks)
Why does awk exist? Imagine you have a CSV file with 10,000 employee records. You need to find everyone in the "Sales" department earning over 50,000 and calculate their average salary. In Java, that's 20+ lines of code (open file, read line by line, split by comma, check conditions...). In awk, it's ONE line. awk was built specifically for this kind of "look at data in columns, filter and calculate" work.
Think of awk as a spreadsheet that runs in the terminal. It reads your file row by row, automatically splits each row into columns ($1, $2, $3...), and lets you filter rows, do math, and print results — all in a single command. It's grep (search) + cut (columns) + a calculator, all rolled into one.
Basic Syntax
# awk 'pattern {action}' file
# If pattern matches → action runs. If no pattern → runs on every line.
awk '{print}' file.txt # print every line (like cat)
awk '{print $0}' file.txt # same: $0 = entire line
awk '{print $1}' file.txt # print 1st column only
awk '{print $1, $3}' file.txt # print 1st and 3rd columns
awk '{print $NF}' file.txt # print LAST column (NF = number of fields)
Field Variables — Memorize These
| Variable | Meaning | Example |
|---|---|---|
| $0 | Entire line | awk '{print $0}' prints the full line |
| $1, $2, $3... | 1st, 2nd, 3rd field (column) | awk '{print $2}' prints the 2nd column |
| $NF | Last field | awk '{print $NF}' prints the last column |
| NF | Number of fields in current line | awk '{print NF}' prints how many columns |
| NR | Current line number (record number) | awk '{print NR, $0}' prints numbered lines |
| FS | Field separator (default: space/tab) | awk -F: '{print $1}' uses : as separator |
| OFS | Output field separator | awk -v OFS="," '{print $1,$2}' |
| RS | Record separator (default: newline) | Each line is one record |
NF = the NUMBER of fields (e.g., 5). $NF = the VALUE of the last field. $(NF-1) = second-to-last field. This distinction gets asked.
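A one-line input is enough to see the difference between the three. The sample text is arbitrary.

```shell
line='alpha beta gamma delta'

field_count=$(echo "$line" | awk '{print NF}')        # how many fields
last_field=$(echo "$line" | awk '{print $NF}')        # value of the last field
second_last=$(echo "$line" | awk '{print $(NF-1)}')   # value of the field before it

echo "$field_count"   # 4
echo "$last_field"    # delta
echo "$second_last"   # gamma
```

The parentheses in $(NF-1) matter: $NF-1 would take the last field and subtract 1 from it.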
Field Separator (-F flag)
# Default separator is space/tab. Use -F to change it.
awk -F':' '{print $1}' /etc/passwd # colon-separated (like /etc/passwd)
awk -F',' '{print $1,$2}' data.csv # comma-separated (CSV file)
awk -F'\t' '{print $2}' file.tsv # tab-separated
awk -F'|' '{print $3}' file.txt # pipe-separated
Pattern Matching — Filter Lines
# Only process lines that match a condition
awk '/pattern/ {print}' file.txt # lines containing "pattern"
awk '$3 > 50000 {print $1, $3}' emp.txt # salary (col3) > 50000
awk '$2 == "IT" {print $1}' emp.txt # department (col2) is IT
awk 'NR >= 2 {print}' file.txt # skip header (line 1), print rest
awk 'NR == 3 {print}' file.txt # print only line 3
awk '$1 != "Name" {print}' file.txt # skip lines where col1 is "Name"
awk '$3 > 30000 && $2 == "HR"' emp.txt # multiple conditions with &&
awk '$3 > 30000 || $2 == "HR"' emp.txt # OR condition
BEGIN and END Blocks
# BEGIN runs BEFORE processing any lines
# END runs AFTER processing all lines
awk 'BEGIN {print "=== Report ==="} {print $0} END {print "=== Done ==="}' file.txt
# Calculate total salary
awk 'BEGIN {total=0} {total += $3} END {print "Total:", total}' emp.txt
# Count lines
awk 'END {print NR, "lines"}' file.txt
# Calculate average salary
awk 'BEGIN {sum=0; count=0} {sum += $3; count++} END {print "Avg:", sum/count}' emp.txt
BEGIN to initialize variables, main block to accumulate, END to print result. This is the most common awk question pattern.
FA Exam Pattern: Employee Table Questions
Assume this file emp.txt (space-separated):
Name Dept Salary
Darshan IT 45000
Rahul HR 38000
Priya IT 52000
Amit Finance 41000
Sneha HR 47000
Kiran IT 55000
Q1: Print all employee names and salaries
awk 'NR > 1 {print $1, $3}' emp.txt
# NR > 1 skips the header row
# Output: Darshan 45000, Rahul 38000, ...
Q2: Print employees in IT department
awk '$2 == "IT" {print $1, $3}' emp.txt
# Output: Darshan 45000, Priya 52000, Kiran 55000
Q3: Find the maximum salary
awk 'NR > 1 {if ($3 > max) {max = $3; name = $1}} END {print name, max}' emp.txt
# Output: Kiran 55000
Q4: Find the minimum salary
awk 'NR == 2 {min = $3; name = $1} NR > 2 {if ($3 < min) {min = $3; name = $1}} END {print name, min}' emp.txt
# Initialize min with first data row (NR==2), then compare rest
# Output: Rahul 38000
Q5: Calculate total salary
awk 'NR > 1 {total += $3} END {print "Total:", total}' emp.txt
# Output: Total: 278000
Q6: Calculate average salary
awk 'NR > 1 {sum += $3; count++} END {print "Average:", sum/count}' emp.txt
# Output: Average: 46333.3
Q7: Count employees per department
awk 'NR > 1 {dept[$2]++} END {for (d in dept) print d, dept[d]}' emp.txt
# Output: IT 3, HR 2, Finance 1 (the order of a for-in loop is not guaranteed)
# dept[$2]++ creates an associative array (like a HashMap) counting by department
Q8: Print employees with salary above average
# Single pass over the file: store every row in arrays, compute the average in END, then filter
awk 'NR > 1 {sum += $3; count++; names[NR]=$1; sals[NR]=$3} END {avg=sum/count; for (i in names) if (sals[i] > avg) print names[i], sals[i]}' emp.txt
# The for-in loop may print the matching rows in any order
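An alternative that keeps the rows in file order is to read the file twice. The sketch below is one common way to do that, not the only one: passing the file name twice makes awk process it twice, and NR == FNR is true only during the first read. The sample file simply recreates emp.txt in a temporary location.

```shell
# Recreate the emp.txt sample in a temporary file.
emp=$(mktemp)
cat > "$emp" <<'EOF'
Name Dept Salary
Darshan IT 45000
Rahul HR 38000
Priya IT 52000
Amit Finance 41000
Sneha HR 47000
Kiran IT 55000
EOF

above_avg=$(awk '
    NR == FNR {                      # first read: accumulate sum and count
        if (FNR > 1) { sum += $3; count++ }
        next
    }
    FNR == 1 { avg = sum / count; next }   # second read, header row: compute average
    $3 > avg { print $1, $3 }              # second read, data rows: filter
' "$emp" "$emp")

echo "$above_avg"
# Priya 52000
# Sneha 47000
# Kiran 55000

rm -f "$emp"
```

The average here is 278000 / 6, roughly 46333.3, so only the three salaries above that value are printed.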
Q9: Print employees sorted by salary (with sort pipe)
awk 'NR > 1 {print $3, $1}' emp.txt | sort -n
# Output: 38000 Rahul, 41000 Amit, 45000 Darshan, ...
Q10: Department-wise total salary
awk 'NR > 1 {dept[$2] += $3} END {for (d in dept) print d, dept[d]}' emp.txt
# Output: IT 152000, HR 85000, Finance 41000 (the order of a for-in loop is not guaranteed)
- Skip header: NR > 1
- Filter by column: $2 == "value"
- Sum a column: {total += $3} END {print total}
- Average: {sum += $3; count++} END {print sum/count}
- Max: {if ($3 > max) max = $3} END {print max}
- Count per group: {arr[$2]++} END {for (k in arr) print k, arr[k]}
- Sum per group: {arr[$2] += $3} END {for (k in arr) print k, arr[k]}
awk with printf — Formatted Output
# printf gives formatted output (like C's printf)
awk 'NR > 1 {printf "%-10s %-8s %d\n", $1, $2, $3}' emp.txt
# %-10s = left-aligned string, 10 chars wide
# %d = integer, %f = float, %s = string
# \n = newline (printf doesn't add one automatically)
# Print salary with 2 decimal places
awk 'NR > 1 {printf "%s: %.2f\n", $1, $3}' emp.txt
awk — if/else and Loops
# if/else inside awk
awk 'NR > 1 {if ($3 > 45000) print $1, "HIGH"; else print $1, "LOW"}' emp.txt
# Ternary operator
awk 'NR > 1 {print $1, ($3 > 45000 ? "HIGH" : "LOW")}' emp.txt
# for loop: print each field on a line
awk '{for (i = 1; i <= NF; i++) print $i}' file.txt
awk with Pipes — Combining Commands
# awk works beautifully with pipes
# Find top 3 salaries
awk 'NR > 1 {print $3, $1}' emp.txt | sort -rn | head -3
# Count unique departments
awk 'NR > 1 {print $2}' emp.txt | sort -u | wc -l
# grep + awk combo: find IT employees and their salaries
grep "IT" emp.txt | awk '{print $1, $3}'
# Process command output
ls -l | awk '{print $9, $5}' # filename and size from ls -l
df -h | awk 'NR > 1 {print $1, $5}' # disk usage %
"Write an awk command to find the employee with the highest salary"
"Write an awk command to calculate total salary of IT department"
"Write an awk command to print names of employees earning more than 40000"
All of these follow the patterns above. Practice with the emp.txt examples until they're automatic.
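As a self-check, here is one possible answer to each of the three sample questions, run against a temporary copy of emp.txt. These are sketches built from the patterns in this section, not official model answers.

```shell
# Recreate the emp.txt sample in a temporary file.
emp=$(mktemp)
cat > "$emp" <<'EOF'
Name Dept Salary
Darshan IT 45000
Rahul HR 38000
Priya IT 52000
Amit Finance 41000
Sneha HR 47000
Kiran IT 55000
EOF

# Highest salary: keep the largest value seen so far, print it at the end.
top_earner=$(awk 'NR > 1 && $3 > max {max = $3; name = $1} END {print name, max}' "$emp")

# Total salary of the IT department: filter on column 2, sum column 3.
it_total=$(awk 'NR > 1 && $2 == "IT" {total += $3} END {print "Total:", total}' "$emp")

# Names of employees earning more than 40000.
above_40k=$(awk 'NR > 1 && $3 > 40000 {print $1}' "$emp")

echo "$top_earner"   # Kiran 55000
echo "$it_total"     # Total: 152000
echo "$above_40k"    # Darshan, Priya, Amit, Sneha, Kiran (one per line)

rm -f "$emp"
```

Each answer is the same recipe: a condition before the braces selects rows, the action accumulates or prints, and END reports a result when one is needed.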
Quick Reference — awk Cheat Sheet
| Task | Command |
|---|---|
| Print column 1 | awk '{print $1}' file |
| Print last column | awk '{print $NF}' file |
| Custom separator | awk -F',' '{print $1}' file |
| Filter rows | awk '$3 > 100' file |
| String match | awk '$2 == "IT"' file |
| Pattern match | awk '/regex/' file |
| Skip header | awk 'NR > 1' file |
| Line numbers | awk '{print NR, $0}' file |
| Sum column | awk '{s += $3} END {print s}' file |
| Average | awk '{s+=$3;c++} END {print s/c}' file |
| Max value | awk '{if($3>m)m=$3} END {print m}' file |
| Count per group | awk '{a[$2]++} END {for(k in a) print k,a[k]}' file |
| Sum per group | awk '{a[$2]+=$3} END {for(k in a) print k,a[k]}' file |
| Formatted output | awk '{printf "%-10s %d\n",$1,$3}' file |
Shell Scripting Basics
Why write shell scripts? You've been typing commands one at a time. But what if you need to run the same 10 commands every morning? Or automate a backup every night? A shell script is just a text file full of commands that run automatically, one after another. It's like recording a macro in Excel — write the steps once, run them whenever you want.
Think of a shell script as a recipe card. The recipe lists steps in order. When you "run" the recipe (execute the script), the kitchen (the shell) follows each step from top to bottom.
The Shebang — Every Script Starts Here
The very first line of a bash script should be:
#!/bin/bash
This is called the shebang (or hashbang). It tells the OS: "Use the bash shell to run this script." Without it, the system might not know how to interpret your file. It's like writing "Language: English" at the top of a document — so the reader knows how to read it.
Variables — Storing Values
Variables in bash are simple but have one critical gotcha:
# CORRECT: no spaces around =
NAME="Darshan"
echo "Hello $NAME" # prints: Hello Darshan
# WRONG: spaces around = cause an error!
# NAME = "Darshan" ← this FAILS. Bash thinks NAME is a command.
NAME="Darshan" works. NAME = "Darshan" FAILS. No spaces around the = sign. This is unlike most other programming languages and trips up almost everyone.
# Read user input
read -p "Enter name: " USERNAME
echo "You entered: $USERNAME"
if-then-else — Making Decisions
Bash's if-else looks different from Java/Python. The structure is: if [ condition ]; then ... elif ... else ... fi. Note: fi is "if" spelled backwards — it closes the if block.
if [ "$AGE" -gt 18 ]; then
    echo "Adult"
elif [ "$AGE" -eq 18 ]; then
    echo "Just 18"
else
    echo "Minor"
fi
Why -gt instead of >? Inside [ ], > means "redirect to file" (remember the redirection section?). So bash uses word-based operators for number comparisons: -gt (greater than), -lt (less than), etc. Think of them as abbreviations: greater than, less than, equal, not equal, greater-or-equal, less-or-equal.
Loops — Repeating Actions
# for loop: iterate over a list
for i in 1 2 3 4 5; do
    echo "$i"
done
# while loop: repeat while condition is true
COUNT=1
while [ $COUNT -le 5 ]; do
    echo "$COUNT"
    COUNT=$((COUNT + 1)) # $(( )) does arithmetic in bash
done
$(( )) is how bash does math. $((5 + 3)) = 8. Without it, bash treats numbers as text. Also: loops end with done, not a closing brace.
case Statement — The Switch of Bash
Like Java's switch. Pattern matches a value and runs the matching block. ;; is like break. *) is the default case.
case $1 in
    start)
        echo "Starting..."
        ;;
    stop)
        echo "Stopping..."
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        ;;
esac
esac is "case" spelled backwards — it closes the case block. (fi closes if, esac closes case.)
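To try the case statement without creating a script file, it can be wrapped in a function. `handle_action` is a hypothetical name chosen for this sketch; inside a function, $1 is the function's first argument rather than the script's.

```shell
# handle_action: hypothetical wrapper around the case statement.
handle_action() {
    case $1 in
        start) echo "Starting..." ;;
        stop)  echo "Stopping..." ;;
        *)     echo "Usage: handle_action {start|stop}" ;;   # default branch
    esac
}

handle_action start     # Starting...
handle_action stop      # Stopping...
handle_action restart   # Usage: handle_action {start|stop}
```

Because *) matches anything, it has to come last; a pattern placed after it would never run.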
Comparison Operators
Bash uses different operators for numbers vs. strings. This is a common exam trap.
| Numbers | Meaning | Strings | Meaning |
|---|---|---|---|
| -eq | equal | = | equal |
| -ne | not equal | != | not equal |
| -gt | greater than | -z | string is empty |
| -lt | less than | -n | string is non-empty |
| -ge | greater or equal | | |
| -le | less or equal | | |
For numbers: [ 5 -gt 3 ] means "is 5 greater than 3?" (yes)
For strings: [ "$NAME" = "Darshan" ] means "is NAME exactly Darshan?" (yes)
Don't mix them up: -eq on a non-numeric string is an error, and = compares text, so [ "05" = "5" ] is false even though [ 05 -eq 5 ] is true. The exam will try to trick you with this.
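The difference shows up as soon as two values are numerically equal but written differently. A minimal sketch with illustrative variable names:

```shell
a="05"
b="5"

# Numeric comparison: both sides are read as numbers, so 05 equals 5.
if [ "$a" -eq "$b" ]; then numeric_result="equal"; else numeric_result="different"; fi

# String comparison: the text "05" is not the text "5".
if [ "$a" = "$b" ]; then string_result="equal"; else string_result="different"; fi

# -z is true for an empty string, -n for a non-empty one.
empty=""
if [ -z "$empty" ]; then empty_check="empty"; else empty_check="non-empty"; fi

echo "$numeric_result"   # equal
echo "$string_result"    # different
echo "$empty_check"      # empty
```

Quoting the variables ("$a" rather than $a) keeps the test well-formed even when a value is empty or contains spaces.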
File Test Operators
These check properties of files. Incredibly useful in scripts — "does this file exist before I try to read it?"
| Operator | True if... | Use case |
|---|---|---|
| -f file | Regular file exists | Check before reading |
| -d file | Directory exists | Check before cd-ing into it |
| -e file | File exists (any type) | General existence check |
| -r file | Readable | Check before cat/grep |
| -w file | Writable | Check before writing to it |
| -x file | Executable | Check before running a script |
| -s file | Size > 0 (not empty) | Check if file has content |
if [ -f "data.csv" ]; then
    echo "Processing data..."
    awk -F',' 'NR > 1 {sum += $3} END {print sum}' data.csv
else
    echo "Error: data.csv not found!"
fi
Always check if a file exists before operating on it. This prevents ugly error messages.
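The operators are easy to explore in a scratch directory. In this sketch the file names are made up and `check` is a hypothetical helper that just turns a test result into the word yes or no.

```shell
# check: hypothetical helper, prints "yes" if the test succeeds, else "no".
check() {
    if [ "$1" "$2" ]; then echo "yes"; else echo "no"; fi
}

scratch=$(mktemp -d)
touch "$scratch/empty.txt"                 # exists, but size is 0
echo "data" > "$scratch/full.txt"          # exists and has content

is_file=$(check -f "$scratch/full.txt")        # regular file?
is_dir=$(check -d "$scratch")                  # directory?
empty_has_size=$(check -s "$scratch/empty.txt")  # size > 0?
full_has_size=$(check -s "$scratch/full.txt")
missing_exists=$(check -e "$scratch/nope.txt")   # exists at all?

echo "$is_file $is_dir $empty_has_size $full_has_size $missing_exists"
# yes yes no yes no

rm -rf "$scratch"
```

Note that -f on a directory is false and -d on a regular file is false; -e is the one to use when the type does not matter.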
What is wrong with this script? NAME = "Darshan"; echo $NAME
Spaces around = in variable assignment cause an error. Bash interprets NAME as a command name, not a variable. The correct syntax is NAME="Darshan" with NO spaces. This is the single most common shell scripting mistake.
Which operator checks if a number is greater than or equal to another in bash?
-ge = greater-or-equal. Bash uses -eq, -ne, -gt, -lt, -ge, -le for numeric comparisons. The >= operator doesn't work in [ ] test brackets.
Special Path Symbols
These are shortcuts for navigating the file system. Think of them as bookmarks:
| Symbol | Meaning | Real-world analogy |
|---|---|---|
| ~ | Home directory (/home/darshan) | Your house: cd ~ always takes you home |
| . | Current directory | "Right here": where you're standing now |
| .. | Parent directory (one level up) | "Go upstairs": the folder containing this one |
| / | Root directory | The ground floor: the very top of the file tree |
| - | Previous directory (where you just were), used as cd - | "Go back": like the back button in a browser |
~ = home directory is the most commonly asked. Also remember: cd .. goes up one level, cd ../.. goes up two levels.
crontab — Scheduling
Why? You wrote a backup script. Great. Now who runs it at 2 AM every night? You're not setting an alarm to wake up and type the command. crontab is the alarm clock of Unix — it runs commands automatically on a schedule, whether you're asleep, on vacation, or doing anything else.
The format has 5 time fields followed by the command to run:
# ┌───── minute (0-59)
# │ ┌───── hour (0-23)
# │ │ ┌───── day of month (1-31)
# │ │ │ ┌───── month (1-12)
# │ │ │ │ ┌───── day of week (0-6, 0=Sunday)
# │ │ │ │ │
# * * * * * command
0 2 * * * /home/darshan/backup.sh # 2:00 AM, every day
*/5 * * * * /scripts/check.sh # every 5 minutes
0 9 * * 1 /scripts/weekly.sh # 9:00 AM every Monday (1=Monday)
30 18 1 * * /scripts/monthly.sh # 6:30 PM on the 1st of every month
* = "every" (every minute, every hour, etc.). */5 = "every 5th." Read left to right: minute, hour, day, month, weekday. So 0 9 * * 1 = "at minute 0, hour 9, any day, any month, on Monday."
What does the cron expression 0 0 * * 0 mean?
Practice Questions — Unix
- ls -a shows ALL files including hidden ones (starting with dot).
- [...] chmod 755 assign?
- wc -w counts words. -l = lines, -c = bytes.
- [...] ~ represent?
- kill -9 sends SIGKILL: force kill, cannot be caught or ignored.
- [...] rw-r--r--?
- [...] ls | wc -l do?
- -i = ignore case.
- [...] > and >>?
- > overwrites the file. >> appends to the end.
- [...] x=5; if [ $x -gt 3 ]; then echo "yes"; else echo "no"; fi
- grep -r = recursive grep through all files in directory.
- /tmp stores temporary files cleared on reboot.
- [...] for i in 1 2 3; do echo $i; done
- [...] project/?
- tar -cvzf: c=Create, v=Verbose, z=gzip, f=File.
- [...] -rwxr-x--- in octal?

- Case-insensitive matching: use tolower() in AWK or IGNORECASE=1 (gawk only). Almost every question needs this.
- Exact output format: "Total Salary = 135000" must match character for character. Copy the format from the problem.
- "Not found" message: if no rows match, you must print the fallback message (e.g., "No Employee Found").
- Command-line argument: the filename comes as $1 when the script runs. Use it, don't hardcode a filename.
- Skip header row: CSV files usually have a header. Use NR > 1 in AWK to skip it.
Real FA Subjective Question — Walkthrough
This is an actual FA Round 2 practice question from HackerRank.
Employee data is in a CSV file: EmpName,Role,Division,Salary
Calculate the total salary of employees in the "Sales" division (case-insensitive). Print Total Salary = [sum]. If no match, print No Employee Found.
File name is passed as command-line argument $1.
Solution:
awk -F"," '
BEGIN { total = 0; found = 0 }
NR > 1 && tolower($3) == "sales" { # skip header, case-insensitive match
    total += $4 # $4 = Salary column
    found = 1
}
END {
    if (found)
        print "Total Salary = " total # exact format from problem
    else
        print "No Employee Found" # exact fallback message
}
' "$1" # $1 = filename from command-line
- -F",": sets comma as field separator for CSV
- NR > 1: skips the header row (line 1)
- tolower($3) == "sales": case-insensitive comparison on the Division column
- found flag: tracks whether any match was found, for the "No Employee Found" case
- "$1" at the end: reads the file whose name was passed as argument
- Exact output strings: copied from the problem statement, not typed from memory
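Before submitting, it helps to run the solution against a small file you control, including a case where nothing matches. In the sketch below the CSV rows are invented test data and `division_total` is a hypothetical wrapper name; inside the function, "$1" plays the role of the script's file-name argument.

```shell
# division_total: hypothetical wrapper around the solution's awk program.
division_total() {
    awk -F"," '
    BEGIN { total = 0; found = 0 }
    NR > 1 && tolower($3) == "sales" { total += $4; found = 1 }
    END {
        if (found) print "Total Salary = " total
        else       print "No Employee Found"
    }
    ' "$1"
}

# Invented test data: the Division column mixes Sales, SALES and sales.
with_sales=$(mktemp)
cat > "$with_sales" <<'EOF'
EmpName,Role,Division,Salary
Asha,Manager,Sales,60000
Ravi,Executive,SALES,45000
Meena,Engineer,IT,70000
Vikram,Executive,sales,30000
EOF

# A second file with no Sales rows, to exercise the fallback message.
no_sales=$(mktemp)
cat > "$no_sales" <<'EOF'
EmpName,Role,Division,Salary
Meena,Engineer,IT,70000
EOF

match_output=$(division_total "$with_sales")
fallback_output=$(division_total "$no_sales")

echo "$match_output"      # Total Salary = 135000
echo "$fallback_output"   # No Employee Found

rm -f "$with_sales" "$no_sales"
```

Testing both branches catches the two most common slips: forgetting tolower() and forgetting the fallback message.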