Unix / Linux

Unix and Linux run most of the servers and cloud instances you will work with. TCS ILP tests this in FA: expect 5 MCQs in Round 1, and it is the minor subject in Round 2 (15-20 marks).

📋 Exam snapshot
FA Round 1: 5 MCQs on Unix
FA Round 2: Minor subject — 15-20 marks
Hot topics: chmod numeric, grep flags, pipes, redirection, permission math, shell scripts

What is Unix/Linux?

Why should you care? Most servers you'll work with at TCS run Linux. Most cloud instances (AWS, Azure, GCP) run Linux. Every Android phone runs a Linux kernel. When you deploy your Java web app at work, it's most likely going to a Linux server. You won't be clicking icons; you'll be typing commands in a terminal. That's Unix.

Think of it like this: Windows is a car with automatic transmission — you click buttons, drag things around, the OS handles the details. Unix/Linux is a manual transmission — you tell it exactly what to do, command by command. More effort to learn, but total control once you know it.

Unix was built at Bell Labs in 1969. Linux is a free, open-source recreation of Unix by Linus Torvalds (1991). For exam purposes, treat them as the same thing.

Three rules of Unix
  1. Everything is a file. Your text documents, your keyboard, your network connection — Unix treats them ALL as files. This is why commands like cat work on almost anything.
  2. Case-sensitive. Always. File.txt and file.txt are two completely different files. Forget this and you'll waste hours debugging.
  3. Small tools, chained together. Instead of one giant program that does everything, Unix gives you tiny tools (grep, sort, cut, awk) that you connect with pipes. Like LEGO blocks.
Question

In Unix, Report.txt and report.txt are:

  1. The same file
  2. Two different files
  3. An error — you can't have both
  4. Depends on the directory
Unix is case-sensitive. Report.txt and report.txt are two completely separate files. This is different from Windows where they'd be the same file.

File System Hierarchy

Why does this matter? In Windows, you dump everything into C:\Users or wherever you want. Unix is organized like a well-planned office building — every department has its floor, and if you need something, you know exactly which floor to go to.

The entire Unix file system is one giant tree starting from / (root). There are no drive letters like C: or D:. Everything lives under /.

Think of it as an apartment building:

Directory | Purpose | Apartment Building Analogy
/ | Root: top of the entire tree | The building itself: everything is inside it
/home | User home directories | Residential floors: each tenant has their own flat
/etc | System configuration files | Building manager's office: all the rules and settings
/bin | Essential user binaries (ls, cp, mv) | The toolbox: basic tools everyone needs
/tmp | Temporary files, cleared on reboot | The lobby whiteboard: wiped clean every morning
/var | Variable data: logs, mail | The mailroom: always changing, logs of who came/went
/root | Home directory of root (admin) user | The building owner's penthouse: separate from tenants
💡 Exam tip
/etc = config, /tmp = temporary (wiped on reboot), /var = logs. These three come up most. Remember: Etc = Every Thing Configured. Tmp = TeMporary. Var = Varies (changes constantly).
Question

A system administrator wants to check the configuration of a service. Which directory should they look in?

  1. /home
  2. /tmp
  3. /etc
  4. /bin
/etc stores all system configuration files. Think "Every Thing Configured." If you need to change how a service behaves, its config file is in /etc.

Essential Commands — Quick Reference

Why learn these? In Unix, there's no "right-click > New Folder" or "drag file to trash." Everything happens through commands. Think of each command as a verb — ls = "look", cd = "go to", cp = "copy", rm = "delete". Once you learn 20 verbs, you can do anything.

Don't memorize this table. Read through it once, then use the practice questions to test what stuck. Come back to this table as a reference.

Command | What it does | Key flags
ls | List directory | -l long, -a hidden, -la both, -R recursive
pwd | Print current directory |
cd | Change directory | cd ~ home, cd .. parent, cd / root
mkdir | Make directory | -p creates parents too
rmdir | Remove empty directory | Only works if empty
rm | Remove files/dirs | -r recursive, -f force
cp | Copy | -r for directories
mv | Move or rename | mv old.txt new.txt renames
touch | Create empty file |
cat | Display file contents |
head | First N lines | -n 5 (default 10)
tail | Last N lines | -n 5 (default 10)
echo | Print text | echo $HOME
wc | Count words/lines/chars | -w words, -l lines, -c bytes
sort | Sort lines | -r reverse, -n numeric, -k 2 by field
cut | Extract columns | -d ':' delimiter, -f 1 field
grep | Search patterns | -i ignore case, -r recursive, -n line nums, -c count, -v invert
find | Search for files | -name, -type f/d, -size +1M
sed | Stream editor | sed 's/old/new/g' file
awk | Column processing | awk '{print $1}' file
ps | Show processes | ps aux all processes
kill | Signal process | kill -9 PID force kill
chmod | Change permissions | chmod 755 file
chown | Change owner | chown user:group file
tar | Archive files | -cvzf create, -xvzf extract
gzip | Compress | gunzip to decompress
du | Disk usage | -h human-readable
df | Disk free space | -h human-readable

File Permissions

Why do permissions exist? Imagine you work in an office building. Not everyone should have access to every room. The intern shouldn't be able to edit the CEO's files. The accountant shouldn't be able to delete the codebase. Unix permissions are like security badges — they control who can open which doors and what they can do once inside.

Every single file and folder in Unix has a permission badge attached to it. This badge answers three questions for three groups of people:

The three groups (WHO can access)
  • Owner (u) — the person who created the file. Like the author of a document.
  • Group (g) — a team of users. Like "the marketing department" — everyone in the group gets the same access.
  • Others (o) — everyone else on the system. Strangers.

For each group, Unix tracks three things (WHAT they can do):

Permission | Symbol | On a File | On a Directory
Read | r | Can see the contents | Can list files inside (ls)
Write | w | Can modify/edit | Can create/delete files inside
Execute | x | Can run it as a program | Can enter the directory (cd)
💡 Key insight
For directories, "execute" means "can enter." This confuses everyone at first. Think of it like: read = you can peek through the window and see what's inside. Execute = you can actually walk through the door. Without x on a directory, you can see file names but can't actually open any of them.

Reading Permission Strings

When you run ls -l, you see something like this:

# ls -l output:
-rwxr-xr--  1  darshan  staff  1234  Mar 18  script.sh
│└┬┘└┬┘└┬┘
│ │  │  └── others (r--)
│ │  └───── group (r-x)
│ └──────── owner (rwx)
└────────── file type: - = file, d = directory

Read it like a sentence: "This is a file (-). The owner (darshan) can read, write, and execute it (rwx). The group (staff) can read and execute but NOT write (r-x). Everyone else can only read (r--)."

Numeric (Octal) Permissions — The Math

Why numbers? Typing chmod 755 is faster than typing chmod u=rwx,g=rx,o=rx. Each permission has a number value, and you just add them up per group.

Permission | Symbol | Value | Think of it as...
Read | r | 4 | 4 is the biggest: reading is the most basic access
Write | w | 2 | 2 is in the middle
Execute | x | 1 | 1 is the smallest
No permission | - | 0 | Zero = nothing

How to calculate: Add the values for each group separately. Three groups = three digits.

Walkthrough: What is 754?

7 = 4+2+1 = r+w+x = rwx (owner can do everything)
5 = 4+0+1 = r+-+x = r-x (group can read and execute, not write)
4 = 4+0+0 = r+-+- = r-- (others can only read)

Result: rwxr-xr--
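
A quick way to check the walkthrough is to apply 754 to a throwaway file and read the bits back. This is a minimal sketch: the scratch file comes from mktemp, the variable names are illustrative, and it assumes a filesystem that honours Unix permission bits.

```shell
# Sketch: apply 754 to a scratch file and read the result back from ls -l.
demo_file=$(mktemp)            # throwaway file; its name is generated for us
chmod 754 "$demo_file"

# The first 10 characters of ls -l are the file type plus the 9 permission bits.
perms=$(ls -l "$demo_file" | cut -c1-10)
echo "$perms"                  # -rwxr-xr--

rm -f "$demo_file"
```

The printed string matches the result worked out by hand: type -, then rwx, r-x, r--.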

chmod 755 file.sh   # rwxr-xr-x → owner:7(rwx), group:5(r-x), others:5(r-x)
chmod 644 file.txt  # rw-r--r-- → owner:6(rw-), group:4(r--), others:4(r--)
chmod 777 file      # rwxrwxrwx → everyone full access (DANGEROUS!)
chmod 400 secret    # r-------- → owner read-only, nobody else can touch it
Never use 777 in real life
chmod 777 = everyone can read, write, and execute. It's like leaving all doors unlocked with a "please come in and take whatever you want" sign. You'll see it in exams, but in real servers, this is a security disaster.

Symbolic Permissions — The English Way

Instead of math, you can use letters. The format is WHO, then + (add) or - (remove), then WHAT:

chmod u+x file     # u(owner) +(add) x(execute) → "give owner execute permission"
chmod g-w file     # g(group) -(remove) w(write) → "take write away from group"
chmod o+r file     # o(others) +(add) r(read) → "let others read"
chmod a+x file     # a(all) +(add) x(execute) → "everyone can execute"
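
The symbolic form is easiest to follow one change at a time. The sketch below starts a scratch file at 644 and applies three symbolic edits, capturing the permission string after each. The file and the show helper are illustrative names, not part of chmod, and the particular edits (g+w, o-r) were picked so that each one visibly changes something.

```shell
# Sketch: start from 644 and apply symbolic changes one at a time.
f=$(mktemp)
show() { ls -l "$f" | cut -c1-10; }   # helper: print type + permission bits

chmod 644 "$f"; step0=$(show)    # -rw-r--r--  starting point
chmod u+x "$f"; step1=$(show)    # -rwxr--r--  owner gains execute
chmod g+w "$f"; step2=$(show)    # -rwxrw-r--  group gains write
chmod o-r "$f"; step3=$(show)    # -rwxrw----  others lose read

printf '%s\n' "$step0" "$step1" "$step2" "$step3"
rm -f "$f"
```

Each symbolic edit touches only the bits it names; everything else stays as it was, which is the main practical difference from the numeric form.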
💡 Exam tip
Memorize these three cold: 755 = rwxr-xr-x (typical script), 644 = rw-r--r-- (typical file), 777 = rwxrwxrwx (everything open). The math is always: r=4, w=2, x=1. Add per group.
Question

A file has permissions -rw-r-----. What is the octal value?

  1. 644
  2. 640
  3. 740
  4. 620
Owner: rw- = 4+2+0 = 6. Group: r-- = 4+0+0 = 4. Others: --- = 0+0+0 = 0. Answer: 640. The group can read but not write, and others have zero access.
Question

You want to make a shell script executable by the owner, but keep it read-only for everyone else. Which command?

  1. chmod 777 script.sh
  2. chmod 744 script.sh
  3. chmod 644 script.sh
  4. chmod 700 script.sh
744 = Owner: rwx (7 = read+write+execute), Group: r-- (4 = read only), Others: r-- (4 = read only). The owner can execute it, everyone else can only read. Option 4 (700) would also make it executable for the owner, but then group and others can't even read it.

grep — Pattern Searching

Why does grep exist? Imagine you have a 10,000-line server log and something crashed at 3 AM. Are you going to read all 10,000 lines? No. You want to search for "error" or "failed" and see only those lines. That's grep — it's Ctrl+F for the terminal, but way more powerful.

The name grep stands for "Global Regular Expression Print" — but just think of it as "search."

grep "error" log.txt          # find lines containing "error"
grep -i "error" log.txt        # case-insensitive (catches Error, ERROR, error)
grep -r "TODO" ./src/          # search ALL files inside src/ folder
grep -n "main" app.py          # show which LINE NUMBER each match is on
grep -c "error" log.txt        # just tell me HOW MANY lines matched
grep -v "error" log.txt        # show lines that DON'T contain "error"
Real scenario

Your app's log file has 5000 lines. You want to know how many lines mention "NullPointerException", ignoring case:

grep -ic "nullpointerexception" server.log

Flags combine: -i (ignore case) + -c (count) = case-insensitive count.
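
To see the flags side by side, the sketch below builds a five-line sample log (invented for illustration) and runs the same flags against it. Note that -c counts matching lines, not individual occurrences.

```shell
# Sketch: try the grep flags on a tiny, made-up log.
log=$(mktemp)
cat > "$log" <<'EOF'
INFO  server started
ERROR disk full
info  user login
Error: NullPointerException at line 42
ERROR NullPointerException again
EOF

npe_lines=$(grep -ic "nullpointerexception" "$log")   # 2: matching lines, any case
error_lines=$(grep -ic "error" "$log")                # 3: ERROR, Error, error
non_errors=$(grep -vic "error" "$log")                # 2: lines WITHOUT "error"
first_hit=$(grep -n "ERROR" "$log" | head -n 1)       # 2:ERROR disk full

echo "$npe_lines $error_lines $non_errors"
echo "$first_hit"
rm -f "$log"
```

Plain "ERROR" (no -i) skips the "Error:" line, which is why the case-insensitive count is higher than a case-sensitive one would be.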

Flag | Meaning | Memory hook
-i | Case-insensitive | i = Ignore case
-r | Recursive (search folders) | r = Recurse into directories
-n | Line numbers | n = Numbers
-c | Count matching lines | c = Count
-v | Invert match | v = inVert (show NON-matching lines)
Question

You want to find all lines in access.log that do NOT contain "200 OK". Which command?

  1. grep "200 OK" access.log
  2. grep -c "200 OK" access.log
  3. grep -v "200 OK" access.log
  4. grep -n "200 OK" access.log
-v inverts the match — it shows every line that does NOT contain the pattern. This is useful for filtering out noise and finding errors (anything that isn't "200 OK" is probably a problem).

Pipes and Redirection

Why do pipes exist? Remember the Unix philosophy — small tools chained together? Pipes are the chains. Each command does one thing well, and the pipe (|) connects them like LEGO bricks snapping together.

Think of it like a factory assembly line. Worker 1 cuts the wood, passes it to Worker 2 who sands it, who passes it to Worker 3 who paints it. Each worker is a command. The conveyor belt between them is the pipe.

Pipes (|) — The Assembly Line

The pipe takes whatever a command prints to the screen and feeds it as input to the next command. The data flows left to right.

ls | wc -l                     # ls lists entries → wc counts the lines = number of files (ls -l would add a "total" line)
cat file.txt | grep "error"    # cat shows file → grep filters only "error" lines
ps aux | grep "nginx"          # ps shows all processes → grep finds "nginx"
cat names.txt | sort | uniq  # show → sort alphabetically → remove duplicates
Reading a pipe chain

cat access.log | grep "404" | wc -l

Read it like: "Show the log file, then keep only lines with '404', then count how many." Result: number of 404 errors.
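
The same chain can be tried on sample data. A minimal sketch, assuming a made-up three-column access log (method, path, status); the file contents and variable names are illustrative only:

```shell
# Sketch: count 404 responses in a made-up access log.
log=$(mktemp)
cat > "$log" <<'EOF'
GET /index.html 200
GET /missing.png 404
GET /about.html 200
GET /old-page 404
GET /favicon.ico 404
EOF

# Filter, then count. tr strips the padding some wc versions add.
count_404=$(cat "$log" | grep " 404$" | wc -l | tr -d ' ')

# A longer chain: take the status column, sort it, drop duplicates, join on one line.
statuses=$(cut -d' ' -f3 "$log" | sort | uniq | paste -sd ' ' -)

echo "404 count: $count_404"       # 404 count: 3
echo "statuses seen: $statuses"    # statuses seen: 200 404
rm -f "$log"
```

Anchoring the pattern as " 404$" keeps a path such as /page404 from being counted by accident; the plain "404" pattern in the text would match it.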

Redirection — Saving Output to Files

Why? By default, commands print output to your screen (called "stdout"). But what if you want to save it? Redirection lets you reroute the output into a file instead of the screen. Like redirecting a river into a canal.

Operator | What it does | Analogy | Example
> | Write to file (overwrite) | Erase the whiteboard and write new text | ls > files.txt
>> | Write to file (append) | Add new text at the bottom of the whiteboard | echo "line" >> log.txt
< | Read input from file | Read from a script instead of typing live | sort < names.txt
2> | Redirect error messages | Send complaints to a separate box | cmd 2> errors.txt
The #1 exam trap: > vs >>
> DESTROYS whatever was in the file and writes fresh. >> ADDS to the end without touching existing content. If you use > on an important file by mistake, the old data is gone forever. This distinction is asked in almost every Unix exam.
Question

What does cat file1.txt file2.txt | sort | uniq > result.txt do?

  1. Sorts file1 and saves to result.txt, ignoring file2
  2. Combines both files, sorts all lines, removes duplicates, saves to result.txt
  3. Counts unique lines in both files
  4. Compares file1 and file2 for differences
cat concatenates both files into one stream. sort alphabetizes all lines. uniq removes consecutive duplicate lines. > saves the final result to result.txt (overwriting it if it existed). This is a classic Unix pipeline: three small tools plus a redirect doing one big job.
Question

You run echo "hello" > greet.txt three times. What does greet.txt contain?

  1. hello hello hello
  2. hello (three lines)
  3. hello (one line)
  4. The file is empty
> overwrites every time. Each run erases the file and writes "hello" fresh. After three runs, you still just have one "hello". If you wanted three lines, you'd use >> (append).
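
The difference is easy to verify on scratch files. A minimal sketch, with file names made up for the demo, that also shows 2> catching an error message:

```shell
# Sketch: overwrite vs append, plus stderr redirection, on scratch files.
dir=$(mktemp -d)

echo "hello" >  "$dir/over.txt"
echo "hello" >  "$dir/over.txt"
echo "hello" >  "$dir/over.txt"     # each > truncates first: still 1 line

echo "hello" >> "$dir/app.txt"
echo "hello" >> "$dir/app.txt"
echo "hello" >> "$dir/app.txt"      # each >> adds a line: 3 lines

over_lines=$(wc -l < "$dir/over.txt" | tr -d ' ')
app_lines=$(wc -l < "$dir/app.txt" | tr -d ' ')

# ls on a missing file prints nothing to stdout and a complaint to stderr.
ls "$dir/no-such-file" > "$dir/out.txt" 2> "$dir/err.txt"
if [ -s "$dir/out.txt" ]; then out_has_text=yes; else out_has_text=no; fi
if [ -s "$dir/err.txt" ]; then err_has_text=yes; else err_has_text=no; fi

echo "overwrite: $over_lines line, append: $app_lines lines"
echo "stdout captured: $out_has_text, stderr captured: $err_has_text"
rm -rf "$dir"
```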

Process Management

Why learn this? When your Java app hangs on a server, you can't just click the X button — there's no GUI. You need to find the frozen process and kill it from the terminal. Process management is how you become the Task Manager of a Linux server.

Think of processes like apps running on your phone. You can see which ones are running (ps), watch them in real-time (top), close them nicely (kill), or force-close a frozen one (kill -9).

Every running program is a process. Each process gets a unique number called a PID (Process ID) — like an employee ID badge. You use the PID to target a specific process.

Viewing Processes

ps               # show YOUR processes only
ps aux           # show ALL processes, ALL users (the one you'll use most)
top              # live dashboard, refreshes every few seconds (press q to quit)

Killing Processes

Unix uses "signals" to communicate with processes. Think of it as sending a message:

kill 1234        # SIGTERM — "please shut down gracefully" (process CAN refuse)
kill -9 1234     # SIGKILL — "you're done NOW" (process CANNOT refuse)
Analogy: Asking someone to leave vs. calling security

kill PID = politely asking someone to leave. They can save their work and exit gracefully. They can also say "no."

kill -9 PID = calling security to physically remove them. No negotiation. No saving work. Instant termination. Use this only when regular kill doesn't work.
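
The polite version can be tried safely on a process you start yourself. The sketch below uses sleep as a stand-in for a long-running app. $! (PID of the most recent background command), wait, and kill -0 are standard shell features that this section does not otherwise cover; they are used here only so the demo can check its own result.

```shell
# Sketch: start a background process, ask it to stop, and inspect the result.
sleep 60 &            # stand-in for a long-running app
pid=$!                # $! holds the PID of the last background command

kill "$pid"           # default signal is SIGTERM (15)
wait "$pid"           # collect the exit status of that process
status=$?

# The shell reports "killed by signal N" as 128 + N, so SIGTERM shows up as 143.
echo "exit status: $status"

# kill -0 sends no signal; it only checks whether the PID still exists.
if kill -0 "$pid" 2>/dev/null; then still_running=yes; else still_running=no; fi
echo "still running: $still_running"
```

If a real process ignored the SIGTERM, the status check would never come back on its own, and that is the point where kill -9 becomes the fallback.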

Background and Foreground

./script.sh &    # run in background — terminal stays free for other commands
Ctrl+C           # terminate the currently running foreground process
Ctrl+Z           # suspend (pause) the foreground process — it's still alive but frozen
bg               # resume a suspended process in the background
fg               # bring a background process to the foreground
💡 Exam tip
kill -9 = SIGKILL = force kill. This is the answer to "how do you force kill a process?" Also remember: & at the end = run in background.
Question

A process with PID 5678 is frozen and not responding to kill 5678. What should you do?

  1. kill -15 5678
  2. kill -9 5678
  3. stop 5678
  4. rm 5678
kill -9 sends SIGKILL, which the process cannot catch, ignore, or refuse. It's the "nuclear option" — the OS forcefully terminates the process. Regular kill sends SIGTERM (signal 15), which a frozen process might ignore.

tar and gzip

Why? You know how you right-click a folder in Windows and "Send to > Compressed (zipped) folder"? Unix has two separate steps for this: tar bundles files together (like putting papers in a folder), and gzip compresses the bundle (like vacuum-sealing the folder to make it smaller). Usually you do both at once.

tar = Tape ARchive. It was originally for saving files to tape drives. Today it bundles files into one archive.

tar -cvf archive.tar dir/      # Bundle dir/ into archive.tar (no compression)
tar -xvf archive.tar           # Unpack archive.tar
tar -cvzf archive.tar.gz dir/  # Bundle AND compress (z = gzip)
tar -xvzf archive.tar.gz       # Decompress AND unpack
gzip file.txt                  # compress single file → file.txt.gz (original deleted!)
gunzip file.txt.gz             # decompress → file.txt
tar flags — spell them out
c = Create | x = eXtract | v = Verbose (show progress) | f = File (always last, followed by filename) | z = gZip compression
💡 Memory trick
Creating: cvzf = "Create, Verbose, Zip, File" → tar -cvzf name.tar.gz folder/
Extracting: xvzf = "eXtract, Verbose, unZip, File" → tar -xvzf name.tar.gz
The only letter that changes is c (create) vs x (extract). Everything else stays the same.
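
A round trip makes the create/extract symmetry concrete. This is a sketch on scratch data: the directory and file names are invented, v is dropped to keep the output quiet, and two flags not covered above are used as labeled extras (-t lists an archive's contents, -C switches directory before tar acts).

```shell
# Sketch: bundle a scratch directory, unpack it somewhere else, compare.
work=$(mktemp -d)
mkdir -p "$work/project" "$work/restore"
echo "alpha" > "$work/project/a.txt"
echo "beta"  > "$work/project/b.txt"

# Create: -C makes the archive store "project/..." instead of the full temp path.
tar -czf "$work/project.tar.gz" -C "$work" project

# List what went in (t = list), then extract into a different directory.
tar -tzf "$work/project.tar.gz"
tar -xzf "$work/project.tar.gz" -C "$work/restore"

restored=$(cat "$work/restore/project/a.txt" "$work/restore/project/b.txt" | paste -sd ' ' -)
echo "$restored"      # alpha beta
rm -rf "$work"
```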

sed — Stream Editor

Why does sed exist? You need to replace "http" with "https" in 500 config files. Or delete all blank lines from a log. Or change every occurrence of a username. You're NOT going to open each file and Ctrl+H manually. sed is the "Find and Replace" of the terminal — it processes text automatically, line by line, at machine speed.

Think of sed like a factory worker on an assembly line. Each line of the file rolls past on a conveyor belt. The worker applies one rule (replace this, delete that), and the modified line comes out the other end. The worker never sees the whole file at once — just one line at a time.

Substitution — The Core Operation

The most common sed command is s/old/new/ — substitute "old" with "new":

sed 's/old/new/' file.txt      # replace FIRST "old" on each line
sed 's/old/new/g' file.txt     # replace ALL "old" on each line (g = global)
sed -i 's/foo/bar/g' file.txt  # actually MODIFY the file (-i = in-place)
The /g trap — EXAM FAVORITE

Without /g: only the FIRST match on each line is replaced. If a line says "error error error", only the first "error" changes.

With /g: ALL matches on the line are replaced. All three "error"s change.

Also: without -i, sed just PRINTS the result — it does NOT modify the file. With -i, it edits the file directly.
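
Both traps can be confirmed in a few lines. A minimal sketch using a made-up line of text and a scratch file:

```shell
# Sketch: first-match vs global replacement, and proof that plain sed leaves the file alone.
line="error error error"

first_only=$(echo "$line" | sed 's/error/warning/')     # warning error error
all=$(echo "$line" | sed 's/error/warning/g')           # warning warning warning

f=$(mktemp)
echo "$line" > "$f"
sed 's/error/warning/g' "$f" > /dev/null    # no -i: result is only printed
unchanged=$(cat "$f")                       # error error error

printf '%s\n' "$first_only" "$all" "$unchanged"
rm -f "$f"
```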

Deleting and Printing Lines

sed '3d' file.txt              # delete line 3
sed '2,5d' file.txt            # delete lines 2 through 5
sed '/pattern/d' file.txt     # delete lines containing "pattern"
sed -n '3p' file.txt           # print ONLY line 3 (-n suppresses all other output)
sed -n '2,4p' file.txt         # print only lines 2 to 4
sed 's/[0-9]//g' file.txt     # remove all digits (replace each digit with nothing)
💡 Exam tip
Three things to remember: s/old/new/ = substitute. d = delete. -n with p = print only specific lines. Add /g for global (all matches). Add -i to edit the actual file.
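
The delete and print forms are easier to remember after seeing them on a five-line input. In this sketch, sample is a hypothetical helper that prints the test lines, and paste only joins the output onto one line for display.

```shell
# Sketch: d (delete) and -n ... p (print only) on a five-line input.
sample() { printf 'one\ntwo\nthree\nfour\nfive\n'; }

del3=$(sample | sed '3d' | paste -sd ' ' -)        # one two four five
del2_4=$(sample | sed '2,4d' | paste -sd ' ' -)    # one five
only3=$(sample | sed -n '3p')                      # three
no_t=$(sample | sed '/^t/d' | paste -sd ' ' -)     # one four five (drops lines starting with t)

printf '%s\n' "$del3" "$del2_4" "$only3" "$no_t"
```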
Question

What does sed 's/error/warning/' log.txt do if a line contains "error: error occurred"?

  1. Replaces both "error" with "warning"
  2. Replaces only the first "error" with "warning"
  3. Deletes the line
  4. Does nothing — you need the -i flag
Without /g, sed only replaces the FIRST match on each line. The result would be "warning: error occurred" — the second "error" is untouched. To replace both, you'd need s/error/warning/g.
Question

Which command deletes all lines containing "DEBUG" from a file and saves the change?

  1. sed '/DEBUG/d' file.txt
  2. sed -i '/DEBUG/d' file.txt
  3. sed 's/DEBUG//g' file.txt
  4. grep -v "DEBUG" file.txt
Option 1 prints the result but doesn't save it. Option 2 uses -i (in-place) to actually modify the file. Option 3 removes the word "DEBUG" but keeps the rest of the line. Option 4 filters out the same lines but only prints them; it doesn't modify the original file.

awk — THE Most Important FA Round 2 Topic (15 marks)

FA Round 2 — 15-20 marks
awk is the #1 subjective topic in FA Round 2 Unix section. Typical questions: "Given a table/file, find the max salary", "Print employees in a department", "Calculate total/average". You MUST know awk cold.

Why does awk exist? Imagine you have a CSV file with 10,000 employee records. You need to find everyone in the "Sales" department earning over 50,000 and calculate their average salary. In Java, that's 20+ lines of code (open file, read line by line, split by comma, check conditions...). In awk, it's ONE line. awk was built specifically for this kind of "look at data in columns, filter and calculate" work.

Think of awk as a spreadsheet that runs in the terminal. It reads your file row by row, automatically splits each row into columns ($1, $2, $3...), and lets you filter rows, do math, and print results — all in a single command. It's grep (search) + cut (columns) + a calculator, all rolled into one.

Basic Syntax

# awk 'pattern {action}' file
# If pattern matches → action runs. If no pattern → runs on every line.

awk '{print}' file.txt            # print every line (like cat)
awk '{print $0}' file.txt          # same — $0 = entire line
awk '{print $1}' file.txt          # print 1st column only
awk '{print $1, $3}' file.txt     # print 1st and 3rd columns
awk '{print $NF}' file.txt         # print LAST column (NF = number of fields)

Field Variables — Memorize These

Variable | Meaning | Example
$0 | Entire line | awk '{print $0}' (prints full line)
$1, $2, $3... | 1st, 2nd, 3rd field (column) | awk '{print $2}' (prints 2nd column)
$NF | Last field | awk '{print $NF}' (prints last column)
NF | Number of fields in current line | awk '{print NF}' (how many columns)
NR | Current line number (record number) | awk '{print NR, $0}' (numbered lines)
FS | Field separator (default: space/tab) | awk -F: '{print $1}' (use : as separator)
OFS | Output field separator | awk -v OFS="," '{print $1,$2}'
RS | Record separator (default: newline) | Each line is one record
Key — $NF vs NF
NF = the NUMBER of fields (e.g., 5). $NF = the VALUE of the last field. $(NF-1) = second-to-last field. This distinction gets asked.
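
A one-line input is enough to see the three forms apart. A minimal sketch with made-up words as fields:

```shell
# Sketch: NF vs $NF vs $(NF-1) on a single line of five fields.
line="alpha beta gamma delta epsilon"

count=$(echo "$line" | awk '{print NF}')              # 5       (how many fields)
last=$(echo "$line" | awk '{print $NF}')              # epsilon (value of field 5)
second_last=$(echo "$line" | awk '{print $(NF-1)}')   # delta   (value of field 4)

# The same works with a custom separator.
csv=$(echo "a,b,c" | awk -F',' '{print NF, $NF}')     # 3 c

printf '%s\n' "$count" "$last" "$second_last" "$csv"
```

The parentheses in $(NF-1) matter: $NF-1 would mean "the last field, minus 1".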

Field Separator (-F flag)

# Default separator is space/tab. Use -F to change it.

awk -F':' '{print $1}' /etc/passwd   # colon-separated (like /etc/passwd)
awk -F',' '{print $1,$2}' data.csv   # comma-separated (CSV file)
awk -F'\t' '{print $2}' file.tsv    # tab-separated
awk -F'|' '{print $3}' file.txt     # pipe-separated

Pattern Matching — Filter Lines

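# Only process lines that match a condition
# Watch out: a text header such as "Salary" is compared as a string, so it can pass a test like $3 > 50000.
# If the file has a header row, combine the condition with NR > 1 (for example: NR > 1 && $3 > 50000).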

awk '/pattern/ {print}' file.txt        # lines containing "pattern"
awk '$3 > 50000 {print $1, $3}' emp.txt # salary (col3) > 50000
awk '$2 == "IT" {print $1}' emp.txt      # department (col2) is IT
awk 'NR >= 2 {print}' file.txt           # skip header (line 1), print rest
awk 'NR == 3 {print}' file.txt           # print only line 3
awk '$1 != "Name" {print}' file.txt     # skip lines where col1 is "Name"
awk '$3 > 30000 && $2 == "HR"' emp.txt  # multiple conditions with &&
awk '$3 > 30000 || $2 == "HR"' emp.txt  # OR condition

BEGIN and END Blocks

# BEGIN runs BEFORE processing any lines
# END runs AFTER processing all lines

awk 'BEGIN {print "=== Report ==="} {print $0} END {print "=== Done ==="}' file.txt

# Calculate total salary
awk 'BEGIN {total=0} {total += $3} END {print "Total:", total}' emp.txt

# Count lines
awk 'END {print NR, "lines"}' file.txt

# Calculate average salary (NR > 1 keeps the header row out of the count)
awk 'BEGIN {sum=0; count=0} NR > 1 {sum += $3; count++} END {print "Avg:", sum/count}' emp.txt
FA Round 2 — This WILL be asked
"Given a file with employee data, calculate the total/average salary" — use BEGIN to initialize variables, main block to accumulate, END to print result. This is the most common awk question pattern.

FA Exam Pattern: Employee Table Questions

Assume this file emp.txt (space-separated):

Name    Dept    Salary
Darshan IT      45000
Rahul   HR      38000
Priya   IT      52000
Amit    Finance 41000
Sneha   HR      47000
Kiran   IT      55000

Q1: Print all employee names and salaries

awk 'NR > 1 {print $1, $3}' emp.txt
# NR > 1 skips the header row
# Output: Darshan 45000, Rahul 38000, ...

Q2: Print employees in IT department

awk '$2 == "IT" {print $1, $3}' emp.txt
# Output: Darshan 45000, Priya 52000, Kiran 55000

Q3: Find the maximum salary

awk 'NR > 1 {if ($3 > max) {max = $3; name = $1}} END {print name, max}' emp.txt
# Output: Kiran 55000

Q4: Find the minimum salary

awk 'NR == 2 {min = $3; name = $1} NR > 2 {if ($3 < min) {min = $3; name = $1}} END {print name, min}' emp.txt
# Initialize min with first data row (NR==2), then compare rest
# Output: Rahul 38000

Q5: Calculate total salary

awk 'NR > 1 {total += $3} END {print "Total:", total}' emp.txt
# Output: Total: 278000

Q6: Calculate average salary

awk 'NR > 1 {sum += $3; count++} END {print "Average:", sum/count}' emp.txt
# Output: Average: 46333.3

Q7: Count employees per department

awk 'NR > 1 {dept[$2]++} END {for (d in dept) print d, dept[d]}' emp.txt
# Output: IT 3, HR 2, Finance 1 (for-in order is not guaranteed, so the rows may come out in a different order)
# dept[$2]++ creates an associative array (like a HashMap) counting by department

Q8: Print employees with salary above average

# One read of the file: store names and salaries in arrays, compute the average in END, then filter
awk 'NR > 1 {sum += $3; count++; names[NR]=$1; sals[NR]=$3} END {avg=sum/count; for (i in names) if (sals[i] > avg) print names[i], sals[i]}' emp.txt

Q9: Print employees sorted by salary (with sort pipe)

awk 'NR > 1 {print $3, $1}' emp.txt | sort -n
# Output: 38000 Rahul, 41000 Amit, 45000 Darshan, ...

Q10: Department-wise total salary

awk 'NR > 1 {dept[$2] += $3} END {for (d in dept) print d, dept[d]}' emp.txt
# Output: IT 152000, HR 85000, Finance 41000 (row order may vary)
Key Patterns to Memorize
  • Skip header: NR > 1
  • Filter by column: $2 == "value"
  • Sum a column: {total += $3} END {print total}
  • Average: {sum += $3; count++} END {print sum/count}
  • Max: {if ($3 > max) max = $3} END {print max}
  • Count per group: {arr[$2]++} END {for (k in arr) print k, arr[k]}
  • Sum per group: {arr[$2] += $3} END {for (k in arr) print k, arr[k]}

awk with printf — Formatted Output

# printf gives formatted output (like C's printf)
awk 'NR > 1 {printf "%-10s %-8s %d\n", $1, $2, $3}' emp.txt
# %-10s = left-aligned string, 10 chars wide
# %d = integer, %f = float, %s = string
# \n = newline (printf doesn't add one automatically)

# Print salary with 2 decimal places
awk 'NR > 1 {printf "%s: %.2f\n", $1, $3}' emp.txt

awk — if/else and Loops

# if/else inside awk
awk 'NR > 1 {if ($3 > 45000) print $1, "HIGH"; else print $1, "LOW"}' emp.txt

# Ternary operator
awk 'NR > 1 {print $1, ($3 > 45000 ? "HIGH" : "LOW")}' emp.txt

# for loop — print each field on a line
awk '{for (i = 1; i <= NF; i++) print $i}' file.txt

awk with Pipes — Combining Commands

# awk works beautifully with pipes

# Find top 3 salaries
awk 'NR > 1 {print $3, $1}' emp.txt | sort -rn | head -3

# Count unique departments
awk 'NR > 1 {print $2}' emp.txt | sort -u | wc -l

# grep + awk combo — find IT employees and their salaries
grep "IT" emp.txt | awk '{print $1, $3}'

# Process command output
ls -l | awk '{print $9, $5}'  # filename and size from ls -l
df -h | awk 'NR > 1 {print $1, $5}'  # disk usage %
FA Exam — How Questions Look
The exam gives you a file content (like emp.txt above) and asks:
"Write an awk command to find the employee with the highest salary"
"Write an awk command to calculate total salary of IT department"
"Write an awk command to print names of employees earning more than 40000"
All of these follow the patterns above. Practice with the emp.txt examples until they're automatic.
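
The three sample questions above can be answered with the same patterns. A sketch, assuming the emp.txt layout shown earlier (recreated here in a temporary file so the block runs on its own); the variable names are illustrative:

```shell
# Sketch: worked answers to the three sample questions, on a copy of emp.txt.
emp=$(mktemp)
cat > "$emp" <<'EOF'
Name    Dept    Salary
Darshan IT      45000
Rahul   HR      38000
Priya   IT      52000
Amit    Finance 41000
Sneha   HR      47000
Kiran   IT      55000
EOF

# Employee with the highest salary: track the running max and its owner.
highest=$(awk 'NR > 1 && $3 > max {max = $3; name = $1} END {print name, max}' "$emp")

# Total salary of the IT department: filter on column 2, sum column 3.
it_total=$(awk 'NR > 1 && $2 == "IT" {total += $3} END {print total}' "$emp")

# Names of employees earning more than 40000 (paste joins them on one line).
over_40k=$(awk 'NR > 1 && $3 > 40000 {print $1}' "$emp" | paste -sd ' ' -)

echo "$highest"     # Kiran 55000
echo "$it_total"    # 152000
echo "$over_40k"    # Darshan Priya Amit Sneha Kiran
rm -f "$emp"
```

Every answer starts with NR > 1 so the header row never takes part in a comparison or a sum.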

Quick Reference — awk Cheat Sheet

Task | Command
Print column 1 | awk '{print $1}' file
Print last column | awk '{print $NF}' file
Custom separator | awk -F',' '{print $1}' file
Filter rows | awk '$3 > 100' file
String match | awk '$2 == "IT"' file
Pattern match | awk '/regex/' file
Skip header | awk 'NR > 1' file
Line numbers | awk '{print NR, $0}' file
Sum column | awk '{s += $3} END {print s}' file
Average | awk '{s+=$3;c++} END {print s/c}' file
Max value | awk '{if($3>m)m=$3} END {print m}' file
Count per group | awk '{a[$2]++} END {for(k in a) print k,a[k]}' file
Sum per group | awk '{a[$2]+=$3} END {for(k in a) print k,a[k]}' file
Formatted output | awk '{printf "%-10s %d\n",$1,$3}' file

Shell Scripting Basics

Why write shell scripts? You've been typing commands one at a time. But what if you need to run the same 10 commands every morning? Or automate a backup every night? A shell script is just a text file full of commands that run automatically, one after another. It's like recording a macro in Excel — write the steps once, run them whenever you want.

Think of a shell script as a recipe card. The recipe lists steps in order. When you "run" the recipe (execute the script), the kitchen (the shell) follows each step from top to bottom.

The Shebang — Every Script Starts Here

For a bash script, the very first line should be:

#!/bin/bash

This is called the shebang (or hashbang). It tells the OS: "Use the bash shell to run this script." Without it, the system might not know how to interpret your file. It's like writing "Language: English" at the top of a document — so the reader knows how to read it.

Variables — Storing Values

Variables in bash are simple but have one critical gotcha:

# CORRECT — no spaces around =
NAME="Darshan"
echo "Hello $NAME"    # prints: Hello Darshan

# WRONG — spaces around = causes an error!
# NAME = "Darshan"     ← this FAILS. Bash thinks NAME is a command.
The #1 shell scripting mistake
NAME="Darshan" works. NAME = "Darshan" FAILS. No spaces around the = sign. Most other programming languages allow spaces there, so this trips up almost everyone.
# Read user input
read -p "Enter name: " USERNAME
echo "You entered: $USERNAME"

if-then-else — Making Decisions

Bash's if-else looks different from Java/Python. The structure is: if [ condition ]; then ... elif ... else ... fi. Note: fi is "if" spelled backwards — it closes the if block.

AGE=20    # sample value so the snippet runs as-is
if [ "$AGE" -gt 18 ]; then
    echo "Adult"
elif [ "$AGE" -eq 18 ]; then
    echo "Just 18"
else
    echo "Minor"
fi
Why -gt instead of > ?
In bash, > means "redirect to file" (remember the redirection section?). So bash uses word-based operators for number comparisons, and each one is an abbreviation: -gt (greater than), -lt (less than), -eq (equal), -ne (not equal), -ge (greater-or-equal), -le (less-or-equal).

Loops — Repeating Actions

# for loop — iterate over a list
for i in 1 2 3 4 5; do
    echo "$i"
done

# while loop — repeat while condition is true
COUNT=1
while [ $COUNT -le 5 ]; do
    echo "$COUNT"
    COUNT=$((COUNT + 1))   # $(( )) does arithmetic in bash
done
💡 Key syntax
$(( )) is how bash does math. $((5 + 3)) = 8. Without it, bash treats numbers as text. Also: loops end with done, not a closing brace.
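
Loops and $(( )) usually appear together. The sketch below sums a list with a for loop and computes a factorial with a while loop; the variable names are illustrative.

```shell
# Sketch: accumulate with $(( )) inside both loop styles.
sum=0
for i in 1 2 3 4 5; do
    sum=$((sum + i))          # inside $(( )), variables don't need a $
done

n=5
fact=1
while [ "$n" -gt 1 ]; do
    fact=$((fact * n))
    n=$((n - 1))
done

as_text="5 + 3"               # plain assignment: stays the literal text "5 + 3"
as_math=$((5 + 3))            # arithmetic expansion: becomes 8

echo "sum=$sum fact=$fact"    # sum=15 fact=120
echo "$as_text vs $as_math"   # 5 + 3 vs 8
```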

case Statement — The Switch of Bash

Like Java's switch. Pattern matches a value and runs the matching block. ;; is like break. *) is the default case.

case $1 in
    start)  echo "Starting..." ;;
    stop)   echo "Stopping..." ;;
    *)      echo "Usage: $0 {start|stop}" ;;
esac

esac is "case" spelled backwards — it closes the case block. (fi closes if, esac closes case.)
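
Wrapping the case block in a function makes it easy to try different inputs. This is a sketch: service_action is a hypothetical name, and the restart|reload line uses | inside a pattern to match either word, a case feature not shown above.

```shell
# Sketch: the case block as a function, so each branch can be exercised.
service_action() {
    case "$1" in
        start)          echo "Starting..." ;;
        stop)           echo "Stopping..." ;;
        restart|reload) echo "Restarting..." ;;    # | means "either pattern"
        *)              echo "Usage: service_action {start|stop|restart}" ;;
    esac
}

r_start=$(service_action start)      # Starting...
r_reload=$(service_action reload)    # Restarting...
r_bad=$(service_action dance)        # falls through to the *) default

printf '%s\n' "$r_start" "$r_reload" "$r_bad"
```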

Comparison Operators

Bash uses different operators for numbers vs. strings. This is a common exam trap.

Numbers | Meaning | Strings | Meaning
-eq | equal | = | equal
-ne | not equal | != | not equal
-gt | greater than | -z | string is empty
-lt | less than | -n | string is non-empty
-ge | greater or equal | |
-le | less or equal | |
Why two systems?

For numbers: [ 5 -gt 3 ] means "is 5 greater than 3?" (yes)

For strings: [ "$NAME" = "Darshan" ] means "is NAME exactly Darshan?" (yes)

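Don't mix the two systems. = compares text character by character, so [ 05 = 5 ] is false even though [ 05 -eq 5 ] is true. And -eq on non-numeric text, such as [ "abc" -eq "abc" ], fails with an error in bash. The exam will try to trick you with this.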
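
A small sketch that runs both kinds of comparison on the same pair of values, assuming a POSIX-style shell. compare_both is a hypothetical helper written for this note.

```shell
compare_both() {
    # Numeric comparison: values are read as integers
    if [ "$1" -eq "$2" ]; then num="equal"; else num="different"; fi
    # String comparison: values are compared character by character
    if [ "$1" = "$2" ]; then str="equal"; else str="different"; fi
    echo "as numbers: $num, as strings: $str"
}

compare_both 5 5     # prints: as numbers: equal, as strings: equal
compare_both 05 5    # prints: as numbers: equal, as strings: different

# -z and -n test the length of a string
EMPTY=""
NAME="Darshan"
if [ -z "$EMPTY" ]; then echo "EMPTY has no characters"; fi
if [ -n "$NAME" ]; then echo "NAME is set to: $NAME"; fi
```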

File Test Operators

These check properties of files. Incredibly useful in scripts — "does this file exist before I try to read it?"

Operator   True if...                 Use case
-f file    Regular file exists        Check before reading
-d file    Directory exists           Check before cd-ing into it
-e file    File exists (any type)     General existence check
-r file    Readable                   Check before cat/grep
-w file    Writable                   Check before writing to it
-x file    Executable                 Check before running a script
-s file    Size > 0 (not empty)       Check if file has content
Real example: Safe script
if [ -f "data.csv" ]; then
    echo "Processing data..."
    awk -F',' 'NR > 1 {sum += $3} END {print sum}' data.csv
else
    echo "Error: data.csv not found!"
fi

Always check if a file exists before operating on it. This prevents ugly error messages.
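
Several of these operators can be combined in one check. The sketch below assumes a POSIX-style shell with mktemp available; describe_path and the scratch file names are hypothetical and exist only for the demonstration.

```shell
describe_path() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    else
        echo "other"
    fi
}

scratch=$(mktemp -d)                  # throwaway directory for the demo
: > "$scratch/empty.txt"              # create an empty file
echo "hello" > "$scratch/data.txt"    # create a file with content

describe_path "$scratch"              # prints: directory
describe_path "$scratch/empty.txt"    # prints: empty file
describe_path "$scratch/data.txt"     # prints: non-empty file
describe_path "$scratch/nope.txt"     # prints: missing

rm -rf "$scratch"                     # clean up
```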

Question

What is wrong with this script? NAME = "Darshan"; echo $NAME

  1. Missing shebang line
  2. Spaces around = in variable assignment
  3. Should use echo "$NAME" with quotes
  4. Nothing — it works fine
Spaces around = in variable assignment cause an error. Bash interprets NAME as a command name, not a variable. The correct syntax is NAME="Darshan" with NO spaces. This is the single most common shell scripting mistake.
Question

Which operator checks if a number is greater than or equal to another in bash?

  1. >=
  2. -ge
  3. -gte
  4. -greq
-ge = greater-or-equal. Bash uses -eq, -ne, -gt, -lt, -ge, -le for numeric comparisons. The >= operator doesn't work in [ ] test brackets.

Special Path Symbols

These are shortcuts for navigating the file system. Think of them as bookmarks:

Symbol   Meaning                                     Real-world analogy
~        Home directory (/home/darshan)              Your house: cd ~ always takes you home
.        Current directory                           "Right here": where you're standing now
..       Parent directory (one level up)             "Go upstairs": the folder containing this one
/        Root directory                              The ground floor: the very top of the file tree
-        Previous directory (where you just were)    "Go back": like the back button in a browser
💡 Exam tip
~ = home directory is the most commonly asked. Also remember: cd .. goes up one level, cd ../.. goes up two levels.
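
The following sketch walks through .., ../.., . and - in a scratch directory, assuming a POSIX-style shell with mktemp. The projects/app folder names are made up for the demo, and pwd -P is used so the printed paths are fully resolved.

```shell
orig=$(pwd)                              # remember where we started
base=$(cd "$(mktemp -d)" && pwd -P)      # scratch directory (resolved path)
mkdir -p "$base/projects/app"

cd "$base/projects/app" || exit 1
cd ..                                    # up one level
one_up=$(pwd -P)

cd "$base/projects/app" || exit 1
cd ../..                                 # up two levels
two_up=$(pwd -P)

cd - > /dev/null                         # back to the previous directory
back=$(pwd -P)

cd .                                     # . is "right here", so nothing changes
same=$(pwd -P)

echo "one up:   $one_up"                 # ends in /projects
echo "two up:   $two_up"                 # the scratch directory itself
echo "cd -:     $back"                   # ends in /projects/app
echo "cd .:     $same"                   # same as the line above

cd "$orig" || exit 1                     # return to the starting directory
rm -rf "$base"                           # clean up
```

Note that - only has this meaning as an argument to cd; cd - also prints the directory it switches to, which is why the output is sent to /dev/null here.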

crontab — Scheduling

Why? You wrote a backup script. Great. Now who runs it at 2 AM every night? You're not setting an alarm to wake up and type the command. crontab is the alarm clock of Unix — it runs commands automatically on a schedule, whether you're asleep, on vacation, or doing anything else.

The format has 5 time fields followed by the command to run:

# ┌───── minute (0-59)
# │ ┌───── hour (0-23)
# │ │ ┌───── day of month (1-31)
# │ │ │ ┌───── month (1-12)
# │ │ │ │ ┌───── day of week (0-6, 0=Sunday)
# │ │ │ │ │
# * * * * * command

0 2 * * * /home/darshan/backup.sh    # 2:00 AM, every day
*/5 * * * * /scripts/check.sh        # every 5 minutes
0 9 * * 1 /scripts/weekly.sh         # 9:00 AM every Monday (1=Monday)
30 18 1 * * /scripts/monthly.sh      # 6:30 PM on the 1st of every month
💡 Reading crontab
* = "every" (every minute, every hour, etc.). */5 = "every 5th." Read left to right: minute, hour, day, month, weekday. So 0 9 * * 1 = "at minute 0, hour 9, any day, any month, on Monday."
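
A rough sketch of how the five fields line up, assuming a POSIX-style shell. explain_cron is a hypothetical helper written for this note, not a real cron tool: it only labels the fields and does not validate or schedule anything.

```shell
explain_cron() {
    # read splits the expression on spaces into the five time fields;
    # a here-document is used so that * is not expanded into file names
    read -r minute hour dom month dow <<EOF
$1
EOF
    echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
}

explain_cron "0 9 * * 1"     # prints: minute=0 hour=9 day-of-month=* month=* day-of-week=1
explain_cron "*/5 * * * *"   # prints: minute=*/5 hour=* day-of-month=* month=* day-of-week=*
```

To install a real schedule you would add the line to your crontab with crontab -e and list the current entries with crontab -l.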
Question

What does the cron expression 0 0 * * 0 mean?

  1. Every minute on Sunday
  2. Midnight every day
  3. Midnight every Sunday
  4. Noon every Saturday
Minute=0, Hour=0 (midnight), Day=* (any), Month=* (any), Weekday=0 (Sunday). So it runs at midnight every Sunday. Remember: 0 in the weekday field = Sunday.

Practice Questions — Unix

Question 1
Which command lists all files including hidden files?
  1. ls -l
  2. ls -a
  3. ls -R
  4. ls -h
✅ 2) ls -a shows ALL files including hidden ones (names starting with a dot).
Question 2
What permissions does chmod 755 assign?
  1. rwxrwxrwx
  2. rw-r--r--
  3. rwxr-xr-x
  4. rwxrwxr-x
✅ 3) rwxr-xr-x. Owner=7(rwx), Group=5(r-x), Others=5(r-x).
Question 3
Which command counts words in a file?
  1. wc -l file
  2. wc -c file
  3. wc -w file
  4. count file
✅ 3) wc -w counts words. -l = lines, -c = bytes.
Question 4
What does ~ represent?
  1. Root directory
  2. Current directory
  3. Parent directory
  4. Home directory
✅ 4) Home directory of the current user. Root = /, Current = ., Parent = ..
Question 5
How to force kill process with PID 4521?
  1. kill 4521
  2. kill -15 4521
  3. kill -9 4521
  4. stop 4521
✅ 3) kill -9 sends SIGKILL: a force kill that the process cannot catch or ignore.
Question 6
What is the numeric value of rw-r--r--?
  1. 755
  2. 644
  3. 600
  4. 666
✅ 2) 644. Owner: rw-=6, Group: r--=4, Others: r--=4.
Question 7
What does ls | wc -l do?
  1. Lists files and counts characters
  2. Counts files in current directory
  3. Counts words in a file named ls
  4. Lists files sorted by count
✅ 2) When piped, ls prints one name per line, and wc -l counts those lines. The result is the number of non-hidden entries (files and directories) in the current directory.
Question 8
Which grep flag makes search case-insensitive?
  1. grep -n
  2. grep -v
  3. grep -i
  4. grep -c
✅ 3) -i = ignore case.
Question 9
Difference between > and >>?
  1. > appends; >> overwrites
  2. > reads; >> writes
  3. > overwrites; >> appends
  4. Both do the same
✅ 3) > overwrites the file. >> appends to the end.
Question 10
Output of: x=5; if [ $x -gt 3 ]; then echo "yes"; else echo "no"; fi
  1. no
  2. yes
  3. 5
  4. Error
✅ 2) yes. 5 > 3 is true, so "yes" is printed.
Question 11
Which searches recursively for "password" in /etc?
  1. grep "password" /etc
  2. grep -r "password" /etc
  3. find /etc -name "password"
  4. grep -n "password" /etc
✅ 2) grep -r = recursive grep: it searches every file under the directory, including subdirectories.
Question 12
Which directory stores temporary files cleared on reboot?
  1. /var
  2. /etc
  3. /tmp
  4. /home
✅ 3) /tmp stores temporary files cleared on reboot.
Question 13
Output of: for i in 1 2 3; do echo $i; done
  1. 1 2 3 (one line)
  2. 1, 2, 3 on separate lines
  3. i i i
  4. Error
✅ 2) Each echo prints on a new line: 1, 2, 3 on separate lines.
Question 14
Which creates a gzip-compressed tar archive of project/?
  1. tar -xvf project.tar.gz project/
  2. tar -cvzf project.tar.gz project/
  3. gzip project/
  4. tar -tvf project.tar.gz
✅ 2) tar -cvzf: c=Create, v=Verbose, z=gzip, f=File.
Question 15
Permissions -rwxr-x--- in octal?
  1. 777
  2. 754
  3. 750
  4. 755
✅ 3) 750. Owner: rwx=7, Group: r-x=5, Others: ---=0.
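
The permission math in Questions 2, 6 and 15 can be checked mechanically. This is only a sketch, assuming a POSIX-style shell with cut available; perm_to_octal is a hypothetical helper, and it expects the nine permission characters without the leading file-type character.

```shell
perm_to_octal() {
    perms=$1
    result=""
    while [ -n "$perms" ]; do
        triplet=$(printf '%s' "$perms" | cut -c1-3)   # next group: owner, group, others
        perms=$(printf '%s' "$perms" | cut -c4-)      # whatever is left after it
        digit=0
        case "$triplet" in r??) digit=$((digit + 4)) ;; esac   # r = 4
        case "$triplet" in ?w?) digit=$((digit + 2)) ;; esac   # w = 2
        case "$triplet" in ??x) digit=$((digit + 1)) ;; esac   # x = 1
        result="$result$digit"
    done
    echo "$result"
}

perm_to_octal rwxr-xr-x   # prints: 755
perm_to_octal rw-r--r--   # prints: 644
perm_to_octal rwxr-x---   # prints: 750
```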
HackerRank Gotchas — Unix Questions
  • Case-insensitive matching: use tolower() in AWK, or IGNORECASE=1 if the judge runs GNU awk (gawk); other awk versions ignore it. Almost every question needs this
  • Exact output format: "Total Salary = 135000" must match character for character. Copy the format from the problem
  • "Not found" message: if no rows match, you must print the fallback message (e.g., "No Employee Found")
  • Command-line argument: filename comes as $1 when the script runs. Use it, don't hardcode a filename
  • Skip header row: CSV files usually have a header. Use NR > 1 in AWK to skip it

Real FA Subjective Question — Walkthrough

This is an actual FA Round 2 practice question from HackerRank.

Problem: Total Salary by Division

Employee data is in a CSV file: EmpName,Role,Division,Salary

Calculate the total salary of employees in the "Sales" division (case-insensitive). Print Total Salary = [sum]. If no match, print No Employee Found.

File name is passed as command-line argument $1.

Solution:

awk -F"," '
BEGIN { total = 0; found = 0 }
NR > 1 && tolower($3) == "sales" {      # skip header, case-insensitive match
    total += $4;                           # $4 = Salary column
    found = 1
}
END {
    if (found)
        print "Total Salary = " total      # exact format from problem
    else
        print "No Employee Found"          # exact fallback message
}
' "$1"                                   # $1 = filename from command-line (quoted so names with spaces work)
Key details in this solution
  • -F"," — sets comma as field separator for CSV
  • NR > 1 — skips the header row (line 1)
  • tolower($3) == "sales" — case-insensitive comparison on Division column
  • found flag — tracks whether any match was found, for the "No Employee Found" case
  • $1 at the end — reads the file whose name was passed as argument
  • Exact output strings — copied from the problem statement, not typed from memory
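
Before submitting, it helps to run the solution against a small file of your own. The sketch below assumes a POSIX-style shell with awk and mktemp; it wraps the same awk logic in a hypothetical function, total_sales_salary, so that "$1" is the function's argument, and the employee rows are invented sample data.

```shell
total_sales_salary() {
    awk -F"," '
    NR > 1 && tolower($3) == "sales" {   # skip header, case-insensitive match
        total += $4
        found = 1
    }
    END {
        if (found) {
            print "Total Salary = " total
        } else {
            print "No Employee Found"
        }
    }
    ' "$1"
}

sample=$(mktemp)                         # temporary CSV with made-up rows
cat > "$sample" <<'EOF'
EmpName,Role,Division,Salary
Asha,Manager,Sales,60000
Ravi,Engineer,IT,50000
Meena,Executive,SALES,75000
EOF

total_sales_salary "$sample"             # prints: Total Salary = 135000

# A file with no Sales rows exercises the fallback message
printf 'EmpName,Role,Division,Salary\nRavi,Engineer,IT,50000\n' > "$sample"
total_sales_salary "$sample"             # prints: No Employee Found

rm -f "$sample"                          # clean up
```

Testing both the match and the no-match case catches the two most common reasons a submission fails hidden test cases.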
FA Subjective Practice
Practice more questions like this on HackerRank:
→ HackerRank FA Subjective Mock Practice