6. Theory of Computation and Computer Graphics (ACtE06)
6.1 Introduction to Finite Automata
Finite Automata (FA) and Finite State Machines (FSM) are mathematical models of computation used to design algorithms and understand the limits of computation. They consist of a finite set of states and transitions between these states triggered by input symbols.
- State: A configuration of the machine at a particular instant.
- Transition: A move from one state to another, usually triggered by reading an input symbol.
- Start State: The initial state of the automaton where processing begins.
- Final States (Accepting States): A subset of states indicating successful computation or acceptance of an input string.
DFA (Deterministic Finite Automata)
A DFA is a finite automaton where for each state and input symbol, there is exactly one transition to a next state.
Formal Definition: A DFA is a 5-tuple (Q, Σ, δ, q₀, F) where:
Qis a finite set of states.Σis a finite set of input symbols (alphabet).δ: Q × Σ → Qis the transition function, mapping a state-symbol pair to a single next state.q₀ ∈ Qis the start state.F ⊆ Qis the set of final (accepting) states.
Example: A DFA that accepts all strings over {a, b} ending with 'ab'.
A state diagram would show states like q0 (start), q1 (seen 'a'), q2 (seen 'ab', final). Transitions would be defined for each state and input symbol, ensuring determinism.
NFA (Non-deterministic Finite Automata)
An NFA is a finite automaton where for each state and input symbol, there can be zero, one, or multiple transitions to next states. It also allows ε-transitions (transitions without consuming an input symbol).
Formal Definition: An NFA is a 5-tuple (Q, Σ, δ, q₀, F) where:
Qis a finite set of states.Σis a finite set of input symbols (alphabet).δ: Q × (Σ ∪ {ε}) → P(Q)is the transition function, mapping a state-symbol pair (or state-epsilon) to a set of possible next states.P(Q)denotes the power set ofQ.q₀ ∈ Qis the start state.F ⊆ Qis the set of final (accepting) states.
Example: An NFA that accepts all strings over {a, b} containing 'ab' as a substring. An NFA can guess when 'ab' starts, making it simpler to design than a DFA for this specific language.
Equivalence of DFA and NFA
Despite their differences, DFAs and NFAs are equivalent in their language recognition power. Any language accepted by an NFA can also be accepted by some DFA, and vice-versa.
Subset Construction Algorithm: This algorithm converts an NFA into an equivalent DFA. The states of the resulting DFA are sets of states from the original NFA.
- Start with the ε-closure of the NFA's start state as the DFA's start state.
- For each DFA state (which is a set of NFA states) and each input symbol, compute the ε-closure of all NFA states reachable from the current set of NFA states by that input symbol. This new set becomes a new DFA state.
- Repeat until no new DFA states are generated.
- Any DFA state containing at least one final state of the NFA becomes a final state in the DFA.
Minimization of Finite State Machines
Minimization aims to reduce the number of states in a DFA while preserving the language it accepts. A minimal DFA is unique (up to isomorphism).
Table Filling Algorithm (or Myhill-Nerode Theorem based):
- Create a table of all unordered pairs of states
(q_i, q_j)whereq_i ≠ q_j. - Mark all pairs
(q_i, q_j)where one is a final state and the other is not. These are distinguishable. - Iteratively mark a pair
(q_i, q_j)if for some input symbola ∈ Σ, the pair(δ(q_i, a), δ(q_j, a))is already marked. - Repeat step 3 until no new pairs can be marked.
- All unmarked pairs are equivalent. Merge these equivalent states to form the minimized DFA.
Partitioning: This method starts by partitioning states into two groups: final states and non-final states. It then refines these partitions based on transitions until no further refinement is possible.
Regular Expressions
Regular Expressions (REs) are a powerful notation for describing regular languages. They are built using a set of operators:
- Concatenation (
.or implicit):ABmeans stringAfollowed by stringB. Example:a.bmatches "ab". - Union (
+or|):A+Bmeans either stringAor stringB. Example:a+bmatches "a" or "b". - Kleene Star (
*):A*means zero or more occurrences of stringA. Example:a*matches "", "a", "aa", "aaa", etc.
Example: (a+b)*abb describes all strings over {a, b} that end with "abb".
Equivalence of Regular Expression and Finite Automata
Kleene's Theorem states that regular expressions and finite automata are equivalent in their expressive power. That is, any language described by a regular expression can be recognized by a finite automaton, and any language recognized by a finite automaton can be described by a regular expression.
- RE to NFA (Thompson's Construction): An algorithm to construct an NFA with ε-transitions for any given regular expression. It builds NFAs for basic expressions (single symbols, ε) and then recursively combines them for concatenation, union, and Kleene star.
- NFA to RE (State Elimination Method): An algorithm to convert an NFA into an equivalent regular expression by progressively eliminating states and updating the regular expressions on the remaining transitions.
Pumping Lemma for Regular Language
The Pumping Lemma for Regular Languages is a powerful tool used to prove that certain languages are NOT regular.
Statement: If L is a regular language, then there exists some integer p ≥ 1 (the pumping length) such that any string s ∈ L with |s| ≥ p can be divided into three parts s = xyz satisfying the following conditions:
|y| ≥ 1(the middle partycannot be empty).|xy| ≤ p(the combined length ofxandyis at mostp).- For all
i ≥ 0, the stringxyⁱz ∈ L(pumpingyzero or more times keeps the string in the language).
Proof Technique: Proof by contradiction. Assume the language is regular, apply the Pumping Lemma, and then find a string s in the language that violates one of the conditions when y is pumped.
Example of Non-Regular Language: The language L = {aⁿbⁿ | n ≥ 0} (strings with an equal number of 'a's followed by 'b's) is not regular. If we assume it is regular and apply the Pumping Lemma to a string s = aᵖbᵖ, pumping the y part (which must contain only 'a's or a mix of 'a's and 'b's or only 'b's) will lead to an unequal number of 'a's and 'b's, violating the language definition.
6.2 Introduction to Context-Free Language
Context-Free Languages (CFLs) are a class of languages that can be described by Context-Free Grammars (CFGs) and recognized by Pushdown Automata. They are more powerful than regular languages and are widely used in programming language syntax specification.
Introduction to Context Free Grammar (CFG)
A CFG is a formal grammar that generates all and only the strings of a context-free language. Productions in a CFG have a single non-terminal on the left-hand side.
Formal Definition: A CFG is a 4-tuple (V, T, P, S) where:
Vis a finite set of variables (non-terminals).Tis a finite set of terminals (alphabet,V ∩ T = ∅).Pis a finite set of production rules of the formA → β, whereA ∈ Vandβ ∈ (V ∪ T)*.S ∈ Vis the start symbol.
Example: A grammar for arithmetic expressions: E → E + E | E * E | (E) | id.
Derivation Trees
Derivation trees (or parse trees) graphically represent the process of deriving a string from a start symbol using the production rules of a grammar.
- Bottom-up approach: Starts from the input string and works backwards to reach the start symbol (e.g., in parsing).
- Top-down approach: Starts from the start symbol and applies production rules to derive the input string (e.g., in generation).
- Leftmost derivation: At each step, the leftmost non-terminal is replaced.
- Rightmost derivation: At each step, the rightmost non-terminal is replaced.
Language of a Grammar
The language generated by a CFG, denoted L(G), is the set of all terminal strings that can be derived from the start symbol S by applying the production rules. L(G) = {w ∈ T* | S ⇒* w}.
Parse Tree and its Construction
A parse tree is a graphical representation of a derivation. The root is the start symbol, internal nodes are non-terminals, and leaf nodes are terminals, forming the derived string from left to right. Each internal node and its children correspond to a production rule.
Construction Example: For grammar S → AB, A → a, B → b, to parse "ab":
S
/ \
A B
/ \
a b
Ambiguous Grammar
A grammar is ambiguous if there exists at least one string in its language that has more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree. Ambiguity can cause problems in parsing and semantic interpretation.
Examples: The grammar E → E + E | E * E | id is ambiguous for the string id + id * id, as it can have two different parse trees (one favoring addition, one favoring multiplication).
Removing Ambiguity: Often involves rewriting the grammar to enforce operator precedence and associativity (e.g., introducing new non-terminals for different precedence levels).
Chomsky Normal Form (CNF)
CNF is a simplified form of CFG where all production rules are of the form A → BC or A → a, where A, B, C are non-terminals and a is a terminal. B and C cannot be the start symbol.
Conversion Steps with Examples:
- Eliminate start symbol from RHS.
- Eliminate ε-productions.
- Eliminate unit productions (
A → B). - Replace terminals on RHS with non-terminals (e.g.,
A → aBbecomesA → C_a B, C_a → a). - Break down long RHS productions (e.g.,
A → BCDbecomesA → BX, X → CD).
Greibach Normal Form (GNF)
GNF is another simplified form where all production rules are of the form A → aβ, where A is a non-terminal, a is a terminal, and β is a string of zero or more non-terminals.
Conversion Steps with Examples: More complex than CNF conversion, often involves techniques like substitution and elimination of left recursion.
Backus-Naur Form (BNF)
BNF is a metasyntax used to express context-free grammars. It's widely used to describe the syntax of programming languages. It uses symbols like ::= (is defined as), | (or), and angle brackets < > for non-terminals.
Example: <expression> ::= <term> | <expression> "+" <term>.
Push Down Automata (PDA)
A PDA is a finite automaton augmented with a stack. This stack provides an infinite memory capacity, allowing PDAs to recognize CFLs.
Formal Definition: A PDA is a 7-tuple (Q, Σ, Γ, δ, q₀, Z₀, F) where:
Qis a finite set of states.Σis the input alphabet.Γis the stack alphabet.δ: Q × (Σ ∪ {ε}) × Γ → P(Q × Γ*)is the transition function. It takes a state, an input symbol (or ε), and the top stack symbol, and returns a set of (next state, string to push onto stack) pairs.q₀ ∈ Qis the start state.Z₀ ∈ Γis the initial stack symbol.F ⊆ Qis the set of final states.
Transition Function: A transition (q, a, X) → (p, Y) means that if the PDA is in state q, reads input a, and X is on top of the stack, it can move to state p and replace X with Y on the stack (Y can be ε for pop, or Z for no change, or YZ for push Y then Z).
Equivalence of Context Free Language and PDA
A language is context-free if and only if there exists a Pushdown Automaton that accepts it. This equivalence is fundamental to the theory of CFLs.
- For every CFG, there exists an equivalent PDA that accepts the language generated by the CFG.
- For every PDA, there exists an equivalent CFG that generates the language accepted by the PDA.
Pumping Lemma for Context Free Language
Similar to the regular language pumping lemma, this lemma is used to prove that a language is NOT context-free.
Statement: If L is a context-free language, then there exists some integer p ≥ 1 (the pumping length) such that any string s ∈ L with |s| ≥ p can be divided into five parts s = uvxyz satisfying the following conditions:
|vx| ≥ 1(at least one ofvorxmust be non-empty).|vxy| ≤ p(the combined length ofvxyis at mostp).- For all
i ≥ 0, the stringuvⁱxyⁱz ∈ L(pumpingvandxzero or more times keeps the string in the language).
Usage: Proof by contradiction, similar to the regular pumping lemma.
Example: The language L = {aⁿbⁿcⁿ | n ≥ 0} is not context-free. Pumping v and x in a string aᵖbᵖcᵖ will inevitably lead to an imbalance in the counts of 'a', 'b', or 'c', thus violating the language's definition.
Properties of Context Free Language
CFLs have several important closure properties:
- Closure under Union: If
L₁andL₂are CFLs, thenL₁ ∪ L₂is a CFL. - Closure under Concatenation: If
L₁andL₂are CFLs, thenL₁L₂is a CFL. - Closure under Kleene Star: If
Lis a CFL, thenL*is a CFL. - Closure under Intersection with Regular Language: If
L₁is a CFL andL₂is a regular language, thenL₁ ∩ L₂is a CFL. - Closure under Inverse Homomorphism: If
Lis a CFL, andhis a homomorphism, thenh⁻¹(L)is a CFL.
CFLs are NOT closed under intersection or complementation.
6.3 Turing Machine
The Turing Machine (TM), proposed by Alan Turing, is a mathematical model of computation that defines an abstract machine manipulating symbols on a strip of tape according to a table of rules. It is considered the most powerful model of computation, capable of simulating any algorithm.
Introduction to Turing Machines (TM)
A TM consists of an infinite tape divided into cells, a read/write head, and a finite control unit that stores the current state.
Formal Definition: A Turing Machine is a 7-tuple (Q, Σ, Γ, δ, q₀, B, F) where:
Qis a finite set of states.Σis the input alphabet (not containing the blank symbolB).Γis the tape alphabet (Σ ⊆ Γ, andB ∈ Γ).δ: Q × Γ → Q × Γ × {L, R}is the transition function. It maps a (current state, tape symbol under head) pair to a (next state, symbol to write, head movement: Left/Right).q₀ ∈ Qis the start state.B ∈ Γis the blank symbol.F ⊆ Qis the set of final (accepting) states.
Notations of Turing Machine
- State Diagram: A graphical representation using nodes for states and directed edges for transitions, labeled with
(read_symbol, write_symbol, move_direction). - Transition Table: A tabular representation of the transition function, with rows for current states and columns for tape symbols.
Acceptance of a string by Turing Machine
A TM accepts a string if, starting from the initial configuration with the string on the tape, it eventually enters an accepting state. A TM can also:
- Reject: If it halts in a non-accepting state.
- Halt: When there is no defined transition for the current state and tape symbol.
- Loop: If it never halts (neither accepts nor rejects).
Turing Machine as a Language Recognizer
A TM recognizes a language L if it halts and accepts all strings in L and halts and rejects all strings not in L. If it loops for some strings not in L, it is said to "semidecide" or "recognize" the language. If it always halts for all inputs, it "decides" the language.
Turing Machine as a Computing Function
Turing machines can compute mathematical functions. For example, a TM can be designed to take two numbers on its tape, compute their sum, and leave the result on the tape.
Turing Machine as an enumerator of strings of a language
An enumerator TM prints out all strings of a language, one by one, possibly in an infinite sequence. It does not take input but generates output.
Turing Machine with Multiple Tracks
A multi-track TM has a single tape but each cell is divided into several tracks. It is equivalent in power to a single-tape TM.
Turing Machine with Multiple Tapes
A multi-tape TM has several independent tapes, each with its own read/write head. It is also equivalent in power to a single-tape TM, though it can often perform computations more efficiently.
Non-Deterministic Turing Machines
A Non-Deterministic Turing Machine (NTM) allows for multiple possible transitions from a given state and tape symbol. NTMs are equivalent in power to deterministic TMs; any language recognized by an NTM can also be recognized by a DTM, although the DTM might take exponentially longer.
Church-Turing Thesis
The Church-Turing Thesis states that any algorithmic process that can be carried out by a human or any physical computing device can be simulated by a Turing Machine. It asserts that the TM is the most general model of computation.
Universal Turing Machine (UTM)
A Universal Turing Machine is a Turing Machine that can simulate any other arbitrary Turing Machine. It takes as input a description of a TM (encoded as a string) and an input string for that TM, and then simulates the behavior of the described TM on its input.
- Encoding of Turing Machine: A TM's components (states, alphabet, transitions) can be systematically encoded into a binary string.
- UTM Operation: The UTM reads the encoded TM and its input, then uses its own states and tape to mimic the steps of the encoded TM.
Computational Complexity
Computational complexity studies the resources (time and space) required by algorithms to solve problems.
- Time Complexity: The amount of time an algorithm takes to run as a function of the length of the input. Measured by the number of elementary operations.
- Space Complexity: The amount of memory an algorithm uses as a function of the length of the input. Measured by the number of tape cells used.
Time and Space complexity of a Turing Machine
For a Turing Machine, time complexity is typically measured by the number of steps (transitions) it takes to halt, and space complexity by the number of tape cells visited or used.
Intractability
Intractability refers to problems that cannot be solved within reasonable time bounds (e.g., polynomial time) by any known algorithm.
- P vs NP: P is the class of problems solvable in polynomial time by a deterministic TM. NP is the class of problems whose solutions can be verified in polynomial time by a deterministic TM (or solved in polynomial time by a non-deterministic TM). The P vs NP problem is a major unsolved question in computer science, asking if P = NP.
- NP-complete problems: These are the "hardest" problems in NP. If any NP-complete problem can be solved in polynomial time, then all problems in NP can be solved in polynomial time (meaning P=NP). Examples include Satisfiability (SAT), Traveling Salesperson Problem (TSP).
Reducibility (polynomial time reducibility)
A problem A is polynomial-time reducible to problem B (denoted A ≤_p B) if an instance of A can be transformed into an instance of B in polynomial time, such that solving B also solves A. This concept is crucial for classifying problems into complexity classes like NP-complete.
6.4 Introduction to Computer Graphics
Computer graphics is the field of generating images with the aid of computers. It deals with the creation, manipulation, and display of visual information.
Overview of Computer Graphics
- Raster Graphics: Images are represented as a grid of pixels (picture elements). Each pixel has a color and intensity value. Examples: JPEG, PNG images.
- Vector Graphics: Images are represented using mathematical descriptions of geometric primitives like points, lines, curves, and polygons. They are resolution-independent. Examples: SVG, fonts.
Graphics Hardware
Display Technology
- CRT (Cathode Ray Tube): Older technology, uses an electron gun to fire electrons at a phosphor-coated screen.
- LCD (Liquid Crystal Display): Uses liquid crystals to block or pass light from a backlight.
- LED (Light Emitting Diode): Uses LEDs for backlighting (LCDs) or as individual pixels (OLED). Offers better contrast and energy efficiency.
Architecture of Raster-Scan Displays
In a raster-scan display, the electron beam sweeps across the screen from top to bottom, left to right, illuminating pixels to form an image. A frame buffer stores the pixel data for each scan line.
Vector Displays
Also known as calligraphic or random-scan displays. The electron beam directly draws lines and curves by moving from point to point, rather than scanning the entire screen. Produces sharp, line-based images but cannot fill areas easily.
Display Processors (GPU concept)
A Display Processor (Graphics Processing Unit - GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are highly parallel and efficient at tasks like 3D rendering and image processing.
Output devices
- Monitor: Primary visual output device.
- Printer: Produces hard copies of images.
- Plotter: Specialised printer for large-format, high-precision vector graphics.
Input devices
- Mouse: Pointing device for cursor control.
- Keyboard: For text and command input.
- Light Pen: Pointing device that detects light from the screen.
- Joystick: For controlling movement in games or simulations.
- Scanner: Digitizes physical images or documents.
Graphics Software and Software standards
- OpenGL (Open Graphics Library): A cross-platform, cross-language API for rendering 2D and 3D vector graphics. Widely used in CAD, virtual reality, scientific visualization, and video games.
- DirectX: A collection of APIs for handling tasks related to multimedia, especially game programming and video, on Microsoft platforms. Direct3D is its 3D graphics component.
6.5 Two-Dimensional Transformation
2D transformations are fundamental operations in computer graphics used to manipulate the position, orientation, and size of objects within a 2D plane. These transformations are typically represented using matrices for efficient computation.
2D Translation
Moves an object to a new position.
x' = x + tx
y' = y + ty
Matrix Form (Homogeneous Coordinates):
[x'] [1 0 tx] [x] [y'] = [0 1 ty] [y] [1 ] [0 0 1 ] [1]where
(tx, ty) are the translation factors.
2D Rotation
Rotates an object about the origin by an angle θ.
Rotation Matrix:
[x'] [cosθ -sinθ 0] [x] [y'] = [sinθ cosθ 0] [y] [1 ] [0 0 1] [1]
2D Scaling
Changes the size of an object.
x' = x * Sx
y' = y * Sy
Scaling Matrix:
[x'] [Sx 0 0] [x] [y'] = [0 Sy 0] [y] [1 ] [0 0 1] [1]where
(Sx, Sy) are the scaling factors.
2D Reflection
Mirrors an object across an axis or the origin.
- X-axis Reflection Matrix:
[1 0 0] [0 -1 0] [0 0 1] - Y-axis Reflection Matrix:
[-1 0 0] [0 1 0] [0 0 1] - Origin Reflection Matrix:
[-1 0 0] [0 -1 0] [0 0 1]
2D Shear transformation
Distorts an object by shifting points in one direction proportional to their distance from an axis.
- Shear in X (relative to y-axis):
[1 Shx 0] [0 1 0] [0 0 1]x' = x + Shx * y,y' = y - Shear in Y (relative to x-axis):
[1 0 0] [Shy 1 0] [0 0 1]x' = x,y' = y + Shy * x
2D Composite Transformation
A sequence of multiple transformations can be combined into a single composite transformation matrix by multiplying the individual transformation matrices. This allows for efficient application of complex transformations.
Example: Translate, then Rotate, then Scale: M = T * R * S.
2D Viewing Pipeline
The 2D viewing pipeline is the sequence of steps involved in transforming 2D world coordinates to 2D screen coordinates for display.
World Coordinates → Viewing Coordinates → Screen Coordinates
- World Coordinates: The original coordinate system where objects are defined.
- Viewing Transformation: Transforms world coordinates into a view coordinate system (defined by a window). This involves translation and rotation to align the viewing window.
- Screen Transformation (Viewport Transformation): Maps the viewing window to a viewport on the physical display device (screen).
World to screen viewing transformation
This involves mapping a specified rectangular region in world coordinates (the "window") to a rectangular region on the screen (the "viewport"). This transformation includes scaling and translation components.
Clipping: Cohen-Sutherland line clipping
An algorithm for clipping lines against a rectangular window. It quickly determines whether a line segment is entirely inside, entirely outside, or partially inside the clipping window.
Region Codes: Each endpoint of a line is assigned a 4-bit code indicating its position relative to the window (Top, Bottom, Right, Left).
- Bit 1: Above window (Y > Ymax)
- Bit 2: Below window (Y < Ymin)
- Bit 3: Right of window (X > Xmax)
- Bit 4: Left of window (X < Xmin)
Algorithm Steps:
- Compute region codes for both endpoints P1 and P2.
- If both codes are 0000, the line is entirely inside, accept.
- If the bitwise AND of both codes is not 0000, the line is entirely outside, reject.
- Otherwise, the line is partially inside. Choose an endpoint outside the window, intersect the line with a window boundary, and replace the outside point with the intersection point. Repeat until the line is either accepted or rejected.
Clipping: Liang-Barsky Line Clipping
A more efficient line clipping algorithm that uses parametric equations of a line. It calculates the parameter t values (from 0 to 1) where the line intersects the clipping window boundaries.
Parametric Approach: A line segment from (x1, y1) to (x2, y2) is given by x(t) = x1 + t(x2 - x1) and y(t) = y1 + t(y2 - y1) for 0 ≤ t ≤ 1.
The algorithm calculates t values for intersections with each of the four clipping boundaries and maintains an interval [t_enter, t_exit]. The final segment is from t_enter to t_exit if t_enter < t_exit.
6.6 Three-Dimensional Transformation
3D transformations extend 2D concepts to three dimensions, allowing manipulation of objects in 3D space. They are also represented by matrices, typically 4x4 for homogeneous coordinates.
3D Translation, Rotation, Scaling, Reflection, Shear transformation
These transformations are analogous to their 2D counterparts but operate on three coordinates (x, y, z).
- 3D Translation Matrix:
[1 0 0 tx] [0 1 0 ty] [0 0 1 tz] [0 0 0 1 ] - 3D Rotation Matrices (around X, Y, or Z axis): More complex, involving sines and cosines for the respective axis. For example, rotation around Z-axis:
[cosθ -sinθ 0 0] [sinθ cosθ 0 0] [0 0 1 0] [0 0 0 1] - 3D Scaling Matrix:
[Sx 0 0 0] [0 Sy 0 0] [0 0 Sz 0] [0 0 0 1] - 3D Reflection Matrices: Reflection across xy, yz, or xz planes. E.g., reflection across xy-plane (z-coordinate negated):
[1 0 0 0] [0 1 0 0] [0 0 -1 0] [0 0 0 1] - 3D Shear Transformation Matrices: Shear along x, y, or z axes relative to other planes. E.g., shear in X relative to YZ plane:
[1 Shxy Shxz 0] [0 1 0 0] [0 0 1 0] [0 0 0 1]
3D composite transformation
Similar to 2D, multiple 3D transformations can be concatenated by multiplying their respective 4x4 matrices to form a single composite transformation matrix.
3D viewing pipeline
The 3D viewing pipeline involves several stages to transform 3D world coordinates into 2D screen coordinates for display:
- Modeling Transformation: Places objects in the world.
- Viewing Transformation: Positions and orients the virtual camera in the world.
- Projection Transformation: Projects 3D objects onto a 2D view plane.
- Clipping: Removes parts of objects outside the view frustum.
- Viewport Transformation: Maps the projected 2D image to the screen viewport.
Projection concepts
Projection transforms 3D objects into a 2D image. There are two main types:
- Orthographic Projection (Parallel Projection): Projects objects onto a view plane with parallel projection lines. Preserves parallel lines and relative proportions, useful for engineering drawings. There is no perspective distortion.
- Perspective Projection: Projects objects onto a view plane using projection lines that converge to a single point (center of projection). Creates a sense of depth and realism, where objects further away appear smaller.
Projection matrices for each type
Each projection type has a corresponding transformation matrix that transforms 3D homogeneous coordinates into projected 2D homogeneous coordinates (which are then normalized for display).
- Orthographic Projection Matrix: Defines a rectangular viewing volume (orthographic frustum). The matrix maps this volume to a canonical view volume (e.g., [-1,1] in x,y,z).
- Perspective Projection Matrix: Defines a frustum (truncated pyramid) viewing volume. The matrix transforms this frustum into a canonical view volume, introducing the perspective effect by making the W-component of homogeneous coordinates dependent on Z.