COMP1927 - Graphs

graphs.jpg

Introduction

A graph is defined as a set of vertices connected by edges. (Essentially objects linked together in various ways).

Applications

Graphs can be used to visualise and implement a whole range of problems and systems. Essentially anything that can be seen as a network with nodes and connections can become a graph.

Communication

Vertices: Computers, telephones
Edges: Telephone lines, network cables

Mechanical

Vertices: Joint
Edges: Springs, rods, beams

Transportation

Vertices: Street intersections, airports, stations
Edges: Roads, airway routes, train tracks

Software Systems

Vertices: Functions
Edges: Function calls

The Internet

Vertices: Web pages
Edges: Links

Social Relationships

Vertices: People
Edges: Friendships, relationships, familial contacts

Chemical Compounds

Vertices: Molecules
Edges: Bonds

Neural Networks

Vertices: Neurons
Edges: Synapses

Terminology

Assumption: Graphs have no parallel edges and no self-loops. (i.e. there can only be one edge between a and b, where a != b)

Completeness

A Graph is complete if there is an edge from each vertex to every other vertex. Hence the number of edges is:

(1)
\begin{equation} |E| = V(V-1)/2 \end{equation}

Which makes sense; the first node you start with has n-1 edges. The next node has n-2 edges uncounted. The third node has n-3 edges. And so on. Hence it is the sum of 1 to n-1.

A complete subgraph is known as a clique

Density

The ratio of edges to vertices can very from 0 edges to a complete graph. This can effect the choice of data structures used to implement the graph, and the algorithms used to access it.

Hence we classify graphs as somewhere along the scale of sparse to dense.

Sparse graphs are more on the order of V edges, whereas Dense graphs are more on the order of V(V-1)/2.

Trees

A tree is defined as a connected graph with no cycles. Whereby connected means that every node can be reached from every other node (not necessarily with an edge between them, but with a path of multiple edges).

A spanning tree is a tree that contains all vertices in the graph. Hence there is at least one path that connects every single vertex within the graph.

Subgraphs

A subgraph is essentially a subset of the edges set, and every vertex involved with said edges.
E.g. edges 1-2, 2-4 = vertices 1, 2, 4

An induced subgraph takes a subset of the vertices (rather than the edges) and includes all edges that connect two of the vertices within the induced subgraph.

Miscellaneous Definitions

Connected Graph
There is a path that contains every node.
Cycle
A cycle is essentially a loop; a path that starts and ends at the same node.
Adjacency
Two vertices are adjacent if they are connected with an edge.
(Vertex) Degree
The number of edges connected to a vertex.
Path
A path is a set of edges that can be joined together to traverse a range of vertices. E.g. 1-2, 2-4, 4-3, 3-6 is a path from 1 through to 6.
Multi-Graph
A multi-graph has multiple edges between the same vertices (potentially with different weights - see below).

Hamiltonian (Vertices)

A Hamiltonian path is any path that visits every vertex exactly once.
Any Hamiltonian path that starts and ends at the same vertex is classed as a Hamiltonian cycle.

Example

Hamiltonian paths are a special case of the Travelling Salesman Problem, where vertices connected by an edge have distance 1, and those not have an infinite distance between them.

Eulerian (Edges)

An Eulerian path is any path that visits every edge exactly once, without starting and ending at the same vertex
Any Eulerian path that does start and end at the same vertex is classed as an Eulerian cycle.

Example

GraphRep?action=AttachFile&do=get&target=konigsbergtown.png

The most famous example of an Eulerian path is the Konigsberg Bridge problem; it is impossible to cross every bridge exactly once.

Theorems for Determining Eulerian Paths

It is, naturally, only possible to have a Hamiltonian or Eulerian path in a connected graph.

More specifically, an Eulerian cycle will exist if and only if all vertices have even degree, and an Eulerian path will exist if and only if exactly 2 vertices have an odd degree, with the rest even.

Relationships Between Eulerian and Hamiltonian Paths/Cycles

Knowing if a graph has a Hamiltonian/Eulerian path/cycle gives us no information about the other type of path.

Directed/Undirected Graphs

An undirected graph (the usual assumption) is one where an edge from /a to b// is also an edge from b to a. The relation of edges is symmetric.

Conversely, a graph is directed if each edge has a direction attached. Hence a to b is distinct from b to a.

Weighted Graphs

In a weighted graph each edge has a value, or weight, assigned to it. This can be used to specify things such as distance between nodes.

Fun Facts

It is impossible to construct a graph with an overall odd vertex degree, as every edge must start and end somewhere.
Hence each edge increases the degree by 2, and hence keeps it even.
As such, there must always be an even number of vertices of odd degree.

Implementation

Implementing graphs isn't done in quite the same way as, say, a tree. Instead of storing the actual graph in memory with nodes malloc'd and pointers between them as edges, we store information about the graph.

It's a little non-intuitive, but essentially we store data about the edges and vertices in two main forms:

Adjacency Matrix

An adjacency matrix is an n-by-n array where each cell ([x][y]) stores true or false for whether there is an edge from x to y.

In an undirected graph this matrix is symmetrical along the [0][0]-[n][n] diagonal.

Code can be found on the Adjacency Matrix page.

Adjacency List

An adjacency list is an array of n cells, each of which contains a linked list. The list contains a node for each vertex where there is an edge from [i] to v.

Code can be found on the Adjacency List page.

Comparison Between Matrices and Lists

Adjacency Matrix Adjacency List
Space V2 V + E
Initialise V2 V
Insert Edge 1 1
Find/Remove Edge 1 V

As you can see the adjacency list saves significantly on space and initialisation costs, but takes linear rather than costant time to query the edges.