How to Find Communities in a Network

In 2002, Michelle Girvan and Mark Newman proposed an algorithm that discovers hidden groups by systematically removing the bridges between them.

scroll to begin

The Network

Nodes

Edges

Communities

A network

Fifteen people, connected by friendships. The raw graph gives us nodes and edges — but no labels, no group assignments. Just structure.

Hidden clusters

Within this network, some groups are tightly knit — densely connected internally, with only a few links to outsiders. These are communities.

But they aren't labeled. Can an algorithm discover them?

Think about traffic

Imagine sending a message between every pair of nodes along the shortest route. Some edges carry far more traffic than others.

The edges between communities sit on many shortest paths. The edges within communities sit on fewer.

Edge betweenness

This count — the number of shortest paths crossing an edge — is its betweenness centrality. Thicker lines carry more traffic.

The three bridge edges are clearly the busiest.

Cut the busiest edge

Remove the edge with the highest betweenness. It was the primary bridge between two groups.

The network is still connected — traffic reroutes through the remaining bridges.

Cut again

Recompute betweenness on the updated network. The remaining bridges now carry even more traffic. Remove the next one.

A group breaks away.

One more

Remove the last bridge. Three clusters drift apart — each one a dense, self-contained community.

Communities found

The algorithm has discovered three groups using only the connection pattern — no metadata, no labels.

This is the Girvan-Newman algorithm: iteratively remove high-betweenness edges and watch communities emerge.

How do we know when to stop?

Removing edges reveals communities, but how many communities is the right number? Remove too few edges and the groups are still merged. Remove too many and you shatter the network into isolated nodes.

The answer is modularity — a score that measures how well a partition captures community structure. It peaks at the best split.

Modularity across edge removals

The algorithm removes one edge at a time. Modularity rises as communities separate, peaks at the natural partition, then falls as structure is lost.

15-node graph, 27 edges. Peak modularity marks the 3-community partition.

A famous test case

In the 1970s, sociologist Wayne Zachary observed a karate club that split into two factions after a dispute between the instructor and the club president. He recorded every friendship between the 34 members.

Girvan-Newman, given only the friendship network, correctly recovers the two factions. It's become the standard benchmark for community detection algorithms.

Zachary's Karate Club

34 members, 78 friendships. Colors show the communities discovered by Girvan-Newman at peak modularity.

Node 0 = instructor (Mr. Hi). Node 33 = club president. The algorithm finds the real split.

Where this appears

Community detection is used to find functional modules in protein interaction networks, identify research clusters in citation graphs, detect echo chambers in social media, and partition supply chains into regional hubs.

The Girvan-Newman algorithm is too slow for very large networks — betweenness recomputation is expensive — but its principle remains foundational: the boundaries between groups are where the information flows.

Try it yourself

Choose a network and step through the algorithm. Watch betweenness redistribute as edges are removed and communities form.

Network — 15 nodes, 27 edges, 1 community

Modularity

Communities over time