This is done by a two-part clustering process. The patents in the pool are first coarsely clustered by their classification codes. Patents that span several technology areas land in between the clusters, much like the overlap in a Venn diagram (see the orange and light blue dots below).
The further apart two clusters' classes or subclasses are in the classification scheme, the further apart the clusters are placed.
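The distance idea above can be sketched in a few lines of Python. This is only an illustration, not the actual method: it assumes CPC-style codes such as `H04L` (section `H`, class `H04`, subclass `H04L`), and the function names, the 0 to 3 distance scale, and the averaging over a patent's codes are all hypothetical choices made for the example.

```python
def levels(code):
    """Return the section, class, and subclass prefixes of a CPC-style code."""
    code = code.strip().upper()
    return [code[:1], code[:3], code[:4]]


def code_distance(a, b):
    """0 = same subclass, 1 = same class, 2 = same section, 3 = unrelated."""
    shared = 0
    for x, y in zip(levels(a), levels(b)):
        if x != y:
            break
        shared += 1
    return 3 - shared


def patent_distance(codes_a, codes_b):
    """Mean pairwise code distance between two patents (assumed aggregation)."""
    pairs = [(a, b) for a in codes_a for b in codes_b]
    return sum(code_distance(a, b) for a, b in pairs) / len(pairs)


telecom = ["H04L"]
pharma = ["A61K"]
crossover = ["H04L", "A61K"]  # hypothetical patent spanning both areas

print(code_distance("H04L", "H04W"))  # same class, different subclass
print(patent_distance(telecom, pharma))
print(patent_distance(crossover, telecom), patent_distance(crossover, pharma))
```

Under these assumptions the two single-area patents are 3.0 apart, while the crossover patent is 1.5 from each, which is why it would sit between the two clusters.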
That is the top-level picture of how the hills are created and separated from one another. The finer categorization, shown as labels within the hills and lagoons, comes from semantic analysis of the keywords found in sections of each patent (title, abstract, etc.).
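To give a feel for how keyword labels might be derived, here is a rough sketch. It deliberately swaps real semantic analysis for a much simpler stand-in: words are scored by how often they occur in a cluster, weighted down if they also occur in other clusters (a TF-IDF-style score). The cluster names, sample titles, stopword list, and `label_clusters` function are all made up for the example.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def label_clusters(clusters, top_n=2):
    """Map each cluster name to its top_n highest-scoring keywords."""
    counts = {
        name: Counter(w for text in texts for w in tokenize(text))
        for name, texts in clusters.items()
    }
    n = len(counts)
    # Number of clusters each word appears in.
    cluster_freq = Counter(w for c in counts.values() for w in c)
    labels = {}
    for name, c in counts.items():
        # Words found in every cluster score 0 and so rank last.
        scores = {w: tf * math.log(n / cluster_freq[w]) for w, tf in c.items()}
        # Sort by score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        labels[name] = ranked[:top_n]
    return labels


# Hypothetical titles standing in for patent Title/Abstract text.
clusters = {
    "hill_a": ["Battery electrode coating method", "Lithium battery electrode"],
    "hill_b": ["Wireless network antenna", "Antenna array for wireless network method"],
}
labels = label_clusters(clusters)
print(labels)
```

With this toy input, "method" appears in both clusters and therefore never becomes a label, while words specific to one cluster rise to the top.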