Deep generative models can emulate the perceptual properties of complex image datasets, providing a latent representation of the data. However, manipulating such a representation to perform meaningful and controllable transformations in the data space remains challenging without some form of supervision. While previous work has focused on exploiting statistical independence to disentangle latent factors, we argue that such a requirement is unnecessary and instead propose a non-statistical framework that relies on identifying a modular organization of the network, in line with the organization of the human visual system. Our experiments show that modularity between groups of channels is achieved to a certain degree in a state-of-the-art generative model (BigGAN). This allowed us to produce targeted interventions on complex image datasets, opening the way to applications such as computationally efficient style transfer and the automated validation of the robustness of pattern recognition systems to contextual changes.
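As a purely illustrative sketch (not the authors' implementation), a targeted intervention on a group of channels at an intermediate layer of a pretrained generator can be expressed with a PyTorch forward hook; the generator `G`, the layer reference, and the channel indices below are hypothetical placeholders.

```python
import torch

def intervene_on_channels(layer, channel_idx, value=0.0):
    """Clamp the selected channels of `layer`'s output to a fixed value
    during generation, leaving all other channels untouched."""
    def hook(module, inputs, output):
        output = output.clone()
        output[:, channel_idx] = value  # overwrite the chosen group of channels
        return output
    return layer.register_forward_hook(hook)

# Hypothetical usage with a pretrained generator G and latent code z:
# handle = intervene_on_channels(G.blocks[3], channel_idx=[12, 47, 101])
# x_counterfactual = G(z)   # image generated under the intervention
# handle.remove()           # restore the unmodified generator
```

If the intervened group of channels is modular in the sense described above, only a localized aspect of the generated image (e.g. the background or an object part) should change, while the rest remains intact.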