In this post I just want to share a couple of thoughts on Maven Repository Managers – what they are, and why you would want to use one. Let’s start out with a couple of definitions:
A Maven Repository is a collection of artifacts and metadata stored in a defined hierarchy. It could be as simple as a collection of files in a file system with a particular directory structure, or the same files served from a simple web server. This might work well for a single developer’s local repository (in the case of the files), or for read-only artifacts (in the case of the web server), but it is less suitable for a group of developers who are both consuming and producing artifacts.
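To make the ‘defined hierarchy’ a little more concrete, here is a small sketch of how a dependency’s coordinates map onto a path in the standard repository layout. The coordinates are made up for illustration and do not refer to a real artifact.

```xml
<!-- Hypothetical coordinates, for illustration only -->
<dependency>
  <groupId>com.example.tools</groupId>
  <artifactId>widget</artifactId>
  <version>1.2.0</version>
</dependency>
<!-- In the standard layout the dots in the groupId become directories,
     followed by the artifactId and the version:
       com/example/tools/widget/1.2.0/widget-1.2.0.jar
       com/example/tools/widget/1.2.0/widget-1.2.0.pom -->
```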
A Maven Repository Manager is a piece of software that allows you to host and manage a number of Maven Repositories. Why would you want more than one? Usually people like to keep their work in progress in a separate repository from their completed artifacts, perhaps because the two have different backup policies or access controls. You can also use a Maven Repository Manager as a proxy for external Maven Repositories (like Maven Central, for example). This allows you to cache artifacts and share them among your developers, which reduces build times and download costs. It is also handy if you need to be able to build without a connection to the Internet.
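As a rough sketch of the proxy setup, a developer’s `settings.xml` might point requests for Maven Central at the internal repository manager using a mirror entry. The `id`, `name` and URL below are assumptions made up for this example; the real values depend on which product you use and how it is configured.

```xml
<!-- ~/.m2/settings.xml (sketch; id, name and URL are hypothetical) -->
<settings>
  <mirrors>
    <mirror>
      <id>internal-proxy</id>
      <name>Internal proxy of Maven Central</name>
      <url>https://repo.example.com/repository/central-proxy/</url>
      <!-- Requests that would go to the repository with id "central"
           are sent to the URL above instead -->
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

Setting `mirrorOf` to `*` instead would route requests for every repository through the mirror, which is one way to make sure nothing is downloaded directly from the Internet.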
The diagram above shows the relationship between the various repositories. Maven Central is the main repository. It contains mostly free and open source artifacts, and it is effectively read-only for the typical Maven user. There are also other Maven repositories on the Internet that hold both open source and commercial artifacts.
The box labeled ‘Internal Repository Manager’ represents a Maven Repository Manager deployed inside an organization. Each developer will also have their own local repository.
There are a number of open source and commercial products available that can act as the Maven Repository Manager. Here are a few of the common ones:
Typically you would set up a number of repositories. For example, you might have repositories for:
- Mirrors/proxies/caches of external repositories like Maven Central, java.net, etc.,
- Artifacts that are being developed by your organization (this is often called a ‘snapshot’ repository), and
- Finished artifacts that were developed by your organization (this is often called an ‘internal’ repository).
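The split between ‘snapshot’ and ‘internal’ repositories can be sketched in a project’s `pom.xml` with a `distributionManagement` section. The ids and URLs here are assumptions for illustration only; yours will depend on how your repository manager is set up.

```xml
<!-- pom.xml fragment (sketch; ids and URLs are hypothetical) -->
<distributionManagement>
  <!-- Finished (release) artifacts are deployed here -->
  <repository>
    <id>internal</id>
    <url>https://repo.example.com/repository/internal/</url>
  </repository>
  <!-- Versions ending in -SNAPSHOT are deployed here -->
  <snapshotRepository>
    <id>snapshots</id>
    <url>https://repo.example.com/repository/snapshots/</url>
  </snapshotRepository>
</distributionManagement>
```

With something like this in place, `mvn deploy` picks the target repository based on whether the project version ends in `-SNAPSHOT`. If the repositories require credentials, each `id` needs a matching `server` entry in `settings.xml`.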
You may also want to create a repository that holds artifacts that developers can use as dependencies when working on projects. If you want to keep tight control over the artifacts that can be used, for example to ensure your project does not use third-party components with incompatible license terms, or to ensure everyone is using the same version of an important dependency, then forcing dependencies to come from your internal repository is a good way to exercise that control.
When you are using continuous integration, you may end up with a lot of builds going on, and the built artifacts will typically be stored in your snapshot repository. This can lead to a large amount of data in a fairly short time, even in a relatively modest development organization. So snapshot maintenance is important. Most of the Maven Repository Managers will allow you to configure how many snapshot versions of an artifact to keep, and will automatically purge old snapshots to conserve space.