12.5.4. dependencySets Section

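One of the most common requirements for assemblies is the inclusion of a project's dependencies. The files and fileSets sections deal only with files in your project, whereas dependency files do not exist in your project. The artifacts your project depends on have to be resolved by Maven during the build. Dependency artifacts are abstract: they lack a definite location and are resolved using a set of Maven coordinates. While files and fileSets require a concrete source path, dependencies are included or excluded using a combination of Maven coordinates and dependency scopes.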

The simplest dependencySet is an empty element:

<assembly>
  ...
  <dependencySets>
    <dependencySet/>
  </dependencySets>
  ...
</assembly>

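The dependencySet above matches all runtime dependencies of your project (the runtime scope implicitly includes compile-scoped dependencies) and adds them to the root directory of the assembly archive. If the current project has a main artifact, that artifact is also copied into the root of the assembly archive.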

Note

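Wait? I thought dependencySet was about including my project's dependencies, not my project's main artifact. This counterintuitive side effect was a bug in the 2.1 version of the Assembly plugin, but it was a widely used bug. Because Maven puts an emphasis on backward compatibility, this counterintuitive and incorrect behavior had to be preserved in both the 2.1 and 2.2 releases. You can control this behavior, however, by setting useProjectArtifact to false.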

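While the default dependency set can be quite useful without any configuration at all, this element also supports a wide array of configuration options that let you tailor its behavior to your environment. For example, the first thing you might want to do to a dependency set is exclude the current project's artifact by setting useProjectArtifact to false (again, for legacy reasons, the default value of this option is true). This lets you manage the project's build output separately from its dependencies. Alternatively, you might unpack the dependency artifacts by setting unpack to true (it is false by default). When unpack is set to true, the Assembly plugin combines the unpacked contents of all matching dependencies inside the archive's root directory.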
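As a minimal sketch of the two flags just described, the following descriptor fragment leaves the project's own artifact out of the dependency set and unpacks every matching dependency into the archive root. Combining the two is purely illustrative; whether you want both at once depends on your assembly.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <!-- keep the project's main artifact out of this set (default: true) -->
      <useProjectArtifact>false</useProjectArtifact>
      <!-- extract each matching dependency instead of copying it whole (default: false) -->
      <unpack>true</unpack>
    </dependencySet>
  </dependencySets>
  ...
</assembly>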

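As you can see, you have many options for controlling a dependency set. The next sections discuss how to define the output location for a dependency set and how to include and exclude dependencies by scope. Finally, we'll expand on the unpacking functionality of the dependency set by exploring some advanced options for unpacking dependencies.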

12.5.4.1. Customizing Dependency Output Location

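Two configuration options are used in concert to define the location of a dependency file within the assembly archive: outputDirectory and outputFileNameMapping. You may want to use properties of the dependency artifacts themselves to define their location in the assembly. Let's say you want to put each dependency in a directory that matches its groupId. In this case, you would use the outputDirectory element of the dependencySet and supply a configuration like the following: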

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <outputDirectory>${artifact.groupId}</outputDirectory>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

This has the effect of placing every single dependency in a subdirectory that matches its groupId.

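If you want to customize further and remove the version number from every dependency, you can use outputFileNameMapping to customize the name of each output file, as follows: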

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <outputDirectory>${artifact.groupId}</outputDirectory>
      <outputFileNameMapping>
        ${module.artifactId}.${module.extension}
      </outputFileNameMapping>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

In this example, the dependency commons:commons-codec version 1.3 would end up as the file commons/commons-codec.jar.

12.5.4.2. Interpolation of Properties in Dependency Output Location

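As described in the section on assembly interpolation, outputDirectory and outputFileNameMapping are not interpolated along with the rest of the assembly descriptor, because their raw values have to be interpreted using additional, artifact-specific expression resolvers.

The artifact expressions available for these two elements differ only slightly. In both cases, all of the ${project.*}, ${pom.*}, and ${*} expressions that are available in the POM and in the rest of the assembly descriptor are also available here. For the outputFileNameMapping element, expressions are resolved using the following process: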

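  1. If the expression matches the pattern ${artifact.*}:

    a. Match against the dependency's Artifact instance (resolves: groupId, artifactId, version, baseVersion, scope, classifier, and file.*)

    b. Match against the dependency's ArtifactHandler instance (resolves: extension)

    c. Match against the project instance associated with the dependency's Artifact (resolves: mainly POM properties)

  2. If the expression matches the pattern ${pom.*} or ${project.*}:

    a. Match against the project instance (MavenProject) of the current build.

  3. If the expression matches the pattern ${dashClassifier?} and the Artifact instance has a non-null classifier, resolve to the classifier preceded by a dash (-classifier). Otherwise, resolve to an empty string.

  4. Attempt to resolve the expression against the project instance of the current build.

  5. Attempt to resolve the expression against the POM properties of the current build.

  6. Attempt to resolve the expression against the available system properties.

  7. Attempt to resolve the expression against the available operating-system environment variables.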

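The outputDirectory value is interpolated in much the same way. The difference is that no ${artifact.*} information is available for outputDirectory, only the ${project.*} instance for the particular artifact. Therefore, the steps listed above that rely on artifact information (1a, 1b, and 3 in the process listing above) do not apply.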

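How do you know when to use outputDirectory and when to use outputFileNameMapping? When dependencies are unpacked, only outputDirectory is used to calculate the output location. When dependencies are managed as whole files (not unpacked), outputDirectory and outputFileNameMapping can be used together. When they are used together, the result is equivalent to: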

<archive-root-dir>/<outputDirectory>/<outputFileNameMapping>

When outputDirectory is missing, it is simply not used. When outputFileNameMapping is missing, its default value is:

${artifact.artifactId}-${artifact.version}${dashClassifier?}.${artifact.extension}
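As an illustration of how the two elements combine, the following sketch drops the version from each file name but keeps the classifier. It uses only the expressions described in this section; the lib output directory is an assumed name chosen for the example.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <!-- assumed directory name, relative to the archive root -->
      <outputDirectory>lib</outputDirectory>
      <!-- artifactId, optional "-classifier", then the file extension -->
      <outputFileNameMapping>
        ${artifact.artifactId}${dashClassifier?}.${artifact.extension}
      </outputFileNameMapping>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

Under this mapping, a dependency that is not unpacked and has no classifier should resolve ${dashClassifier?} to an empty string, so its file name consists of just the artifactId and the extension.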

12.5.4.3. Including and Excluding Dependencies by Scope

In Chapter 9, The Project Object Model, it was noted that all project dependencies have one scope or another. Scope determines when in the build process a dependency would normally be used. For instance, test-scoped dependencies are not included in the classpath during compilation of the main project sources, but they are included in the classpath when compiling unit test sources. This is because your project’s main source code should not contain any code specific to testing, since testing is not a function of the project (it’s a function of the project’s build process). Similarly, provided-scoped dependencies are assumed to be present in the environment of any eventual deployment. However, if a project depends on a particular provided dependency, it is likely to require that dependency in order to compile. Therefore, provided-scoped dependencies are present in the compilation classpath, but not in the dependency set that should be bundled with the project’s artifact or assembly.

Also from Chapter 9, The Project Object Model, recall that some dependency scopes imply others. For instance, the runtime dependency scope implies the compile scope, since all compile-time dependencies (except for those in the provided scope) will be required for the code to execute. There are a number of complex relationships between the various dependency scopes which control how the scope of a direct dependency affects the scope of a transitive dependency. In a Maven Assembly descriptor, we can use scopes to apply different settings to different sets of dependencies accordingly.

For instance, if we plan to bundle a web application with Jetty to create a completely self-contained application, we’ll need to include all provided-scope dependencies somewhere in the jetty directory structure we’re including. This ensures those provided dependencies actually are present in the runtime environment. Non-provided, runtime dependencies will still land in the WEB-INF/lib directory, so these two dependency sets must be processed separately. These dependency sets might look similar to the following XML.

Example 12.9. Defining Dependency Sets Using Scope

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>provided</scope>
      <outputDirectory>lib/${project.artifactId}</outputDirectory>
    </dependencySet>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/WEB-INF/lib
      </outputDirectory>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

Provided-scoped dependencies are added to the lib/ directory in the assembly root, which is assumed to be a libraries directory that will be included in the Jetty global runtime classpath. We’re using a subdirectory named for the project’s artifactId in order to make it easier to track the origin of a particular library. Runtime dependencies are included in the WEB-INF/lib path of the web application, which is located within a subdirectory of the standard Jetty webapps/ directory that is named using a custom POM property called webContextName. What we've done in the previous example is separate application-specific dependencies from dependencies which will be present in a Servlet container's global classpath.

However, simply separating according to scope may not be enough, particularly in the case of a web application. It’s conceivable that one or more runtime dependencies will actually be bundles of standardized, non-compiled resources for use in the web application. For example, consider a set of web applications that reuse a common set of JavaScript, CSS, SWF, and image resources. To make these resources easy to standardize, it’s a common practice to bundle them up in an archive and deploy them to the Maven repository. At that point, they can be referenced as standard Maven dependencies - possibly with a dependency type of zip - that are normally specified with a runtime scope. Remember, these are resources, not binary dependencies of the application code itself; therefore, it’s not appropriate to blindly include them in the WEB-INF/lib directory. Instead, these resource archives should be separated from binary runtime dependencies, and unpacked into the web application document root somewhere. In order to achieve this kind of separation, we’ll need to use inclusion and exclusion patterns that apply to the coordinates of a specific dependency.

In other words, say you have three or four web applications that reuse the same resources, and you want to create an assembly that puts provided dependencies into lib/, puts runtime dependencies into webapps/<contextName>/WEB-INF/lib, and then unpacks a specific runtime dependency into your web application's document root. You can do this because the Assembly plugin allows you to define multiple include and exclude patterns for a given dependencySet element. The next section develops this idea further.

12.5.4.4. Fine Tuning: Dependency Includes and Excludes

A resource dependency might be as simple as a set of resources (CSS, JavaScript, and images) in a project that has an assembly which creates a ZIP archive. Depending on the particulars of our web application, we might be able to distinguish resource dependencies from binary dependencies solely according to type. Most web applications are going to depend on other dependencies of type jar, and it is possible that we can state with certainty that all dependencies of type zip are resource dependencies. Or, we might have a situation where resources are stored in jar format, but have a classifier of something like resources. In either case, we can specify an inclusion pattern to target these resource dependencies and apply different logic than that used for binary dependencies. We’ll specify these tuning patterns using the includes and excludes sections of the dependencySet.

Both includes and excludes are list sections, meaning they accept the sub-elements include and exclude respectively. Each include or exclude element contains a string value, which can contain wildcards. Each string value can match dependencies in a few different ways. Generally speaking, three identity pattern formats are supported:

groupId:artifactId - version-less key

You would use this pattern to match a dependency by only the groupId and the artifactId

groupId:artifactId:type[:classifier] - conflict id

The pattern allows you to specify a wider set of coordinates to create a more specific include/exclude pattern.

groupId:artifactId:type[:classifier]:version - full artifact identity

If you need to get really specific, you can specify all the coordinates.

All of these pattern formats support the wildcard character ‘*’, which can match any subsection of the identity and is not limited to matching single identity parts (sections between ‘:’ characters). Also, note that the classifier section above is optional, in that patterns matching dependencies that don’t have classifiers do not need to account for the classifier section in the pattern.
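To make the three formats concrete, here is a sketch of an includes section that uses one pattern of each kind. The coordinates are illustrative only: commons:commons-codec is borrowed from the earlier example, while org.example and widgets are hypothetical names.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <includes>
        <!-- version-less key: groupId:artifactId -->
        <include>commons:commons-codec</include>
        <!-- conflict id with wildcards: any jar whose groupId starts with org.example -->
        <include>org.example*:*:jar</include>
        <!-- full artifact identity: groupId:artifactId:type:version (hypothetical artifact) -->
        <include>org.example:widgets:jar:1.0</include>
      </includes>
    </dependencySet>
  </dependencySets>
  ...
</assembly>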

In the example given above, where the key distinction is the artifact type zip, and none of the dependencies have classifiers, the following pattern would match resource dependencies assuming that they were of type zip:

*:zip

The pattern above makes use of the second dependency identity: the dependency’s conflict id. Now that we have a pattern that distinguishes resource dependencies from binary dependencies, we can modify our dependency sets to handle resource archives differently:

Example 12.10. Using Dependency Excludes and Includes in dependencySets

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>provided</scope>
      <outputDirectory>lib/${project.artifactId}</outputDirectory>
    </dependencySet>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/WEB-INF/lib
      </outputDirectory>
      <excludes>
        <exclude>*:zip</exclude>
      </excludes>
    </dependencySet>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/resources
      </outputDirectory>
      <includes>
        <include>*:zip</include>
      </includes>
      <unpack>true</unpack>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

In Example 12.10, “Using Dependency Excludes and Includes in dependencySets”, the runtime-scoped dependency set from our last example has been updated to exclude resource dependencies. Only binary dependencies (non-zip dependencies) should be added to the WEB-INF/lib directory of the web application. Resource dependencies now have their own dependency set, which is configured to include these dependencies in the resources directory of the web application. The includes section in the last dependencySet reverses the exclusion from the previous dependencySet, so that resource dependencies are included using the same identity pattern (i.e. *:zip). The last dependencySet refers to the shared resource dependency and is configured to unpack it in the document root of the web application.

Example 12.10, “Using Dependency Excludes and Includes in dependencySets”, was based upon the assumption that our shared resources project dependency had a type which differed from all of the other dependencies. What if the shared resource dependency had the same type as all of the other dependencies? How could you differentiate the dependency? In this case, if the shared resource dependency had been bundled as a JAR with the classifier resources, you could change the identity pattern to match those dependencies instead:

*:jar:resources

Instead of matching on artifacts with a type of zip and no classifier, we’re matching on artifacts with a classifier of resources and a type of jar.
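Assuming the shared resources are published as a JAR with the classifier resources, the last two dependency sets of Example 12.10 might be adapted along the following lines. This is a sketch rather than a tested descriptor; only the identity pattern differs from the earlier example.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/WEB-INF/lib
      </outputDirectory>
      <!-- keep resource JARs out of WEB-INF/lib -->
      <excludes>
        <exclude>*:jar:resources</exclude>
      </excludes>
    </dependencySet>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/resources
      </outputDirectory>
      <!-- select only the resource JARs and unpack them -->
      <includes>
        <include>*:jar:resources</include>
      </includes>
      <unpack>true</unpack>
    </dependencySet>
  </dependencySets>
  ...
</assembly>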

Just like the fileSets section, dependencySets support the useStrictFiltering flag. When enabled, any specified patterns that don’t match one or more dependencies will cause the assembly - and consequently, the build - to fail. This can be particularly useful as a safety valve, to make sure your project dependencies and assembly descriptors are synchronized and interacting as you expect them to. By default, this flag is set to false for the purposes of backward compatibility.
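For example, the resource dependency set from Example 12.10 could opt in to strict filtering as sketched below. With the flag enabled, the build should fail if the *:zip pattern matches no dependency at all, which is one way to notice that a resource dependency has been dropped from the POM.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/resources
      </outputDirectory>
      <!-- fail the build when an include or exclude pattern matches nothing (default: false) -->
      <useStrictFiltering>true</useStrictFiltering>
      <includes>
        <include>*:zip</include>
      </includes>
      <unpack>true</unpack>
    </dependencySet>
  </dependencySets>
  ...
</assembly>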

12.5.4.5. Transitive Dependencies, Project Attachments, and Project Artifacts

The dependencySet section supports two more general mechanisms for tuning the subset of matching artifacts: transitive selection options, and options for working with project artifacts. Both of these features are a product of the need to support legacy configurations that applied a somewhat more liberal definition of the word “dependency”. As a prime example, consider the project’s own main artifact. Typically, this would not be considered a dependency; yet older versions of the Assembly plugin included the project artifact in calculations of dependency sets. To provide backward compatibility with this “feature”, the 2.2 releases (currently at 2.2-beta-2) of the Assembly plugin support a flag in the dependencySet called useProjectArtifact, whose default value is true. By default, dependency sets will attempt to include the project artifact itself in calculations about which dependency artifacts match and which don’t. If you’d rather deal with the project artifact separately, set this flag to false.

Tip

The authors of this book recommend that you always set useProjectArtifact to false.

As a natural extension to the inclusion of the project artifact, the project’s attached artifacts can also be managed within a dependencySet using the useProjectAttachments flag (whose default value is false). Enabling this flag allows patterns that specify classifiers and types to match on artifacts that are “attached” to the main project artifact; that is, they share the same basic groupId/artifactId/version identity, but differ in type and classifier from the main artifact. This could be useful for including JavaDoc or source jars in an assembly.
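A dependency set that collects source jars might look something like the sketch below. It assumes that the build attaches source jars under the classifier sources, which is the usual convention but is not something this descriptor can guarantee; the sources output directory is likewise an assumed name. Note that the same pattern would also match any regular dependency that carries that classifier.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <!-- assumed directory name for the collected source jars -->
      <outputDirectory>sources</outputDirectory>
      <!-- leave the main artifact out, but consider its attachments -->
      <useProjectArtifact>false</useProjectArtifact>
      <useProjectAttachments>true</useProjectAttachments>
      <includes>
        <!-- type jar, classifier sources -->
        <include>*:jar:sources</include>
      </includes>
    </dependencySet>
  </dependencySets>
  ...
</assembly>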

Aside from dealing with the project’s own artifacts, it’s also possible to fine-tune the dependency set using two transitive-resolution flags. The first, called useTransitiveDependencies (and set to true by default) simply specifies whether the dependency set should consider transitive dependencies at all when determining the matching artifact set to be included. As an example of how this could be used, consider what happens when your POM has a dependency on another assembly. That assembly (most likely) will have a classifier that separates it from the main project artifact, making it an attachment. However, one quirk of the Maven dependency-resolution process is that the transitive-dependency information for the main artifact is still used when resolving the assembly artifact. If the assembly bundles its project dependencies inside itself, using transitive dependency resolution here would effectively duplicate those dependencies. To avoid this, we simply set useTransitiveDependencies to false for the dependency set that handles that assembly dependency.
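A dependency set for such an assembly dependency could be sketched as follows. The coordinates org.example:other-app:zip:bin and the bundles directory are hypothetical; substitute the groupId, artifactId, type, and classifier of the assembly your POM actually depends on.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <!-- assumed directory name -->
      <outputDirectory>bundles</outputDirectory>
      <!-- the assembly already bundles its own dependencies, so skip transitive ones -->
      <useTransitiveDependencies>false</useTransitiveDependencies>
      <includes>
        <!-- hypothetical assembly artifact: groupId:artifactId:type:classifier -->
        <include>org.example:other-app:zip:bin</include>
      </includes>
    </dependencySet>
  </dependencySets>
  ...
</assembly>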

The other transitive-resolution flag is far more subtle. It’s called useTransitiveFiltering, and has a default value of false. To understand what this flag does, we first need to understand what information is available for any given artifact during the resolution process. When an artifact is a dependency of a dependency (that is, removed at least one level from your own POM), it has what Maven calls a "dependency trail", which is maintained as a list of strings that correspond to the full artifact identities (groupId:artifactId:type:[classifier:]version) of all dependencies between your POM and the artifact that owns that dependency trail. If you remember the three types of artifact identities available for pattern matching in a dependency set, you’ll notice that the entries in the dependency trail - the full artifact identity - correspond to the third type. When useTransitiveFiltering is set to true, the entries in an artifact’s dependency trail can cause the artifact to be included or excluded in the same way its own identity can.
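As a sketch of what this enables, the dependency set below excludes a hypothetical artifact, org.example:legacy-lib, and turns on transitive filtering. Under the behavior described above, artifacts whose dependency trail contains legacy-lib would be expected to be excluded along with it. The coordinates and the lib directory are assumptions made for illustration.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>runtime</scope>
      <!-- assumed directory name -->
      <outputDirectory>lib</outputDirectory>
      <!-- also apply the patterns to each artifact's dependency trail (default: false) -->
      <useTransitiveFiltering>true</useTransitiveFiltering>
      <excludes>
        <!-- hypothetical artifact; the wildcard covers the rest of its identity -->
        <exclude>org.example:legacy-lib:*</exclude>
      </excludes>
    </dependencySet>
  </dependencySets>
  ...
</assembly>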

If you’re considering using transitive filtering, be careful! A given artifact can be included from multiple places in the transitive-dependency graph, but as of Maven 2.0.9, only the first inclusion’s trail will be tracked for this type of matching. This can lead to subtle problems when collecting the dependencies for your project.

Warning

Most assemblies don’t really need this level of control over dependency sets; consider carefully whether yours truly does. Hint: It probably doesn't.

12.5.4.6. Advanced Unpacking Options

As we discussed previously, some project dependencies may need to be unpacked in order to create a working assembly archive. In the examples above, the decision to unpack or not was simple. It didn’t take into account what needed to be unpacked, or more importantly, what should not be unpacked. To gain more control over the dependency unpacking process, we can configure the unpackOptions element of the dependencySet. Using this section, we have the ability to choose which file patterns to include or exclude from the assembly, and whether included files should be filtered to resolve expressions using current POM information. In fact, the options available for unpacking dependency sets are fairly similar to those available for including files from the project directory structure, using the file sets descriptor section.

To continue our web-application example, suppose some of the resource dependencies have been bundled with a file that details their distribution license. In the case of our web application, we’ll handle third-party license notices by way of a NOTICES file included in our own bundle, so we don’t want to include the license file from the resource dependency. To exclude this file, we simply add it to the unpack options inside the dependency set that handles resource artifacts:

Example 12.11. Excluding Files from a Dependency Unpack

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/resources
      </outputDirectory>
      <includes>
        <include>*:zip</include>
      </includes>
      <unpack>true</unpack>
      <unpackOptions>
        <excludes>
          <exclude>**/LICENSE*</exclude>
        </excludes>
      </unpackOptions>
    </dependencySet>
  </dependencySets>
  ...
</assembly>

Notice that the exclude we’re using looks very similar to those used in fileSet declarations. Here, we’re blocking any file starting with the word LICENSE in any directory within our resource artifacts. You can think of the unpack options section as a lightweight fileSet applied to each dependency matched within that dependency set. In other words, it is a fileSet by way of an unpacked dependency. Just as we specified an exclusion pattern for files within resource dependencies in order to block certain files, you can also choose which restricted set of files to include using the includes section. The same code that processes inclusions and exclusions on fileSets has been reused for processing unpackOptions.

In addition to file inclusion and exclusion, the unpack options on a dependency set also provide a filtering flag, whose default value is false. Again, this should be familiar from our discussion of file sets above. In both cases, expressions using either the Maven syntax of ${property} or the Ant syntax of @property@ are supported. Filtering is a particularly nice feature to have for dependency sets, though, since it effectively allows you to create standardized, versioned resource templates that are then customized to each assembly as they are included. Once you start mastering the use of filtered, unpacked dependencies which store shared resources, you will be able to start abstracting repeated resources into common resource projects.
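Building on Example 12.11, a filtered unpack might be configured as in the sketch below. The filtered element switches on expression resolution for the unpacked files; which properties the resource templates actually reference is up to the resource project and is not shown here.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>runtime</scope>
      <outputDirectory>
        webapps/${webContextName}/resources
      </outputDirectory>
      <includes>
        <include>*:zip</include>
      </includes>
      <unpack>true</unpack>
      <unpackOptions>
        <excludes>
          <exclude>**/LICENSE*</exclude>
        </excludes>
        <!-- resolve ${property} and @property@ expressions while unpacking (default: false) -->
        <filtered>true</filtered>
      </unpackOptions>
    </dependencySet>
  </dependencySets>
  ...
</assembly>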

12.5.4.7. Summarizing Dependency Sets

Finally, it’s worth mentioning that dependency sets support the same fileMode and directoryMode configuration options that file sets do, though you should remember that the directoryMode setting will only be used when dependencies are unpacked.
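A minimal sketch of these two options on an unpacked dependency set follows. The octal permission values are examples only, chosen to make files world-readable and directories traversable.

<assembly>
  ...
  <dependencySets>
    <dependencySet>
      <scope>runtime</scope>
      <unpack>true</unpack>
      <!-- UNIX-style octal permissions for unpacked files -->
      <fileMode>0644</fileMode>
      <!-- only takes effect because the dependencies are unpacked -->
      <directoryMode>0755</directoryMode>
    </dependencySet>
  </dependencySets>
  ...
</assembly>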