Files
bower/README.md
2013-04-19 01:59:28 +01:00

15 KiB

Bower rewrite

Why?

Bower codebase is becoming unmanageable, especially at its core. Main issues are:

  • No separation of concerns. The overall codebase has grown in a patch fashion, which has lead to a bloated and tight coupled solution.
  • Monolithic Package.js that handles all package types (both local and remote Git, URL, local files, etc).
  • Package.js has a big nesting level of callbacks, causing confusion and making the code hard to read.
  • Some commands, such as install and update, have incorrect behaviour (#200, #256)
    • This is directly related with the current implementation of bower core: Package.js and Manager.js
  • Programmatic usage needs improvement
    • Unable to spawn multiple commands in parallel in different folders
    • Some commands simply do not fire the end event
    • Others fire the error event many times
    • Some commands should fire more meaningful events (e.g.: install should fire each installed package)

Main goals

  • Ease the process of gathering more contributors.
  • Clear architecture and separation of concerns.
  • Installation/update speedup.
  • Named endpoints on the CLI install.
  • Offline installation of packages, thanks to the cache.
  • Ability to easily add package types (SVN, etc).
  • Support for commit hashes and branches in targets for Git endpoints.
  • Improved output after installation/update.
  • Integrate with update-notifier and yeomen insight.

Implementation details

Term dictionary

  • Canonical package: A folder containing all the files that belong to a package. May include a bower.json file inside. (typically what gets installed)
  • Source: URL, git endpoint, etc.
  • Target: semver range, commit hash, branch (indicates a version).
  • Endpoint: source#target
  • Named endpoint: name@endpoint#target
  • UoW: Unit of Work
  • Components folder: The folder in which components are installed (bower_components by default).
  • Package meta: A data structure similar to the one found in bower.json, which might also contain additional information. This is usually stored in a .bower.json file.

Overall strategy

Really nicely drawn architecture diagram

Bower is composed of the following components:

  • CLI: Command line interface for Bower.
  • .bowerrc: Allows for customisations of Bower behaviour at the project/user level.
  • bower.json: Main purpose is to declare the component dependencies and other component related information.
  • Manager: Main coordinator, responsible for:
    • Checking which packages are already installed in the current bower folder.
    • Deciding which version of the dependencies should be fetched from the PackageRepository, while keeping every dependant compatible (note that the Manager is server aware).
    • Tracking which dependencies have been fetched, which ones failed to fetch, and which ones are being fetched.
    • Requesting the PackageRepository to fail-fast, in case it realises there is no resolution for the current dependency tree.
  • PackageRepository: Abstraction to the underlying complexity of heterogeneous source types. Responsible for:
    • Storing new entries in ResolveCache.
    • Queueing resolvers into the UoW, if no suitable entry is found in the ResolveCache.
  • ResolveCache: Keeps a cache of previously resolved endpoints. Lookup can be done using an endpoint.
  • UnitOfWork: Work coordinator, responsible for:
    • Keeping track of which resolvers are being resolved.
    • Limiting amount of parallel resolutions.
  • ResolverFactory: Parses an endpoint and returns a Resolver capable of resolving the source type.
  • Resolver: Base resolver, which can be extended by concrete resolvers, like UrlResolver, GitRemoteResolver, etc.

You can find additional details about each of these components below, in Architecture components details.

Resolve process

Here's an overview of the dependency resolve process:

  1. INSTALL/UPDATE - A set of named endpoints and/or endpoints is requested to be installed/updated, and these are passed to the Manager.

  2. ANALIZE COMPONENTS FOLDER - Manager starts by reading the components folder and understanding which packages are already installed.

  3. ENQUEUE ENDPOINTS - For each endpoint that should be fetched, the Manager enqueues the named endpoints/endpoints in the PackageRepository. Some considerations:

    • If a package should be fetched or not depends on the following conditions:
      • What operation is being done (install/update).
      • If package is already installed.
      • If Manager has already enqueued that named endpoint/endpoint in the current runtime (regardless of the fetch being currently in progress, already complete, or failed).
      • Additional flags (force, etc).
  4. FABRICATE RESOLVERS - For each of the endpoints, the PackageRepository requests the ResolverFactory for suitable resolvers, capable of handling the source type. Some considerations:

    • The factory method takes the source string as its main argument, and is also provided with target, and package name if possible (all this information is extracted from the named endpoint/endpoint).
    • This method is asynchronous, in order to allow for I/O operations to happen, without blocking the whole process (e.g., querying registry, etc).
    • There is a runtime internal cache of sources that have already been analysed, and what type of Resolver resulted from that analysis. This speeds up the decision process, particularly for aliases (registered packages), and published packages, which would required HTTP requests.
  5. LOOKUP CACHE - PackageRepository looks up the ResolveCache using the endpoint, for a cached canonical package that complies to the endpoint target. Some considerations:

    • The lookup is performed using an endpoint that is fetched from the Resolver. This allows the resolver to guarantee that the endpoint has been normalised (twitter/bootstrap -> git://github.com/twitter/bootstrap.git, etc).
    • The ResolveCache is semver aware. What this means, is that if you try to lookup ~1.2.1, and the cache has a entries for versions 1.2.3 and 1.2.4, it will give a hit with 1.2.4.
  6. CACHE HIT VALIDATION - At this stage, and only for the cache hits, the PackageRepository will question the Resolver if there is any version higher than the one fetched from cache that also complies with the endpoint target. Some considerations:

    • This step is ignored in case a flag like offline is passed.
    • How the Resolver checks this, depends on the Resolver type. (e.g. GitRemoteResolver would fetch the git refs with git ls-remote --tags --heads, and check if there is a higher version that complies with the target).
    • This check should be as quick as possible. If the process of checking a new version is too slow, it's preferable to just assume there is a new version.
    • If there is no way to check if there is a higher version, assume that there is.
    • If the Resolver indicates that the cached version is outdated, then it is treated as a cache miss.
  7. RESOLVE CACHE MISSES - Any cache miss needs to be resolved, so the PackageRepository requests each of the remaining resolvers to resolve, and waits.

  8. CACHE RESOLVED PACKAGES - As the resolvers complete the resolution, the PackageRepository stores the canonic packages in the ResolveCache, along with the source, version, and any additional information that the Resolver provides. This allows resolvers to store additional details about the fetched package to be used for future cache hit validations (e.g. store HTTP expiration headers in the case of the UrlPackage).

  9. RETURN PACKAGE TO MANAGER - The PackageRepository returns the canonical package to the Manager.

  10. EVALUATE RESOLVED PACKAGE DEPENDENCIES - The Manager checks if the returned canonical packages have a bower.json file describing additional dependencies and, if so, continue in point #3. If there are no more unresolved dependencies, finish up the installation procedure.


Architecture components details

Manager

TODO

PackageRepository

Constructor

PackageRepository()

Public methods

PackageRepository#get(endpoint): Promise

Enqueues an endpoint to be fetched, and returns a promise of a canonical package.

PackageRepository#abort(): Promise

Aborts any queued package lookup as soon as possible, and returns a promise that everything has been aborted.

Protected methods

CONTINUE HERE

ResolverFactory

Simple function that takes a named endpoint/endpoint with options and creates an instance of a concrete Resolver that obeys the base Resolver interface.

function createResolver(endpoint, options) -> Promise

This function will perform transformations/normalisations to the endpoint, like expanding shorthand andendpoints. The function is async to allow querying the Bower registry, etc.

ResolveCache

TODO

Resolver

The Resolver class extends EventEmitter. Think of it as an abstract class that implements the resolver interface as well as serving as a base for other resolver types.

Resolvers are responsible for the following:

  • Based on an endpoint, fetch the contents of the package into a temporary folder (step is implemented by the _resolveSelf() method).
  • After the package is fetched, the bower.json/component.json (deprecated) file is read, validated and normalised (fill in properties) into a package meta object. If the file does not exist, a base one is inferred. Note that this should be done using a node module that is common for both the Bower client and the server.
  • Update any relevant information based on the package meta (e.g. this step may emit a name_change).
  • Attach any additional meta data to the package meta. (e.g. the UrlResolver might store some HTTP response headers, to aid the hasNew() decision later on).
  • Applying the ignore constraint based on the package meta. Files are effectively removed in this step.
  • Storing the package meta into a .bower.json hidden file.
Events
  • name_change: fired when the name of the package has changed
  • action: fired to inform the current action being performed by the resolver
  • warn: fired to inform a warning, e.g.: deprecation
Constructor

Resolver(source, options)

Options:

  • name - the name (if none is passed, one will be guessed from the source)
  • target - the target (defaults to *)
  • config - the config to use (defaults to the global config)

Public functions

Resolver#getName(): String

Returns the name.

Resolver#getSource(): String

Returns the source.

Resolver#getTarget(): String

Returns the target.

Resolver#getTempDir(): String

Returns the local temporary folder into which the package is being fetched. The files will remain here until the folder is moved when installing.

Resolver#hasNew(canonicalPackage): Promise

Checks if there is a version more recent than the provided canonicalPackage (folder) that complies with the resolver target.

Resolver#resolve(): Promise

Resolves the resolver, and returns a promise of a canonical package. The resolve process is as follows:

  • calls _createTempDir() and waits.
  • When done, calls _resolveSelf() and waits.
  • When done, calls _readJson() and waits (validation and normalisation also happens here).
  • When done, calls _decoratePkgMeta(), giving the resolver the chance to attach additional information about the resolved package (HTTP headers, etc).
  • When done, calls both, and waits:
    • _applyPkgMeta(meta)
    • _savePkgMeta(meta)
  • When done, resolves the promise with the temp dir, which is now a canonical package.

Resolver#getPackageMeta(): Object

Get the package meta. Essentially, it's what you'll find in .bower.json. Throws an error if the resolver is not yet resolved.


Protected functions

Resolver#_createTempDir(): Promise

Creates a temporary dir.

Resolver#_readJson(): Promise

Reads bower.json/component.json, possibly by using a dedicated read-json node module that will be available in the Bower organisation.

This method also normalises the package meta, filling in any missing information, inferring when possible.

Resolver#_decoratePkgMeta(meta): Promise

Decorates the package meta with any additional information that might be relevant to be stored. A UrlResolver could, for example, store some HTTP headers, that would be useful when comparing versions, in the hasNew() method.

Resolver#_applyPkgMeta(meta): Promise

Since the package meta might contain some information that has implications to the canonical state of the package, this is where these rules are enforced.

  • Checks if the resolver name is different from the json one. If so and if the name was "guessed", the name of the package will be updated and a name_change event will be emitted.
  • Deletes files that are specified in the ignore property of the json from the temporary directory.

Resolver#_savePkgMeta(meta): Promise

Stores the package meta into a .bower.json file inside the root of the package.


Abstract functions that must be implemented by concrete resolvers.

Resolver#_resolveSelf(): Promise

The actual process of fetching the package files. This method must he implemented by concrete resolvers. For instance, the UrlResolver would download the contents of a URL into the temporary directory in this stage.

Resolver types

The following resolvers will extend from Resolver.js and obey its interface.

  • LocalResolver extends Resolver (dependencies pointing to files of folders in the own system)
  • UrlResolver extends Resolver (dependencies pointing to downloadable resources)
  • GitFsResolver extends Resolver (git dependencies available in the local file system)
  • GitRemoteResolver extends Resolver or GitFsResolver (remote git dependencies)
  • PublishedResolver extends Resolver (? makes sense if bower supports a publish model, just like npm).

The ResolverFactory knows these types, and is able to fabricate suitable resolvers based on the source type.

This architecture makes it very easy for the community to create others package types, for instance, a MercurialFsResolver, MercurialResolver, SvnResolver, etc.

Unit of work

The work coordinator, responsible for keeping track of which resolvers are being resolved. The number of parallel resolutions may be limited and configured.


Constructor

UnitOfWork(options)

Options:

  • maxConcurrent: maximum number of concurrent resolvers running (defaults to 5)

Public methods.

UnitOfWork#enqueue(resolver): Promise

Enqueues a resolver to be ran. The promise is fulfilled when the resolver successfully resolved or is rejected if it failed to resolve.

UnitOfWork#has(resolver): Boolean

Checks if a resolver is already in the unit of work.

UnitOfWork#abort(): Promise

Aborts the current work being done, by removing any resolvers waiting to resolve. Note that resolvers running can't be aborted. Returns a promise that is fulfilled when the current running resolvers finish the resolve process. Any Resolver that was in queue to be ran is aborted immediately.