Bower rewrite
Why?
The Bower codebase is becoming unmanageable, especially at its core. The main issues are:
- No separation of concerns. The overall codebase has grown in a patch fashion, which has led to a bloated and tightly coupled solution.
- Monolithic Package.js that handles all package types (both local and remote Git, URL, local files, etc).
- Package.js has a deep nesting level of callbacks, causing confusion and making the code hard to read.
- Some commands, such as install and update, have incorrect behaviour (#200, #256)
  - This is directly related to the current implementation of the Bower core: Package.js and Manager.js
- Programmatic usage needs improvement
  - Unable to spawn multiple commands in parallel in different folders
  - Some commands simply do not fire the end event
  - Others fire the error event many times
  - Some commands should fire more meaningful events (e.g.: install should fire an event for each installed package)
Main goals
- Ease the process of gathering more contributors.
- Clear architecture and separation of concerns.
- Installation/update speedup.
- Named endpoints on the CLI install.
- Offline installation of packages, thanks to the cache.
- Ability to easily add package types (SVN, etc).
- Support for commit hashes and branches in targets for Git endpoints.
- Improved output after installation/update.
- Integrate with update-notifier and Yeoman Insight.
Implementation details
Term dictionary
- Canonical package: A folder containing all the files that belong to a package. May include a bower.json file inside. (typically what gets installed)
- Source: URL, git endpoint, etc.
- Target: semver range, commit hash, branch (indicates a version).
- Endpoint: source#target
- Named endpoint: name@endpoint#target
- UoW: Unit of Work
- Components folder: The folder in which components are installed (bower_components by default).
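
To make the terms above concrete, here is a minimal sketch of splitting an endpoint or named endpoint into its parts. The function name parseEndpoint and the shape of the returned object are hypothetical, and the sketch reads the notation as name@source#target. It assumes that neither the name nor the source contains an "@", so SSH-style git sources are not handled.

```javascript
// Hypothetical helper, not part of Bower: splits "name@source#target".
// Assumes neither the name nor the source contains an "@" of its own.
function parseEndpoint(endpoint) {
    var name = null;
    var rest = endpoint;
    var at = rest.indexOf('@');

    // Everything before the first "@" is taken as the name
    if (at !== -1) {
        name = rest.slice(0, at);
        rest = rest.slice(at + 1);
    }

    // The target is whatever follows the last "#"
    var hash = rest.lastIndexOf('#');
    var source = hash === -1 ? rest : rest.slice(0, hash);
    var target = hash === -1 ? '' : rest.slice(hash + 1);

    // An empty or missing target falls back to "*" (any version)
    return { name: name, source: source, target: target || '*' };
}

console.log(parseEndpoint('jquery#~1.9.0'));
console.log(parseEndpoint('bootstrap@twitter/bootstrap#2.3.1'));
```

The "*" fallback mirrors the default target mentioned for the Resolver options later in this document.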
Overall strategy
Bower is composed of the following components:
- CLI: Command line interface for Bower.
- .bowerrc: Allows for customisations of Bower behaviour at the project/user level.
- bower.json: Main purpose is to declare the component dependencies and other component related information.
- Manager: Main coordinator, responsible for:
  - Checking which packages are already installed in the current bower folder.
  - Deciding which version of the dependencies should be fetched from the PackageRepository, while keeping every dependant compatible (note that the Manager is semver aware).
  - Tracking which dependencies have been fetched, which ones failed to fetch, and which ones are being fetched.
  - Requesting the PackageRepository to fail-fast, in case it realises there is no resolution for the current dependency tree.
- PackageRepository: Abstraction to the underlying complexity of heterogeneous source types. Responsible for:
  - Storing new entries in ResolveCache.
  - Queueing resolvers into the UoW, if no suitable entry is found in the ResolveCache.
- ResolveCache: Keeps a cache of previously resolved endpoints. Lookup can be done using an endpoint.
- UnitOfWork: Work coordinator, responsible for:
  - Keeping track of which resolvers are being resolved.
  - Limiting amount of parallel resolutions.
- ResolverFactory: Parses an endpoint and returns a Resolver capable of resolving the source type.
- Resolver: Base resolver, which can be extended by concrete resolvers, like UrlResolver, GitRemoteResolver, etc.
You can find additional details about each of these components below, in Architecture components details.
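
As an illustration of the UnitOfWork responsibility of limiting parallel resolutions, the following is a rough sketch of one possible approach. The enqueue and _next method names, the constructor argument, and the use of native promises are all assumptions made for this example, not a description of the actual implementation.

```javascript
// Hypothetical sketch of a UnitOfWork that caps concurrent resolutions.
function UnitOfWork(maxConcurrent) {
    this._max = maxConcurrent;
    this._running = 0;
    this._queue = [];
}

// Enqueues a task (a function returning a promise). The returned promise
// settles with the task's outcome once a free slot allowed it to run.
UnitOfWork.prototype.enqueue = function (task) {
    var that = this;

    return new Promise(function (resolve, reject) {
        that._queue.push({ task: task, resolve: resolve, reject: reject });
        that._next();
    });
};

// Starts queued tasks while there are free slots.
UnitOfWork.prototype._next = function () {
    var that = this;

    while (this._running < this._max && this._queue.length) {
        this._running += 1;
        runEntry(this._queue.shift());
    }

    function runEntry(entry) {
        Promise.resolve()
        .then(entry.task)
        .then(entry.resolve, entry.reject)
        .then(function () {
            // Free the slot and let the next queued task start
            that._running -= 1;
            that._next();
        });
    }
};

// Usage: four fake resolutions, at most two running at any time.
var uow = new UnitOfWork(2);

function fakeResolution(name) {
    return function () {
        return new Promise(function (resolve) {
            setTimeout(function () { resolve(name); }, 10);
        });
    };
}

Promise.all(['a', 'b', 'c', 'd'].map(function (name) {
    return uow.enqueue(fakeResolution(name));
})).then(function (names) {
    console.log('resolved:', names);
});
```

Keeping the queue inside the coordinator means callers only ever see a promise per resolution and do not need to know about the limit.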
Resolve process
Here's an overview of the dependency resolve process:
1. INSTALL/UPDATE - A set of named endpoints and/or endpoints is requested to be installed/updated, and these are passed to the Manager.
2. ANALYSE COMPONENTS FOLDER - Manager starts by reading the components folder and understanding which packages are already installed.
3. ENQUEUE ENDPOINTS - For each endpoint that should be fetched, the Manager enqueues the named endpoints/endpoints in the PackageRepository. Some considerations:
   - Whether a package should be fetched or not depends on the following conditions:
     - What operation is being done (install/update).
     - If package is already installed.
     - If Manager has already enqueued that named endpoint/endpoint in the current runtime (regardless of the fetch being currently in progress, already complete, or failed).
     - Additional flags (force, etc).
4. FABRICATE RESOLVERS - For each of the endpoints, the PackageRepository asks the ResolverFactory for suitable resolvers, capable of handling the source type. Some considerations:
   - The factory method takes the source string as its main argument, and is also provided with target, and package name if possible (all this information is extracted from the named endpoint/endpoint).
   - This method is asynchronous, in order to allow for I/O operations to happen, without blocking the whole process (e.g., querying registry, etc).
   - There is a runtime internal cache of sources that have already been analysed, and what type of Resolver resulted from that analysis. This speeds up the decision process, particularly for aliases (registered packages), and published packages, which would require HTTP requests.
5. LOOKUP CACHE - PackageRepository looks up the ResolveCache using the endpoint, for a cached canonical package that complies with the endpoint target. Some considerations:
   - The lookup is performed using an endpoint that is fetched from the Resolver. This allows the resolver to guarantee that the endpoint has been normalised (twitter/bootstrap -> git://github.com/twitter/bootstrap.git, etc).
   - The ResolveCache is semver aware. What this means is that if you try to lookup ~1.2.1, and the cache has entries for versions 1.2.3 and 1.2.4, it will give a hit with 1.2.4.
6. CACHE HIT VALIDATION - At this stage, and only for the cache hits, the PackageRepository will ask the Resolver whether there is any version higher than the one fetched from cache that also complies with the endpoint target. Some considerations:
   - This step is ignored in case a flag like offline is passed.
   - How the Resolver checks this depends on the Resolver type (e.g. GitRemoteResolver would fetch the git refs with git ls-remote --tags --heads, and check if there is a higher version that complies with the target).
   - This check should be as quick as possible. If the process of checking a new version is too slow, it's preferable to just assume there is a new version.
   - If there is no way to check if there is a higher version, assume that there is.
   - If the Resolver indicates that the cached version is outdated, then it is treated as a cache miss.
7. RESOLVE CACHE MISSES - Any cache miss needs to be resolved, so the PackageRepository requests each of the remaining resolvers to resolve, and waits.
8. CACHE RESOLVED PACKAGES - As the resolvers complete the resolution, the PackageRepository stores the canonical packages in the ResolveCache, along with the source, version, and any additional information that the Resolver provides. This allows resolvers to store additional details about the fetched package to be used for future cache hit validations (e.g. store HTTP expiration headers in the case of the UrlResolver).
9. RETURN PACKAGE TO MANAGER - The PackageRepository returns the canonical package to the Manager.
10. EVALUATE RESOLVED PACKAGE DEPENDENCIES - The Manager checks if the returned canonical packages have a bower.json file describing additional dependencies and, if so, continues from point #3. If there are no more unresolved dependencies, it finishes up the installation procedure.
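
The semver-aware lookup described in the LOOKUP CACHE step can be sketched roughly as follows. This is a deliberately small illustration: it only understands plain x.y.z versions and tilde ranges such as ~1.2.1, and every function name in it is hypothetical. A real implementation would more likely rely on a dedicated semver library.

```javascript
// Parses "1.2.3" into [1, 2, 3]; pre-release tags are not handled here.
function parseVersion(version) {
    return version.split('.').map(Number);
}

// Compares two parsed versions: negative if a < b, positive if a > b.
function compareVersions(a, b) {
    for (var i = 0; i < 3; i += 1) {
        if (a[i] !== b[i]) {
            return a[i] - b[i];
        }
    }
    return 0;
}

// Tilde ranges only: "~1.2.1" means >=1.2.1 and <1.3.0.
function satisfiesTilde(version, range) {
    var v = parseVersion(version);
    var min = parseVersion(range.slice(1));

    return v[0] === min[0] && v[1] === min[1] && compareVersions(v, min) >= 0;
}

// Returns the highest cached version that satisfies the range, or null
// when nothing matches (a cache miss).
function lookup(cachedVersions, range) {
    var hits = cachedVersions.filter(function (version) {
        return satisfiesTilde(version, range);
    });

    // Sort in descending order so that the best hit comes first
    hits.sort(function (a, b) {
        return compareVersions(parseVersion(b), parseVersion(a));
    });

    return hits.length ? hits[0] : null;
}

console.log(lookup(['1.2.3', '1.2.4', '1.3.0'], '~1.2.1')); // prints 1.2.4
```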
Architecture components details
Manager
TODO
PackageRepository
TODO
ResolverFactory
Simple function that takes a named endpoint/endpoint with options and creates an instance of a Resolver that obeys the base Resolver interface.
function createResolver(endpoint, options) -> Promise
This function could perform transformations/normalisations on the endpoint tuple.
For instance, if the endpoint is a shorthand, it would expand it.
The function is asynchronous so that it can query the Bower registry to find out the real endpoint.
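
A minimal sketch of what such a factory could look like is shown below. For brevity it treats the endpoint as a plain source string. The two constructors are bare stand-ins for the concrete resolvers, lookupRegistry is a hypothetical helper backed by fake in-memory data instead of an HTTP request, and the detection rules are simplified assumptions (local paths, for example, are not detected).

```javascript
// Stand-ins for the concrete resolvers; only here to keep the sketch runnable.
function GitRemoteResolver(source, options) {
    this.source = source;
    this.options = options;
}

function UrlResolver(source, options) {
    this.source = source;
    this.options = options;
}

// Hypothetical registry lookup using fake in-memory data.
function lookupRegistry(name) {
    var known = { bootstrap: 'git://github.com/twitter/bootstrap.git' };

    return Promise.resolve(known[name] || null);
}

function createResolver(endpoint, options) {
    // Remote git sources: git:// style URLs or anything ending in .git
    if (/^git(\+(ssh|https?))?:\/\//i.test(endpoint) || /\.git\/?$/i.test(endpoint)) {
        return Promise.resolve(new GitRemoteResolver(endpoint, options));
    }

    // Any other http(s) URL is treated as a downloadable resource
    if (/^https?:\/\//i.test(endpoint)) {
        return Promise.resolve(new UrlResolver(endpoint, options));
    }

    // Shorthand (owner/repo) is expanded to a full git endpoint
    if (/^[\w.-]+\/[\w.-]+$/.test(endpoint)) {
        var expanded = 'git://github.com/' + endpoint + '.git';
        return Promise.resolve(new GitRemoteResolver(expanded, options));
    }

    // Otherwise assume a registered name and ask the registry
    return lookupRegistry(endpoint).then(function (realEndpoint) {
        if (!realEndpoint) {
            throw new Error('Could not find a source for ' + endpoint);
        }
        return createResolver(realEndpoint, options);
    });
}

createResolver('twitter/bootstrap', {}).then(function (resolver) {
    console.log(resolver.constructor.name, resolver.source);
});
```

Returning a promise from every branch keeps the caller's code identical whether or not a registry query was needed.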
ResolveCache
TODO
Resolver
The Resolver class extends EventEmitter.
Think of it as an abstract class that implements the resolver interface as well as serving as a base for other resolver types.
Events
- name_change: fired when the name of the package has changed
- action: fired to inform the current action being performed by the resolver
- warn: fired to inform a warning, e.g.: deprecation
Constructor
Resolver(source, options)
Options:
- name - the name (if none is passed, one will be guessed from the endpoint)
- target - the target (defaults to *)
- config - the config to use (defaults to the global config)
Public functions
Resolver#getName() -> String
Returns the name.
Resolver#getSource() -> String
Returns the source.
Resolver#getTarget() -> String
Returns the target.
Resolver#getTempDir() -> String
Returns the temporary directory that the resolver can use to resolve itself.
Resolver#hasNew(oldVersion, oldResolution) -> Promise
Checks if there is a new version. Takes the old version and resolution to be used when comparing.
Resolves to a boolean when done.
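
As a rough illustration, a URL-based resolver might decide this from expiration data stored with the previous resolution. In the sketch below, the function name urlHasNew, the expires field and the extra now parameter (added only to make the logic easy to exercise) are all assumptions.

```javascript
// Hypothetical hasNew logic for a URL-based resolver.
// `oldResolution.expires` is an assumed field: the HTTP expiration time
// (milliseconds since the epoch) stored when the package was cached.
// `oldVersion` is unused here because plain URLs carry no version.
function urlHasNew(oldVersion, oldResolution, now) {
    now = typeof now === 'number' ? now : Date.now();

    // No expiration data: there is no way to check, so assume that a
    // new version exists.
    if (!oldResolution || typeof oldResolution.expires !== 'number') {
        return Promise.resolve(true);
    }

    // The cached copy counts as outdated once it has expired
    return Promise.resolve(now >= oldResolution.expires);
}

urlHasNew(undefined, { expires: Date.now() + 60000 }).then(function (hasNew) {
    console.log('has new version:', hasNew);
});
```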
Resolver#resolve() -> Promise
Resolves the resolver. The resolve process obeys a very explicit flow:
- Calls #_createTempDir and waits
- When done, calls #_resolveSelf and waits
- When done, calls #_readJson and waits
- When done, calls #_parseJson and waits
- When done, resolves the promise with the resolution.
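
The flow above could be sketched along these lines. This is only an outline under several assumptions: native promises are used, the resolution is taken to be the temporary directory path, the name is guessed from the last path segment of the source, and handling of the ignore property is left out. FakeResolver is a stand-in concrete resolver that exists purely for the usage example.

```javascript
var EventEmitter = require('events').EventEmitter;
var util = require('util');
var fs = require('fs');
var os = require('os');
var path = require('path');

function Resolver(source, options) {
    options = options || {};
    EventEmitter.call(this);
    this._source = source;
    this._target = options.target || '*';
    // Guess the name from the source when none is given (assumed rule)
    this._guessedName = !options.name;
    this._name = options.name || path.basename(source);
}
util.inherits(Resolver, EventEmitter);

// Runs the steps strictly in sequence, each one waiting for the previous.
Resolver.prototype.resolve = function () {
    var that = this;

    return this._createTempDir()
    .then(function () { return that._resolveSelf(); })
    .then(function () { return that._readJson(); })
    .then(function (json) { return that._parseJson(json); })
    .then(function () { return that._tempDir; });
};

Resolver.prototype._createTempDir = function () {
    var that = this;

    return fs.promises.mkdtemp(path.join(os.tmpdir(), 'resolver-'))
    .then(function (dir) { that._tempDir = dir; });
};

// Abstract: concrete resolvers must override this.
Resolver.prototype._resolveSelf = function () {
    return Promise.reject(new Error('_resolveSelf not implemented'));
};

// Reads bower.json from the temporary directory; if the file cannot be
// read, an empty object is used instead.
Resolver.prototype._readJson = function () {
    return fs.promises.readFile(path.join(this._tempDir, 'bower.json'), 'utf8')
    .then(function (contents) { return JSON.parse(contents); },
        function () { return {}; });
};

// Updates a guessed name from the json and emits name_change if it differs.
Resolver.prototype._parseJson = function (json) {
    if (this._guessedName && json.name && json.name !== this._name) {
        this._name = json.name;
        this.emit('name_change', this._name);
    }
    this._json = json;
    return Promise.resolve();
};

// Usage with a stand-in concrete resolver that just writes a bower.json.
function FakeResolver(source, options) {
    Resolver.call(this, source, options);
}
util.inherits(FakeResolver, Resolver);

FakeResolver.prototype._resolveSelf = function () {
    var file = path.join(this._tempDir, 'bower.json');
    return fs.promises.writeFile(file, JSON.stringify({ name: 'foo' }));
};

var resolver = new FakeResolver('/tmp/some-folder');
resolver.on('name_change', function (name) { console.log('name changed to', name); });
resolver.resolve().then(function (dir) { console.log('resolved into', dir); });
```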
Resolver#getJson() -> Object
Get the bower.json of the resolved package.
Throws an error if the resolver is not yet resolved.
Protected functions
Resolver#_createTempDir() -> Promise
Creates a temporary dir.
Resolver#_readJson() -> Promise
Reads bower.json, possibly by using a dedicated read-json package that will be available in the Bower organization. It will ensure everything is valid.
Resolver#_parseJson(json) -> Promise
Parses the json:
- Checks if the resolver name is different from the json one. If so and if the name was "guessed", the name of the package will be updated and a name_change event will be emitted.
- Deletes files that are specified in the ignore property of the json from the temporary directory.
Abstract functions that must be implemented by concrete resolvers.
Resolver#_resolveSelf() -> Promise
Resolves self. This method should be implemented by the concrete resolvers. For instance, the UrlResolver would download the contents of a URL into the temporary directory.
Types of Resolvers
The following resolvers will extend from Resolver.js and will obey its interface.
- LocalResolver extends Resolver (dependencies pointing to files or folders in the local system)
- UrlResolver extends Resolver (dependencies pointing to downloadable resources)
- GitFsResolver extends Resolver (git dependencies available in the local file system)
- GitRemoteResolver extends Resolver or GitFsResolver (remote git dependencies)
- PublishedResolver extends Resolver (? makes sense if bower supports a publish model, just like npm).
These types of resolvers will be known and created (instantiated) by the ResolverFactory.
This architecture will make it very easy for the community to create other package types, for instance, a MercurialLocalPackage, MercurialRemotePackage, SvnResolver, etc.
Unit of work
TODO
