Files
bower/README.md
2013-04-19 01:06:20 +01:00

271 lines
14 KiB
Markdown

# Bower rewrite
## Why?
Bower codebase is becoming unmanageable, especially at its core.
Main issues are:
- __No separation of concerns. The overall codebase has grown in a patch fashion, which has lead to a bloated and tight coupled solution.__
- __Monolithic Package.js that handles all package types (both local and remote `Git`, URL, local files, etc).__
- __Package.js has a big nesting level of callbacks, causing confusion and making the code hard to read.__
- Some commands, such as install and update, have incorrect behaviour ([#200](https://github.com/twitter/bower/issues/200), [#256](https://github.com/twitter/bower/issues/256))
- This is directly related with the current implementation of bower core: Package.js and Manager.js
- Programmatic usage needs improvement
- Unable to spawn multiple commands in parallel in different folders
- Some commands simply do not fire the `end` event
- Others fire the `error` event many times
- Some commands should fire more meaningful events (e.g.: install should fire each installed package)
## Main goals
- Ease the process of gathering more contributors.
- Clear architecture and separation of concerns.
- Installation/update speedup.
- Named endpoints on the CLI install.
- Offline installation of packages, thanks to the cache.
- Ability to easily add package types (`SVN`, etc).
- Support for commit hashes and branches in targets for `Git` endpoints.
- Improved output after installation/update.
- Integrate with update-notifier and yeomen insight.
## Implementation details
### Term dictionary
- **Canonical package:** A folder containing all the files that belong to a package. May include a `bower.json` file inside. (typically what gets installed)
- **Source:** URL, git endpoint, etc.
- **Target:** `semver` range, commit hash, branch (indicates a version).
- **Endpoint:** source#target
- **Named endpoint:** name@endpoint#target
- **UoW:** Unit of Work
- **Components folder:** The folder in which components are installed (`bower_components` by default).
- **Package meta:** A data structure similar to the one found in `bower.json`, which might also contain additional information. This is usually stored in a `.bower.json` file.
### Overall strategy
![Really nicely drawn architecture diagram](http://f.cl.ly/items/44271M0R1O012H2m4234/resolve_diagram.png "Don't over think it! We already did! :P")
Bower is composed of the following components:
- `CLI`: Command line interface for Bower.
- `.bowerrc`: Allows for customisations of Bower behaviour at the project/user level.
- `bower.json`: Main purpose is to declare the component dependencies and other component related information.
- `Manager`: Main coordinator, responsible for:
- Checking which packages are already installed in the current `bower folder`.
- Deciding which version of the dependencies should be fetched from the `PackageRepository`, while keeping every dependant compatible (note that the `Manager` is `server` aware).
- Tracking which dependencies have been fetched, which ones failed to fetch, and which ones are being fetched.
- Requesting the `PackageRepository` to fail-fast, in case it realises there is no resolution for the current dependency tree.
- `PackageRepository`: Abstraction to the underlying complexity of heterogeneous source types. Responsible for:
- Storing new entries in `ResolveCache`.
- Queueing resolvers into the `UoW`, if no suitable entry is found in the `ResolveCache`.
- `ResolveCache`: Keeps a cache of previously resolved endpoints. Lookup can be done using an endpoint.
- `UnitOfWork`: Work coordinator, responsible for:
- Keeping track of which resolvers are being resolved.
- Limiting amount of parallel resolutions.
- `ResolverFactory`: Parses an endpoint and returns a `Resolver` capable of resolving the source type.
- `Resolver`: Base resolver, which can be extended by concrete resolvers, like `UrlResolver`, `GitRemoteResolver`, etc.
You can find additional details about each of these components below, in [Architecture components details](#architecture-components-details).
#### Resolve process
Here's an overview of the dependency resolve process:
1. **INSTALL/UPDATE** - A set of named endpoints and/or endpoints is requested to be installed/updated, and these are passed to the `Manager`.
2. **ANALIZE COMPONENTS FOLDER** - `Manager` starts by reading the *components folder* and understanding which packages are already installed.
3. **ENQUEUE ENDPOINTS** - For each endpoint that should be fetched, the `Manager` enqueues the *named endpoints*/endpoints in the `PackageRepository`. Some considerations:
- If a package should be fetched or not depends on the following conditions:
- What operation is being done (install/update).
- If package is already installed.
- If `Manager` has already enqueued that *named endpoint*/endpoint in the current runtime (regardless of the fetch being currently in progress, already complete, or failed).
- Additional flags (force, etc).
4. **FABRICATE RESOLVERS** - For each of the endpoints, the `PackageRepository` requests the `ResolverFactory` for suitable resolvers, capable of handling the source type. Some considerations:
- The factory method takes the source string as its main argument, and is also provided with target, and package name if possible (all this information is extracted from the *named endpoint*/endpoint).
- This method is asynchronous, in order to allow for I/O operations to happen, without blocking the whole process (e.g., querying registry, etc).
- There is a runtime internal cache of sources that have already been analysed, and what type of `Resolver` resulted from that analysis. This speeds up the decision process, particularly for aliases (registered packages), and published packages, which would required HTTP requests.
5. **LOOKUP CACHE** - `PackageRepository` looks up the `ResolveCache` using the endpoint, for a cached *canonical package* that complies to the endpoint target. Some considerations:
- The lookup is performed using an endpoint that is fetched from the `Resolver`. This allows the resolver to guarantee that the endpoint has been normalised (twitter/bootstrap -> git://github.com/twitter/bootstrap.git, etc).
- The `ResolveCache` is `semver` aware. What this means, is that if you try to lookup `~1.2.1`, and the cache has a entries for versions `1.2.3` and `1.2.4`, it will give a hit with `1.2.4`.
6. **CACHE HIT VALIDATION** - At this stage, and only for the cache hits, the `PackageRepository` will question the `Resolver` if there is any version higher than the one fetched from cache that also complies with the endpoint target. Some considerations:
- This step is ignored in case a flag like `offline` is passed.
- How the `Resolver` checks this, depends on the `Resolver` type. (e.g. `GitRemoteResolver` would fetch the git refs with `git ls-remote --tags --heads`, and check if there is a higher version that complies with the target).
- This check should be as quick as possible. If the process of checking a new version is too slow, it's preferable to just assume there is a new version.
- If there is no way to check if there is a higher version, assume that there is.
- If the `Resolver` indicates that the cached version is outdated, then it is treated as a cache miss.
7. **RESOLVE CACHE MISSES** - Any cache miss needs to be resolved, so the `PackageRepository` requests each of the remaining resolvers to resolve, and waits.
8. **CACHE RESOLVED PACKAGES** - As the resolvers complete the resolution, the `PackageRepository` stores the canonic packages in the `ResolveCache`, along with the source, version, and any additional information that the `Resolver` provides. This allows resolvers to store additional details about the fetched package to be used for future *cache hit validations* (e.g. store HTTP expiration headers in the case of the `UrlPackage`).
9. **RETURN PACKAGE TO MANAGER** - The `PackageRepository` returns the canonical package to the `Manager`.
10. **EVALUATE RESOLVED PACKAGE DEPENDENCIES** - The `Manager` checks if the returned canonical packages have a `bower.json` file describing additional dependencies and, if so, continue in point #3. If there are no more unresolved dependencies, finish up the installation procedure.
-----
### Architecture components details
#### Manager
TODO
#### PackageRepository
TODO
#### ResolverFactory
Simple function that takes a *named endpoint*/endpoint with options and creates an instance of a concrete `Resolver` that obeys the base `Resolver` interface.
```js
function createResolver(endpoint, options) -> Promise
```
This function will perform transformations/normalisations to the endpoint, like expanding shorthand andendpoints.
The function is async to allow querying the Bower registry, etc.
#### ResolveCache
TODO
#### Resolver
The `Resolver` class extends `EventEmitter`.
Think of it as an abstract class that implements the resolver interface as well as serving as a base for other resolver types.
Resolvers are responsible for the following:
- Based on an endpoint, fetch the contents of the package into a temporary folder (step is implemented by the `_resolveSelf()` method).
- After the package is fetched, the `bower.json`/`component.json` (deprecated) file is read, validated and normalised (fill in properties) into a `package meta` object. If the file does not exist, a base one is inferred. Note that this should be done using a node module that is common for both the Bower client and the server.
- Update any relevant information based on the `package meta` (e.g. this step may emit a `name_change`).
- Attach any additional meta data to the `package meta`. (e.g. the `UrlResolver` might store some `HTTP` response headers, to aid the `hasNew()` decision later on).
- Applying the `ignore` constraint based on the `package meta`. Files are effectively removed in this step.
- Storing the `package meta` into a `.bower.json` hidden file.
##### Events
- `name_change`: fired when the name of the package has changed
- `action`: fired to inform the current action being performed by the resolver
- `warn`: fired to inform a warning, e.g.: deprecation
##### Constructor
`Resolver(source, options)`
Options:
- `name` - the name (if none is passed, one will be guessed from the source)
- `target` - the target (defaults to *)
- `config` - the config to use (defaults to the global config)
------------
##### Public functions
`Resolver#getName()`: String
Returns the name.
`Resolver#getSource()`: String
Returns the source.
`Resolver#getTarget()`: String
Returns the target.
`Resolver#getTempDir()`: String
Returns the local temporary folder into which the package is being fetched. The files will remain here until the folder is moved when installing.
`Resolver#hasNew(canonicalPackage)`: Promise
Checks if there is a version more recent than the provided `canonicalPackage` (folder) that complies with the resolver target.
`Resolver#resolve()`: Promise
Resolves the resolver, and returns a promise of a canonical package.
The resolve process is as follows:
- calls `_createTempDir()` and waits.
- When done, calls `_resolveSelf()` and waits.
- When done, calls `_readJson()` and waits (validation and normalisation also happens here).
- When done, calls `_decoratePkgMeta()`, giving the resolver the chance to attach additional information about the resolved package (`HTTP` headers, etc).
- When done, calls both, and waits:
- `_applyPkgMeta(meta)`
- `_savePkgMeta(meta)`
- When done, resolves the promise with the *temp dir*, which is now a canonical package.
`Resolver#getPackageMeta()`: Object
Get the `package meta`. Essentially, it's what you'll find in `.bower.json`.
Throws an error if the resolver is not yet resolved.
-----------
##### Protected functions
`Resolver#_createTempDir()`: Promise
Creates a temporary dir.
`Resolver#_readJson()`: Promise
Reads `bower.json`/`component.json`, possibly by using a dedicated `read-json` node module that will be available in the Bower organisation.
This method also normalises the `package meta`, filling in any missing information, inferring when possible.
`Resolver#_decoratePkgMeta(meta)`: Promise
Decorates the `package meta` with any additional information that might be relevant to be stored. A `UrlResolver` could, for example, store some `HTTP` headers, that would be useful when comparing versions, in the `hasNew()` method.
`Resolver#_applyPkgMeta(meta)`: Promise
Since the `package meta` might contain some information that has implications to the *canonical* state of the package, this is where these rules are enforced.
- Checks if the resolver name is different from the json one. If so and if the name was "guessed", the name of the package will be updated and a `name_change` event will be emitted.
- Deletes files that are specified in the `ignore` property of the json from the temporary directory.
`Resolver#_savePkgMeta(meta)`: Promise
--------
Abstract functions that must be implemented by concrete resolvers.
##### Resolver#_resolveSelf() -> Promise
Resolves self. This method should be implemented by the concrete resolvers. For instance, the UrlResolver would download the contents of a URL into the temporary directory.
#### Types of Resolvers
The following resolvers will extend from `Resolver.js` and will obey its interface.
- `LocalResolver` extends `Resolver` (dependencies pointing to files of folders in the own system)
- `UrlResolver` extends `Resolver` (dependencies pointing to downloadable resources)
- `GitFsResolver` extends `Resolver` (git dependencies available in the local file system)
- `GitRemoteResolver` extends `Resolver` or `GitFsResolver` (remote git dependencies)
- `PublishedResolver` extends `Resolver` (? makes sense if bower supports a publish model, just like `npm`).
These type of resolvers will be known and created (instantiated) by the `ResolverFactory`.
This architecture will make it very easy for the community to create others package types, for instance, a `MercurialLocalPackage`, `MercurialRemotePackage`, `SvnResolver`, etc.
#### Unit of work
TODO