mirror of
https://github.com/NationalSecurityAgency/ghidra.git
synced 2026-01-08 05:34:00 -05:00
GP-4489: Add psutil and protobuf to downloads, dist. Build py packages for dist.
206
DevGuide.md
@@ -62,6 +62,11 @@ Build Javadoc:
|
||||
gradle createJavadocs
|
||||
```
|
||||
|
||||
Build Python3 packages for the Debugger:
|
||||
```
|
||||
gradle buildPyPackage
|
||||
```
|
||||
|
||||
Build Ghidra to `build/dist` in an uncompressed form. This will be a distribution intended only to
|
||||
run on the platform on which it was built.
|
||||
```
|
||||
@@ -182,13 +187,18 @@ If you'd like some details of our fine tuning, take a look at [building_fid.txt]
|
||||
|
||||
## Debugger Development
|
||||
|
||||
We have recently changed the Debugger's back-end architecture.
|
||||
We no longer use JNA to access native Debugger APIs.
|
||||
We only use it for pseudo-terminal access.
|
||||
Instead, we use Python3 and a protobuf-based TCP connection for back-end integration.
|
||||
|
||||
### Additional Dependencies
|
||||
|
||||
In addition to Ghidra's normal dependencies, you may want the following:
|
||||
|
||||
* WinDbg for Windows x64
|
||||
* GDB 8.0 or later for Linux amd64/x86_64
|
||||
* LLDB 13.0 for macOS
|
||||
* GDB 13 or later for Linux
|
||||
* LLDB 10 or later for macOS
|
||||
|
||||
The others (e.g., JNA) are handled by Gradle via Maven Central.
|
||||
|
||||
@@ -199,121 +209,137 @@ These all currently reside in the `Ghidra/Debug` directory, but will likely be r
|
||||
`Framework` and `Feature` directories later. Each project is listed "bottom up" with a brief
|
||||
description and status.
|
||||
|
||||
* ProposedUtils - a collection of utilities proposed to be moved to other respective projects
|
||||
* AnnotationValidator - an experimental annotation processor for database access objects
|
||||
* ProposedUtils - a collection of utilities proposed to be moved to other respective projects.
|
||||
* AnnotationValidator - an experimental annotation processor for database access objects.
|
||||
* Framework-TraceModeling - a database schema and set of interfaces for storing machine state over
|
||||
time
|
||||
time.
|
||||
* Framework-AsyncComm - a collection of utilities for asynchronous communication (packet formats
|
||||
and completable-future conveniences).
|
||||
* Framework-Debugging - specifies interfaces for debugger models and provides implementation
|
||||
conveniences.
|
||||
conveniences. This is mostly deprecated.
|
||||
* Debugger - the collection of Ghidra plugins and services comprising the Debugger UI.
|
||||
* Debugger-rmi-trace - the wire protocol, client, services, and UI components for Trace RMI, the new back-end architecture.
|
||||
* Debugger-agent-dbgeng - the connector for WinDbg (via dbgeng.dll) on Windows x64.
|
||||
* Debugger-agent-dbgmodel - an experimental connector for WinDbg Preview (with TTD, via
|
||||
dbgmodel.dll) on Windows x64.
|
||||
* Debugger-agent-dbgmodel-traceloader - an experimental "importer" for WinDbg trace files.
|
||||
* Debugger-agent-gdb - the connector for GDB (8.0 or later recommended) on UNIX.
|
||||
* Debugger-swig-lldb - the Java language bindings for LLDB's SBDebugger, also proposed upstream.
|
||||
* Debugger-agent-lldb - the connector for LLDB (13.0 required) on macOS, UNIX, and Windows.
|
||||
dbgmodel.dll) on Windows x64. This is deprecated, as most of these features are implemented in Debugger-agent-dbgeng for the new architecture.
|
||||
* Debugger-agent-dbgmodel-traceloader - an experimental "importer" for WinDbg trace files. This is deprecated.
|
||||
* Debugger-agent-gdb - the connector for GDB (13 or later recommended) on UNIX.
|
||||
* Debugger-swig-lldb - the Java language bindings for LLDB's SBDebugger, also proposed upstream. This is deprecated. We now use the Python3 language bindings for LLDB.
|
||||
* Debugger-agent-lldb - the connector for LLDB (10 or later recommended) on macOS, UNIX, and Windows.
|
||||
* Debugger-gadp - the connector for our custom wire protocol, the Ghidra Asynchronous Debugging
|
||||
Protocol.
|
||||
* Debugger-jpda - an in-development connector for Java and Dalvik debugging via JDI (i.e., JDWP).
|
||||
Protocol. This is deprecated. It's replaced by Debugger-rmi-trace.
|
||||
* Debugger-jpda - an in-development connector for Java and Dalvik debugging via JDI (i.e., JDWP). This is deprecated and not yet replaced.
|
||||
|
||||
The Trace Modeling schema records machine state and markup over time.
|
||||
It rests on the same database framework as Programs, allowing trace recordings to be stored in a
|
||||
Ghidra project and shared via a server, if desired. Trace "recording" is a de facto requirement for
|
||||
displaying information in Ghidra's UI. However, only the machine state actually observed by the user
|
||||
(or perhaps a script) is recorded. For most use cases, the Trace is small and ephemeral, serving
|
||||
only to mediate between the UI components and the target's model. It supports many of the same
|
||||
markup (e.g., disassembly, data types) as Programs, in addition to tracking active threads, loaded
|
||||
modules, breakpoints, etc.
|
||||
It rests on the same database framework as Programs, allowing trace recordings to be stored in a Ghidra project and shared via a server, if desired.
|
||||
Trace "recording" is a de facto requirement for displaying information in Ghidra's UI.
|
||||
The back-end connector has full discretion over what is recorded by using Trace RMI.
|
||||
Typically, only the machine state actually observed by the user (or perhaps a script) is recorded.
|
||||
For most use cases, the Trace is small and ephemeral, serving only to mediate between the UI components and the target's model.
|
||||
It supports much of the same markup (e.g., disassembly, data types) as Programs, in addition to tracking active threads, loaded modules, breakpoints, etc.
|
||||
|
||||
Every model (or "adapter" or "connector" or "agent") implements the API specified in
|
||||
Framework-Debugging. As a general rule in Ghidra, no component is allowed to access a native API and
|
||||
reside in the same JVM as the Ghidra UI. This allows us to contain crashes, preventing data loss. To
|
||||
accommodate this requirement -- given that debugging native applications is almost certainly going
|
||||
to require access to native APIs -- we've developed the Ghidra Asynchronous Debugging Protocol. This
|
||||
protocol is tightly coupled to Framework-Debugging, essentially exposing its methods via RMI. The
|
||||
protocol is built using Google's Protobuf library, providing a potential path for agent
|
||||
implementations in alternative languages. GADP provides both a server and a client implementation.
|
||||
The server can accept any model which adheres to the specification and expose it via TCP; the client
|
||||
does the converse. When a model is instantiated in this way, it is called an "agent," because it is
|
||||
executing in its own JVM. The other connectors, which do not use native APIs, may reside in Ghidra's
|
||||
JVM and typically implement alternative wire protocols, e.g., JDWP. In both cases, the
|
||||
implementations inherit from the same interfaces.
|
||||
Every back end (or "adapter" or "connector" or "agent") employs the Trace RMI client to populate a trace database.
|
||||
As a general rule in Ghidra, no component is allowed to access a native API and reside in the same JVM as the Ghidra UI.
|
||||
This allows us to contain crashes, preventing data loss.
|
||||
To accommodate this requirement — given that debugging native applications is almost certainly going to require access to native APIs — we've developed the Trace RMI protocol.
|
||||
This also allows us to better bridge the language gap between Java and Python, which is supported by most native debuggers.
|
||||
This protocol is loosely coupled to Framework-TraceModeling, essentially exposing its methods via RMI, as well as some methods for controlling the UI.
|
||||
The protocol is built using Google's Protobuf library, providing a potential path for back-end implementations in alternative languages.
|
||||
We provide the Trace RMI server as a Ghidra component implemented in Java and the Trace RMI client as a Python3 package.
|
||||
A back-end implementation may be a stand-alone executable or script that accesses the native debugger's API, or a script or plugin for the native debugger.
|
||||
It then connects to Ghidra via Trace RMI to populate the trace database with information gleaned from that API.
|
||||
It should provide a set of diagnostic commands to control and monitor that connection.
|
||||
It should also use the native API to detect session and target changes so that Ghidra's UI consistently reflects the debugging session.
|
||||
|
||||
The Debugger services maintain a collection of active connections and inspect each model for
|
||||
potential targets. When a target is found, the service inspects the target environment and attempts
|
||||
to find a suitable opinion. Such an opinion, if found, instructs Ghidra how to map the objects,
|
||||
addresses, registers, etc. from the target namespace into Ghidra's. The target is then handed to a
|
||||
Trace Recorder which begins collecting information needed to populate the UI, e.g., the program
|
||||
counter, stack pointer, and the bytes of memory they refer to.
|
||||
The old system relied on a "recorder" to discover targets and map them to traces in the proper Ghidra language.
|
||||
That responsibility is now delegated to the back end.
|
||||
Typically, it examines the target's architecture and immediately creates a trace upon connection.
|
||||
|
||||
### Developing a new connector
|
||||
|
||||
So Ghidra does not yet support your favorite debugger?
|
||||
It is tempting, exciting, but also daunting to develop your own connector.
|
||||
Please finish reading this guide, and look carefully at the ones we have so far, and perhaps ask to
|
||||
see if we are already developing one. Of course, in time you might also search the internet to see
|
||||
if others are developing one. There are quite a few caveats and gotchas, the most notable being that
|
||||
this interface is still in quite a bit of flux. When things go wrong, it could be because of,
|
||||
without limitation: 1) a bug on your part, 2) a bug on our part, 3) a design flaw in the interfaces,
|
||||
or 4) a bug in the debugger/API you're adapting. We are still in the process of writing up this
|
||||
documentation. In the meantime, we recommend using the GDB and dbgeng.dll agents as examples.
|
||||
We believe the new system is much less daunting than the previous one.
|
||||
Still, please finish reading this guide, and look carefully at the ones we have so far, and perhaps ask to see if we are already developing one.
|
||||
Of course, in time you might also search the internet to see if others are developing one.
|
||||
There are quite a few caveats and gotchas, the most notable being that this interface is still in some flux.
|
||||
When things go wrong, it could be because of, without limitation:
|
||||
|
||||
You'll also need to provide launcher(s) so that Ghidra knows how to configure and start your
|
||||
connector. Please provide launchers for your model in both configurations: as a connector in
|
||||
Ghidra's JVM, and as a GADP agent. If your model requires native API access, you should only permit
|
||||
launching it as a GADP agent, unless you give ample warning in the launcher's description. Look at
|
||||
the existing launchers for examples. There are many model implementation requirements that cannot be
|
||||
expressed in Java interfaces. Failing to adhere to those requirements may cause different behaviors
|
||||
with and without GADP. Testing with GADP tends to reveal those implementation errors, but also
|
||||
obscures the source of client method calls behind network messages. We've also codified (or
|
||||
attempted to codify) these requirements in a suite of abstract test cases. See the `ghidra.dbg.test`
|
||||
package of Framework-Debugging, and again, look at existing implementations.
|
||||
1. A bug on your part
|
||||
2. A bug on our part
|
||||
3. A design flaw in the interfaces
|
||||
4. A bug in the debugger/API you're adapting
|
||||
|
||||
We are still (yes, still) in the process of writing up this documentation.
|
||||
In the meantime, we recommend using the GDB and dbgeng agents as examples.
|
||||
Be sure to look at the Python code in `src/main/py`!
|
||||
The deprecated Java code in `src/main/java` is still included as we transition.
|
||||
|
||||
You'll also need to provide launcher(s) so that Ghidra knows how to configure and start your connector.
|
||||
These are just shell scripts.
|
||||
We use bash scripts on Linux and macOS, and we use batch files on Windows.
|
||||
Try to include as many common use cases as makes sense for the debugger.
|
||||
This provides the most flexibility to users and examples to power users who might create derivative launchers.
|
||||
Look at the existing launchers for examples.
|
||||
|
||||
For testing, please follow the examples for GDB.
|
||||
We no longer provide abstract classes that prescribe requirements.
|
||||
Instead, we just provide GDB as an example.
|
||||
Usually, we split our tests into three categories:
|
||||
|
||||
* Commands
|
||||
* Methods
|
||||
* Hooks
|
||||
|
||||
The Commands tests check that the user CLI commands, conventionally implemented in `commands.py`, work correctly.
|
||||
In general, do the minimum connection setup, execute the command, and check that it produces the expected output and causes the expected effects.
|
||||
|
||||
The Methods tests check that the remote methods, conventionally implemented in `methods.py`, work correctly.
|
||||
Many methods are just wrappers around CLI commands, some provided by the native debugger and some provided by `commands.py`.
|
||||
These work similarly to the Commands tests, except that they invoke methods instead of executing commands.
|
||||
Again, check the return value (rarely applicable) and that it causes the expected effects.
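As a rough illustration of the Methods pattern described above, here is a minimal sketch. `FakeDebugger`, its toy `break <address>` syntax, and `break_address` are hypothetical stand-ins invented for this example; they are not part of Trace RMI or any connector's `methods.py`.

```python
class FakeDebugger:
    """Hypothetical stand-in for a native debugger's CLI."""

    def __init__(self):
        self.breakpoints = []

    def execute(self, command):
        # Toy syntax: "break <hex address>"; real debuggers differ.
        verb, _, arg = command.partition(" ")
        if verb != "break":
            raise ValueError(f"unknown command: {command}")
        self.breakpoints.append(int(arg, 16))


def break_address(debugger, address):
    """Hypothetical remote method: a thin wrapper around the CLI command."""
    debugger.execute(f"break {address:#x}")


def test_break_address():
    dbg = FakeDebugger()
    result = break_address(dbg, 0x401000)
    assert result is None                 # the return value rarely matters
    assert dbg.breakpoints == [0x401000]  # the effect is what we check


test_break_address()
```

A real test would invoke the method through the connection rather than calling it directly, but the shape is the same: minimal setup, one invocation, then assertions on the effects.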
|
||||
|
||||
The Hooks tests check that the back end is able to listen for session and target changes, e.g., knowing when the target stops.
|
||||
*The test should not "cheat" by executing commands or invoking methods that should instead be triggered by the listener.*
|
||||
It should execute the minimal commands to set up the test, then trigger an event.
|
||||
It should then check that the event in turn triggered the expected effects, e.g., updating PC upon the target stopping.
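The rule against "cheating" may be easier to see in a toy example. In the sketch below, `FakeDebugger`, `FakeTrace`, and `install_hooks` are hypothetical names made up for illustration and do not correspond to actual Trace RMI or GDB classes. The test triggers a stop event and then checks only what the listener did in response.

```python
class FakeDebugger:
    """Hypothetical debugger that notifies listeners when the target stops."""

    def __init__(self):
        self.pc = 0x1000
        self._stop_listeners = []

    def on_stop(self, callback):
        self._stop_listeners.append(callback)

    def step(self):
        # Advancing the target fires the "stopped" event.
        self.pc += 4
        for callback in self._stop_listeners:
            callback(self)


class FakeTrace:
    """Hypothetical stand-in for the trace database."""

    def __init__(self):
        self.registers = {}


def install_hooks(debugger, trace):
    """Register a listener that records the PC whenever the target stops."""
    def stopped(dbg):
        trace.registers["pc"] = dbg.pc
    debugger.on_stop(stopped)


def test_stop_hook_updates_pc():
    dbg, trace = FakeDebugger(), FakeTrace()
    install_hooks(dbg, trace)            # minimal setup
    dbg.step()                           # trigger the event only
    assert trace.registers["pc"] == 0x1004  # effect came from the hook


test_stop_hook_updates_pc()
```

Note that the test never writes to `trace.registers` itself; if the hook were not installed, the assertion would fail with a `KeyError`.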
|
||||
|
||||
Whenever you make a change to the Python code, you'll need to re-assemble the package's source.
|
||||
|
||||
```
|
||||
gradle assemblePyPackage
|
||||
```
|
||||
|
||||
This is required in case your package includes generated source, as is the case for Debugger-rmi-trace.
|
||||
If you want to create a new Ghidra module for your connector (recommended), use an existing one's `build.gradle` as a template.
|
||||
A key part is applying the `hasPythonPackage.gradle` script.
|
||||
|
||||
### Adding a new platform
|
||||
|
||||
If an existing connector exists for a suitable debugger on the desired platform, then adding it may
|
||||
be very simple. For example, both the x86 and ARM platforms are supported by GDB, so even though
|
||||
we're currently focused on x86 support, we've provided the opinions needed for Ghidra to debug ARM
|
||||
platforms (and several others) via GDB. These opinions are kept in the "Debugger" project, not their
|
||||
respective "agent" projects. We imagine there are a number of platforms that could be supported
|
||||
almost out of the box, except that we haven't written the necessary opinions, yet. Take a look at
|
||||
the existing ones for examples.
|
||||
If a connector already exists for a suitable debugger on the desired platform, then adding it may be very simple.
|
||||
For example, many platforms are supported by GDB, so even though we're currently focused on x86-64 (and to some extent arm64) support, we've provided the mappings for many.
|
||||
These mappings are conventionally kept in each connector's `arch.py` file.
|
||||
|
||||
In general, to write a new opinion, you need to know: 1) What the platform is called (including
|
||||
variant names) by the debugger, 2) What the processor language is called by Ghidra, 3) If
|
||||
applicable, the mapping of target address spaces into Ghidra's address spaces, 4) If applicable, the
|
||||
mapping of target register names to those in Ghidra's processor language. In most cases (3) and (4)
|
||||
are already implemented by default mappers, so you can use those same mappers in your opinion. Once
|
||||
you have the opinion written, you can try debugging and recording a target. If Ghidra finds your
|
||||
opinion applicable to that target, it will attempt to record, and then you can work out the kinks
|
||||
from there. Again, we have a bit of documentation to do regarding common pitfalls.
|
||||
In general, to update `arch.py`, you need to know:
|
||||
|
||||
1. What the platform is called (including variant names) by the debugger
|
||||
2. What the processor language is called by Ghidra
|
||||
3. If applicable, the mapping of target address spaces into Ghidra's address spaces
|
||||
4. If applicable, the mapping of target register names to those in Ghidra's processor language
|
||||
|
||||
In most cases (3) and (4) are already implemented by the included mappers.
|
||||
Naturally, you'll want to test the special cases, preferably in automated tests.
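To make the four items above concrete, here is a minimal sketch of the kind of lookup such a mapping might perform. The table contents, the `eflags` rename, and the function names are assumptions chosen for illustration; they are not copied from any shipped `arch.py`, so check the existing connectors for the real structure.

```python
from typing import Optional

# (1) The debugger's name for the platform -> (2) Ghidra's language ID.
# Entries are illustrative assumptions, not an authoritative table.
LANGUAGE_BY_ARCH = {
    "i386:x86-64": "x86:LE:64:default",
    "i386": "x86:LE:32:default",
    "aarch64": "AARCH64:LE:64:v8A",
}

# (4) Special-case register renames, keyed by Ghidra language ID.
# The single rename shown here is assumed for the sake of the example.
REGISTER_RENAMES = {
    "x86:LE:64:default": {"eflags": "rflags"},
}


def language_for(arch: str) -> Optional[str]:
    """Return the Ghidra language ID for a debugger arch name, if known."""
    return LANGUAGE_BY_ARCH.get(arch.strip().lower())


def map_register(language: str, name: str) -> str:
    """Apply a special-case rename, falling back to the name unchanged."""
    return REGISTER_RENAMES.get(language, {}).get(name, name)
```

For example, `language_for("i386:x86-64")` yields `"x86:LE:64:default"`, while an unrecognized name yields `None`, which a connector could treat as a cue to fall back to a default or report the unsupported platform.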
|
||||
|
||||
### Emulation
|
||||
|
||||
The most obvious integration path for 3rd-party emulators is to write a "connector." However, p-code
|
||||
emulation is now an integral feature of the Ghidra UI, and it has a fairly accessible API. Namely,
|
||||
for interpolation between machine states recorded in a trace, and extrapolation into future machine
|
||||
states. Integration of such emulators may still be useful to you, but we recommend trying the p-code
|
||||
emulator to see if it suits your needs for emulation in Ghidra before pursuing integration of
|
||||
another emulator.
|
||||
The most obvious integration path for 3rd-party emulators is to write a "connector."
|
||||
However, p-code emulation is an integral feature of the Ghidra UI, and it has a fairly accessible API.
|
||||
Namely, the UI uses it for interpolation between machine states recorded in a trace, and extrapolation into future machine states.
|
||||
Integration of such emulators may still be useful to you, but we recommend trying the p-code emulator to see if it suits your needs for emulation in Ghidra before pursuing integration of another emulator.
|
||||
We also provide out-of-the-box QEMU integration via GDB.
|
||||
|
||||
### Contributing
|
||||
|
||||
Whether submitting help tickets and pull requests, please tag those related to the debugger with
|
||||
"Debugger" so that we can triage them more quickly.
|
||||
|
||||
To set up your environment, in addition to the usual Gradle tasks, process the Protobuf
|
||||
specification for GADP:
|
||||
|
||||
```bash
|
||||
gradle generateProto
|
||||
```
|
||||
|
||||
If you already have an environment set up in Eclipse, please re-run `gradle prepDev eclipse` and
|
||||
import the new projects.
|
||||
When submitting help tickets and pull requests, please tag those related to the debugger with "Debugger" so that we can triage them more quickly.
|
||||
|
||||
|
||||
[java]: https://dev.java
|
||||
|
||||