Jacobian-Vector Product (GVP) Module#
The GVP module provides implementations of Jacobian-vector products (JVP), vector-Jacobian products (VJP), and related utilities.
JVP (Jacobian-Vector Product)#
- torch_secorder.core.gvp.jvp(func: Callable[[], Tensor], params: List[Tensor], v: Tensor | List[Tensor], create_graph: bool = False) → Tensor | List[Tensor]
Compute the Jacobian-vector product (JVP): J v.
- Parameters:
func – A callable that returns a tensor output (can be vector-valued).
params – List of parameters with respect to which to compute the Jacobian.
v – Vector to multiply with the Jacobian. Can be a single tensor or a list of tensors matching the structure of params.
create_graph – If True, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed.
- Returns:
The JVP (same shape as the output of func).
Computes the Jacobian-vector product for a given function and parameters.
Example:#
import torch
from torch_secorder.core.gvp import jvp
def func():
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])
x = torch.tensor([1.0, 2.0], requires_grad=True)
v = torch.tensor([0.5, -1.0])
jvp_result = jvp(func, [x], v)
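For intuition about what a JVP is, the sketch below computes J v for the same function using only torch.autograd.grad and the so-called double-backward trick. This is an illustration in plain PyTorch, not necessarily how torch_secorder implements jvp; the names u, g and jv are hypothetical.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
v = torch.tensor([0.5, -1.0])

y = torch.stack([x[0] ** 2, 3 * x[1] ** 2])

# Dummy cotangent u. Its value does not matter because J^T u is linear in u.
u = torch.zeros_like(y, requires_grad=True)

# First backward pass: g = J^T u, kept differentiable with respect to u.
(g,) = torch.autograd.grad(y, x, grad_outputs=u, create_graph=True)

# Second backward pass: differentiating g with respect to u, weighted by v, gives J v.
(jv,) = torch.autograd.grad(g, u, grad_outputs=v)

# Here J = diag(2 * x[0], 6 * x[1]) = diag(2, 12), so J v has values [1.0, -12.0].
```

The result has the shape of the function output, which matches the return value documented for jvp above.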
VJP (Vector-Jacobian Product)#
- torch_secorder.core.gvp.vjp(func: Callable[[], Tensor], params: List[Tensor], v: Tensor, create_graph: bool = False) → Tensor | List[Tensor]
Compute the vector-Jacobian product (VJP): v^T J.
- Parameters:
func – A callable that returns a tensor output (can be vector-valued).
params – List of parameters with respect to which to compute the Jacobian.
v – Vector to multiply with the Jacobian (should match the output shape of func).
create_graph – If True, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed.
- Returns:
The VJP (list of tensors matching the structure of params).
Computes the vector-Jacobian product for a given function and parameters.
Example:#
import torch
from torch_secorder.core.gvp import vjp
def func():
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])
x = torch.tensor([1.0, 2.0], requires_grad=True)
v = torch.tensor([0.5, -1.0])
vjp_result = vjp(func, [x], v)
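A VJP corresponds to a single backward pass in which v is supplied as grad_outputs. The sketch below shows this with plain torch.autograd.grad for the same function; it is meant as a reference point for the expected values and is not taken from the torch_secorder source. The name vj is hypothetical.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
v = torch.tensor([0.5, -1.0])

y = torch.stack([x[0] ** 2, 3 * x[1] ** 2])

# Passing v as grad_outputs makes autograd return v^T J in one backward pass.
(vj,) = torch.autograd.grad(y, x, grad_outputs=v)

# J = diag(2, 12) at x = [1, 2], so v^T J has values [1.0, -12.0].
```

The result has the shape of x, in line with the documented return value (one tensor per entry of params).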
Model JVP#
- torch_secorder.core.gvp.model_jvp(model: Module, x: Tensor, v: Tensor | List[Tensor], create_graph: bool = False) → Tensor
Compute the JVP for a model’s output with respect to its parameters.
- Parameters:
model – The PyTorch model.
x – Input tensor.
v – Vector to multiply with the Jacobian (should match the structure of model.parameters()).
create_graph – If True, graph of the derivative will be constructed.
- Returns:
The JVP (same shape as the model output).
A convenience wrapper for computing JVP with respect to a model’s parameters.
Example:#
import torch
import torch.nn as nn
from torch_secorder.core.gvp import model_jvp
model = nn.Linear(10, 1)
x = torch.randn(1, 10)
v = [torch.randn_like(p) for p in model.parameters()]
jvp_result = model_jvp(model, x, v)
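For a linear layer the parameter-space JVP has a closed form, which makes it a convenient sanity check. The sketch below computes the JVP with respect to the parameters using plain torch.autograd.grad (double-backward trick) and compares it with that closed form. It is an assumption-laden illustration, not the implementation of model_jvp; u, inner and expected are hypothetical names.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
x = torch.randn(1, 10)
params = list(model.parameters())
v = [torch.randn_like(p) for p in params]  # one direction per parameter

y = model(x)

# Dummy cotangent with the shape of the model output.
u = torch.zeros_like(y, requires_grad=True)

# grads[i] = (J_i)^T u for each parameter, kept differentiable in u.
grads = torch.autograd.grad(y, params, grad_outputs=u, create_graph=True)

# <J^T u, v> summed over all parameters; its gradient with respect to u is J v.
inner = sum((g * vi).sum() for g, vi in zip(grads, v))
(jv,) = torch.autograd.grad(inner, u)

# Closed form for y = x W^T + b perturbed along (v_W, v_b).
expected = x @ v[0].T + v[1]
```

The result jv has the same shape as model(x), as stated in the Returns entry above.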
Model VJP#
- torch_secorder.core.gvp.model_vjp(model: Module, x: Tensor, v: Tensor, create_graph: bool = False) → Tensor | List[Tensor]
Compute the VJP for a model’s output with respect to its parameters.
- Parameters:
model – The PyTorch model.
x – Input tensor.
v – Vector to multiply with the Jacobian (should match the output shape of model(x)).
create_graph – If True, graph of the derivative will be constructed.
- Returns:
The VJP (list of tensors matching the structure of model.parameters()).
A convenience wrapper for computing VJP with respect to a model’s parameters.
Example:#
import torch
import torch.nn as nn
from torch_secorder.core.gvp import model_vjp
model = nn.Linear(10, 1)
x = torch.randn(1, 10)
v = torch.randn(1, 1)
vjp_result = model_vjp(model, x, v)
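As a reference for what the returned list should contain, the following sketch computes the parameter VJP of a linear layer with plain torch.autograd.grad and checks it against the closed form. It is not the torch_secorder implementation; expected_weight and expected_bias are hypothetical names.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
x = torch.randn(1, 10)
v = torch.randn(1, 1)  # same shape as model(x)

params = list(model.parameters())

# One backward pass with v as grad_outputs gives one tensor per parameter.
grads = torch.autograd.grad(model(x), params, grad_outputs=v)

# Closed form for y = x W^T + b.
expected_weight = v.T @ x      # shape (1, 10), matches model.weight
expected_bias = v.sum(dim=0)   # shape (1,), matches model.bias
```

Each entry of grads has the shape of the corresponding parameter, which is the structure the Returns entry above describes.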
Batch JVP#
- torch_secorder.core.gvp.batch_jvp(func: Callable[[], Tensor], params: List[Tensor], vs: Tensor | List[Tensor], create_graph: bool = False) → Tensor
Compute a batch of Jacobian-vector products (JVPs).
- Parameters:
func – A callable that returns a tensor output (can be vector-valued).
params – List of parameters with respect to which to compute the Jacobian.
vs – Batch of vectors to multiply with the Jacobian. Should be a tensor of shape (batch, …) or a list of such tensors.
create_graph – If True, graph of the derivative will be constructed.
- Returns:
Tensor of shape (batch, …) with the JVPs for each vector in the batch.
Computes JVPs for a batch of vectors efficiently.
Example:#
import torch
from torch_secorder.core.gvp import batch_jvp
def func():
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])
x = torch.tensor([1.0, 2.0], requires_grad=True)
vs = torch.stack([
    torch.tensor([1.0, 0.0]),
    torch.tensor([0.0, 1.0]),
])
batch_result = batch_jvp(func, [x], vs)
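Conceptually, row b of a batched JVP is J applied to the b-th vector. For a small problem this can be checked by materializing J and contracting it with the batch, as in the sketch below. It uses torch.autograd.functional.jacobian and a batch of three vectors to keep the batch dimension distinct from the output dimension; it is a reference computation under these assumptions, not the batch_jvp implementation.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])

x = torch.tensor([1.0, 2.0])
vs = torch.tensor([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])  # batch of 3 input-space vectors

J = jacobian(f, x)       # shape (2, 2), values [[2, 0], [0, 12]]
batch_jv = vs @ J.T      # row b equals J @ vs[b]

# batch_jv has shape (3, 2) with values [[2, 0], [0, 12], [2, 12]].
```

Materializing J is only practical for small problems; the point of a JVP routine is to obtain the same rows without building the full matrix.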
Batch VJP#
- torch_secorder.core.gvp.batch_vjp(func: Callable[[], Tensor], params: List[Tensor], vs: Tensor, create_graph: bool = False) → List[Tensor]
Compute a batch of vector-Jacobian products (VJPs).
- Parameters:
func – A callable that returns a tensor output (can be vector-valued).
params – List of parameters with respect to which to compute the Jacobian.
vs – Batch of vectors to multiply with the Jacobian (should match the output shape of func, with batch dimension first).
create_graph – If True, graph of the derivative will be constructed.
- Returns:
List of tensors, each of shape (batch, …) matching the structure of params.
Computes VJPs for a batch of vectors efficiently.
Example:#
import torch
from torch_secorder.core.gvp import batch_vjp
def func():
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])
x = torch.tensor([1.0, 2.0], requires_grad=True)
vs = torch.stack([
    torch.tensor([1.0, 0.0]),
    torch.tensor([0.0, 1.0]),
])
batch_result = batch_vjp(func, [x], vs)
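The simplest baseline for a batched VJP is a loop of single VJPs over the batch dimension, reusing one forward pass. The sketch below shows that baseline in plain PyTorch so the expected shapes and values are clear; a batched routine would be expected to return the same numbers. The loop and the names rows and batch_vj are illustrative and not taken from torch_secorder.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
vs = torch.tensor([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])  # batch of 3 output-space vectors

y = torch.stack([x[0] ** 2, 3 * x[1] ** 2])

rows = []
for v in vs:
    # retain_graph=True keeps the forward graph alive for the next iteration.
    (vj,) = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True)
    rows.append(vj)

batch_vj = torch.stack(rows)  # shape (batch, *x.shape) = (3, 2)

# Values: [[2, 0], [0, 12], [2, 12]], since J = diag(2, 12) at x = [1, 2].
```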
Full Jacobian#
- torch_secorder.core.gvp.full_jacobian(func: Callable[[], Tensor], params: List[Tensor], create_graph: bool = False) → List[Tensor]
Compute the full Jacobian matrix of func with respect to params.
- Parameters:
func – A callable that returns a tensor output (can be vector-valued).
params – List of parameters with respect to which to compute the Jacobian.
create_graph – If True, graph of the derivative will be constructed.
- Returns:
List of Jacobian tensors, one for each parameter, each with shape (output_dim, *param_shape).
Computes the full Jacobian matrix for a given function and parameters.
Example:#
import torch
from torch_secorder.core.gvp import full_jacobian
def func():
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])
x = torch.tensor([1.0, 2.0], requires_grad=True)
jac = full_jacobian(func, [x])
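One common way to assemble a full Jacobian is row by row: row i is the VJP with the i-th standard basis vector of the output space. The sketch below does this with plain torch.autograd.grad and cross-checks the result against torch.autograd.functional.jacobian. Whether full_jacobian works this way internally is an assumption; the names e_i, rows, J and J_ref are hypothetical.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    return torch.stack([x[0] ** 2, 3 * x[1] ** 2])

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = f(x)

rows = []
for i in range(y.numel()):
    # One-hot vector selecting output component i.
    e_i = torch.zeros_like(y)
    e_i[i] = 1.0
    (row,) = torch.autograd.grad(y, x, grad_outputs=e_i, retain_graph=True)
    rows.append(row)

J = torch.stack(rows)             # shape (output_dim, *x.shape) = (2, 2)
J_ref = jacobian(f, x.detach())   # reference from PyTorch itself

# Both have values [[2, 0], [0, 12]].
```

This costs one backward pass per output component, which is why the product routines are preferable when only J v or v^T J is needed.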
Notes#
JVP (`jvp`, `model_jvp`, `batch_jvp`): These functions compute the product of the Jacobian matrix with a vector (or batch of vectors). This is generally more efficient than computing the full Jacobian when only the product is needed.
VJP (`vjp`, `model_vjp`, `batch_vjp`): These functions compute the product of a vector (or batch of vectors) with the transpose of the Jacobian matrix, J^T v (equivalently v^T J). This is the operation performed by reverse-mode differentiation and is the basis for backpropagation.
`create_graph` Parameter: When create_graph=True is set, a computational graph of the derivative itself is constructed. This allows for computing higher-order derivatives (e.g., Hessian-vector products from JVPs/VJPs).
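The effect of create_graph can be seen directly with torch.autograd.grad. In the sketch below, the first gradient is computed with create_graph=True so that it can be differentiated again, which gives a Hessian-vector product. The scalar function and the names g, hv and g_plain are chosen for illustration only.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
v = torch.tensor([0.5, -1.0])

loss = (x ** 3).sum()

# create_graph=True records the gradient computation itself in the graph.
(g,) = torch.autograd.grad(loss, x, create_graph=True)   # g = 3 * x**2

# Differentiating g (weighted by v) gives the Hessian-vector product H v.
(hv,) = torch.autograd.grad(g, x, grad_outputs=v)        # H = diag(6 * x)

# hv has values [3.0, -12.0].

# Without create_graph, the returned gradient is detached from the graph.
(g_plain,) = torch.autograd.grad((x ** 3).sum(), x)
# g_plain.requires_grad is False, so it cannot be differentiated further.
```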
`allow_unused=True`: This argument to torch.autograd.grad prevents an error when a parameter is not part of the computational graph for a given output. The gradient for such a parameter is returned as None, and the functions replace each None with a zero tensor.
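The None-to-zero replacement described here can be sketched in a few lines of plain PyTorch. The snippet assumes the replacement tensor has the shape of the corresponding parameter, which is the natural choice but is not confirmed by the text above; a, b and filled are hypothetical names.

```python
import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)
b = torch.tensor([3.0], requires_grad=True)  # never used in the output

y = (a ** 2).sum()

# Without allow_unused=True this call would raise, because b is not in the graph.
grads = torch.autograd.grad(y, [a, b], allow_unused=True)
# grads[1] is None here.

# Replace missing gradients with zeros shaped like the parameter (assumed convention).
filled = [
    torch.zeros_like(p) if g is None else g
    for g, p in zip(grads, [a, b])
]
```

After the replacement every entry is a tensor, so downstream code can flatten or combine the results without special-casing unused parameters.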
Batch Computations: The batch_jvp and batch_vjp functions compute JVPs and VJPs for multiple vectors in a single call, which can be faster than looping over the vectors one at a time.
Full Jacobian (`full_jacobian`): While JVP and VJP are efficient for products, full_jacobian computes the entire Jacobian matrix. The matrix has one entry per (output element, parameter element) pair, so it can be memory-intensive when the output dimension or the number of parameters is large, but it is useful when the entire matrix is required for analysis.