The skeleton of the operation superclass can be found at `include/compute/operation.h`. There are several things that must be implemented in order to have a working operation. Below, the process is laid out for an example operation named `sub`:

1. Create folders named `sub` in `include/compute` and `src/compute`.
2. In `include/compute/sub` create `subop.h`, and in `src/compute/sub` create `subop.cpp`.
3. Define the `SubOp` class, which extends `Operation`, in the header file. Note that the class must be in the namespace `magmadnn::op`. The function `sub`, which returns a new `SubOp` operation, should also be declared here.
4. Implement `SubOp` in `subop.cpp`. The new operation must implement the constructor, `_eval`, `_grad`, and `to_string` methods. `_eval` should return the evaluated tensor up to this point in the tree. (Note: you are allowed to create helper files within the `sub/` folder, but their methods must live within the namespace `magmadnn::internal`.) `_grad` is given the consumer operation, the variable that the gradient is taken with respect to, and the gradient with respect to the output. It should return a tensor pointer, where the tensor contains the gradient computation.
5. Implement the function `sub` in `subop.cpp`. `sub` must work with and be compiled for `int`, `float`, and `double`, and it is also expected to work for all memory types. See the constructor, eval, grad, to_string, and func sections below for more information on how to implement these.
6. Add `#include "sub/subop.h"` to `include/compute/tensor_operations.h`. This allows the rest of the library to see the new operation.
7. Write a tester for the new operation in the `testing/` folder.

See `include/compute/add/` and `src/compute/add/` for an actual example.
Operators can implement copy and no-copy options, which determine whether the operation returns a newly allocated tensor or writes over one of its parameters. However, this is not required.
The constructor must call the parent constructor of `Operation<T>` that takes a vector of operations. This sets the children of the operation and is used for memory releasing. `output_shape` and `mem_type` should also be set within the constructor; this allows shape checking and pre-allocation of tensors when the tree is created. The constructor should also do any preprocessing that is possible, to remove computational burden from the performance-critical `eval` function. An example of a constructor might look like:
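A minimal sketch of such a constructor is below. The `Tensor`, `Operation`, and getter names here are simplified stand-ins so the snippet is self-contained; MagmaDNN's real signatures live in its headers and may differ.

```cpp
#include <vector>

// --- minimal stand-ins (assumed names), not MagmaDNN's exact API ---
typedef int memory_t;
const memory_t HOST = 0;

template <typename T>
struct Tensor {
    Tensor(std::vector<unsigned> shape, memory_t mem) : shape(shape), mem(mem) {}
    std::vector<unsigned> shape;
    memory_t mem;
};

template <typename T>
struct Operation {
    // parent constructor taking the child operations; used for memory release
    Operation(std::vector<Operation<T>*> children) : children(children) {}
    virtual ~Operation() {}
    std::vector<unsigned> get_output_shape() const { return output_shape; }
    memory_t get_memory_type() const { return mem_type; }

    std::vector<Operation<T>*> children;
    std::vector<unsigned> output_shape;
    memory_t mem_type = HOST;
    Tensor<T>* ret = nullptr;
};

// A SubOp constructor: register the children, set shape and memory type for
// shape checking, and do one-time work outside the hot eval path.
template <typename T>
class SubOp : public Operation<T> {
public:
    SubOp(Operation<T>* a, Operation<T>* b, bool copy = true)
        : Operation<T>({a, b}), a(a), b(b), copy(copy) {
        this->output_shape = a->get_output_shape();
        this->mem_type = a->get_memory_type();
        if (copy) {
            // pre-allocate the output tensor at graph-construction time
            this->ret = new Tensor<T>(this->output_shape, this->mem_type);
        }
    }
    ~SubOp() { if (copy) delete this->ret; }

private:
    Operation<T> *a, *b;
    bool copy;
};
```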
The eval method is simply responsible for the evaluation of the operation. It should return a tensor pointer with the same shape and memory type as defined by the operation. An example `eval` function might look like:
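The sketch below illustrates the pattern with simplified stand-in `Tensor` and `Operation` types (the names and signatures are assumptions, not MagmaDNN's exact API): evaluate the children, then write the element-wise result into the operation's output tensor, overwriting an input when the no-copy option is used.

```cpp
#include <vector>
#include <cstddef>

// --- minimal stand-ins (assumed names), not MagmaDNN's exact API ---
template <typename T>
struct Tensor {
    explicit Tensor(std::size_t n) : data(n) {}
    std::vector<T> data;
};

template <typename T>
struct Operation {
    virtual ~Operation() {}
    virtual Tensor<T>* eval() = 0;
    Tensor<T>* ret = nullptr;
};

// A leaf node that hands back a stored tensor, for demonstration only.
template <typename T>
struct Constant : Operation<T> {
    explicit Constant(Tensor<T>* t) : t(t) {}
    Tensor<T>* eval() override { return t; }
    Tensor<T>* t;
};

// eval: evaluate the subtrees first, then write (a - b) into the output.
template <typename T>
struct SubOp : Operation<T> {
    SubOp(Operation<T>* a, Operation<T>* b, bool copy = true)
        : a(a), b(b), copy(copy) {}

    Tensor<T>* eval() override {
        Tensor<T>* a_tensor = a->eval();
        Tensor<T>* b_tensor = b->eval();

        if (!copy) {
            this->ret = b_tensor;  // no-copy: write over an input tensor
        } else if (this->ret == nullptr) {
            this->ret = new Tensor<T>(a_tensor->data.size());
        }
        for (std::size_t i = 0; i < a_tensor->data.size(); ++i)
            this->ret->data[i] = a_tensor->data[i] - b_tensor->data[i];
        return this->ret;
    }
    Operation<T> *a, *b;
    bool copy;
};
```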
Grad functions should write into the gradient tensor and return it. Every operation has a `std::map<Operation<T>*, Tensor<T>*> _grad_cache`, which can help store these tensors keyed on input variables. For instance, the `_grad` implementation of a log operation:
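A sketch of that pattern, using simplified stand-in types (the class layout and the `x_val` member are assumptions for self-containment): since d/dx log(x) = 1/x, the incoming gradient is divided elementwise by the input's value, the result is written into a tensor cached in `_grad_cache` keyed on the variable, and that tensor is returned.

```cpp
#include <map>
#include <vector>
#include <cstddef>

// --- minimal stand-ins (assumed names), not MagmaDNN's exact API ---
template <typename T>
struct Tensor {
    explicit Tensor(std::size_t n) : data(n) {}
    std::vector<T> data;
};

template <typename T>
struct Operation {
    virtual ~Operation() {}
    // every operation carries this cache, keyed on the input variable
    std::map<Operation<T>*, Tensor<T>*> _grad_cache;
};

// For log(x), d/dx log(x) = 1/x; x_val stands in for the evaluated input.
template <typename T>
struct LogOp : Operation<T> {
    LogOp(Operation<T>* x, Tensor<T>* x_val) : x(x), x_val(x_val) {}

    Tensor<T>* _grad(Operation<T>* consumer, Operation<T>* var, Tensor<T>* grad) {
        Tensor<T>* out = this->_grad_cache[var];
        if (out == nullptr) {
            out = new Tensor<T>(grad->data.size());  // allocate once, reuse later
            this->_grad_cache[var] = out;
        }
        // write into the gradient tensor and return it
        for (std::size_t i = 0; i < grad->data.size(); ++i)
            out->data[i] = grad->data[i] / x_val->data[i];
        return out;
    }
    Operation<T>* x;
    Tensor<T>* x_val;
};
```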
The `to_string` method is fairly simple to implement. It defines how the operation is printed, composing the `to_string` return values of the child operations. For instance, the operation `add(a,b)`'s implementation might look like:
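A sketch with stand-in types (the `Variable` leaf and the exact `"ADD( ... )"` format are illustrative assumptions, not MagmaDNN's actual output):

```cpp
#include <string>

// --- minimal stand-ins (assumed names), not MagmaDNN's exact API ---
template <typename T>
struct Operation {
    virtual ~Operation() {}
    virtual std::string to_string() = 0;
};

// A named variable, so the leaves have something to print.
template <typename T>
struct Variable : Operation<T> {
    explicit Variable(const std::string& name) : name(name) {}
    std::string to_string() override { return name; }
    std::string name;
};

// to_string composes the children's to_string results into a readable form.
template <typename T>
struct AddOp : Operation<T> {
    AddOp(Operation<T>* a, Operation<T>* b) : a(a), b(b) {}
    std::string to_string() override {
        return "ADD( " + a->to_string() + ", " + b->to_string() + " )";
    }
    Operation<T> *a, *b;
};
```

Because each node only formats its own operator and delegates to its children, nested expressions print correctly for free.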
Every operation should be paired with a function that returns a new pointer to that operation. This allows for cleaner math expressions for the library user (e.g. `auto x = op::add(op::matmul(A, x), c)`). An example of this function might look like:
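A sketch of the paired function, again with stand-in types (the stub `SubOp` here is an assumption; the real one is the class implemented above): it simply heap-allocates the operation and returns the base-class pointer, which is what lets users nest calls.

```cpp
// --- minimal stand-ins (assumed names), not MagmaDNN's exact API ---
template <typename T>
struct Operation {
    virtual ~Operation() {}
};

template <typename T>
struct SubOp : Operation<T> {
    SubOp(Operation<T>* a, Operation<T>* b, bool copy) : a(a), b(b), copy(copy) {}
    Operation<T> *a, *b;
    bool copy;
};

namespace op {
// The paired function heap-allocates the operation and returns it, so users
// can nest calls: auto x = op::sub(op::sub(a, b), c);
template <typename T>
Operation<T>* sub(Operation<T>* a, Operation<T>* b, bool copy = true) {
    return new SubOp<T>(a, b, copy);
}
}  // namespace op
```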