DeepSwift

DeepSwift is a Swift library for differentiable programming without compiler magic. The “low-level” API for writing new differentiable types looks as follows:
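The sketch below reconstructs the protocol from the method descriptions that follow; the names apply, inspectableApply, adjustment, move, and backprop come from the text, but the exact signatures are assumptions rather than DeepSwift’s verbatim API:

protocol Movable {
    associatedtype Adjustment
    // Apply a change to self in place.
    mutating func move(_ adjustment: Adjustment)
}

protocol Layer: Movable {
    associatedtype Input: Movable
    associatedtype Output: Movable
    // Layer-internal data produced by the forward pass and consumed
    // again when computing adjustments.
    associatedtype AuxiliaryData

    // Plain forward pass.
    func apply(_ input: Input) -> Output

    // Forward pass that additionally records auxiliary data.
    func inspectableApply(_ input: Input) -> (result: Output, auxiliaryData: AuxiliaryData)

    // Translate a requested change to the output into a change to this
    // layer and a change to the input.
    func adjustment(input: Input, auxiliaryData: AuxiliaryData, gradient: Output.Adjustment)
        -> (adjustment: Adjustment, backprop: Input.Adjustment)

    // Like adjustment, but only computes the change to the input.
    func backprop(input: Input, auxiliaryData: AuxiliaryData, gradient: Output.Adjustment)
        -> Input.Adjustment
}

// Default implementations: only inspectableApply, adjustment, and move
// are mandatory.
extension Layer {
    func apply(_ input: Input) -> Output {
        inspectableApply(input).result
    }

    func backprop(input: Input, auxiliaryData: AuxiliaryData, gradient: Output.Adjustment)
        -> Input.Adjustment {
        adjustment(input: input, auxiliaryData: auxiliaryData, gradient: gradient).backprop
    }
}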
This is really all you need in order to implement backpropagation. Here’s what the individual methods do:
inspectableApply should produce a “result” equivalent to what the apply method returns. Additionally, it can produce auxiliary data. In the case of a loss function, this should be the derivative so it can be fed into backpropagation. In all other cases, the auxiliary data should be considered a layer-internal type that nobody except the layer itself cares about.
adjustment is called after you have produced a result using inspectableApply and received feedback on how “good” your output was. This feedback is expressed as the parameter called gradient, and you can think of it as a requested change to the output you delivered. The job of the adjustment method is to translate this requested change to the output into a change to this layer and a change to this layer’s input. Intuitively, you need to estimate how much a small change in the input or in the layer parameters would have changed the output, and then compare that to the requested change in the output. Often, one can work out a reasonable mathematical answer using tools with intimidating names like “calculus” or “derivative”.
move is responsible for actually carrying out the change.
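To make this concrete, here is a minimal hypothetical layer, written against the sketch above, that multiplies its input by a single learnable weight. Scale and the Float conformance are illustrative inventions, not part of DeepSwift:

// Make Float movable for illustration: its adjustment is just a Float
// that gets added on.
extension Float: Movable {
    mutating func move(_ adjustment: Float) { self += adjustment }
}

// A layer computing output = weight * input.
struct Scale: Layer {
    var weight: Float

    // The input itself is the auxiliary data: adjustment needs it again.
    func inspectableApply(_ input: Float) -> (result: Float, auxiliaryData: Float) {
        (weight * input, input)
    }

    // Chain rule: d(output)/d(weight) = input and d(output)/d(input) = weight,
    // so the requested output change maps to gradient * input for the layer
    // and gradient * weight for the input.
    func adjustment(input: Float, auxiliaryData: Float, gradient: Float)
        -> (adjustment: Float, backprop: Float) {
        (gradient * auxiliaryData, gradient * weight)
    }

    // Carry out the change to the weight.
    mutating func move(_ adjustment: Float) { weight += adjustment }
}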
The other methods have default implementations, but I recommend implementing at least backprop so you get a speedup when freezing this layer.
Note: The Layer protocol requires Input and Output to conform to Movable - that is, they have to provide a move method too. The gradient in adjustment/backprop will be of type Output.Adjustment, and the returned backprop value will have type Input.Adjustment.
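For the hypothetical Scale layer above, a dedicated backprop could skip computing the weight adjustment entirely, which is exactly the speedup you want for a frozen layer:

extension Scale {
    // Only the change to the input (Input.Adjustment, here a Float) is
    // computed; the weight adjustment is never touched.
    func backprop(input: Float, auxiliaryData: Float, gradient: Float) -> Float {
        gradient * weight
    }
}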
And here’s how the three mandatory methods work together:
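As a hedged sketch, a single training step with the hypothetical Scale layer could look like this:

var layer = Scale(weight: 1.0)
let input: Float = 2.0

// Forward pass, remembering the auxiliary data for later.
let (output, aux) = layer.inspectableApply(input)

// Suppose some loss function requests that the output move by -0.5.
let requestedChange: Float = -0.5

// Translate the output change into a layer change plus an input change.
let (change, inputChange) = layer.adjustment(input: input, auxiliaryData: aux, gradient: requestedChange)

// Carry out the layer change; inputChange would be handed to the
// previous layer as its gradient.
layer.move(change)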
If you want to implement pure layers, there’s a Function protocol that will not ask you to implement a move method. Instead, you are required to specify that the Adjustment type is NoAdjustment<Scalar> with an appropriate scalar type.
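For example, a pure squaring layer might look roughly like this; the exact requirements of Function are not shown in this text, so treat the conformance details as assumptions:

struct Square: Function {
    // No parameters to learn, hence no real adjustment type.
    typealias Adjustment = NoAdjustment<Float>

    func inspectableApply(_ input: Float) -> (result: Float, auxiliaryData: Float) {
        (input * input, input)
    }

    // d(x * x)/dx = 2x.
    func backprop(input: Float, auxiliaryData: Float, gradient: Float) -> Float {
        2 * auxiliaryData * gradient
    }
}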
While DeepSwift leaves the design of concrete layers and optimizers and the implementation of efficient numerics to downstream repos, it provides the high-level API you need so you can write:
struct MyModel : Learner {
    var body = SomeLayer() |> SomeOtherLayer() |> ThirdLayer()
}
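Here, |> chains the layers: presumably each layer’s output feeds the next layer’s input, and backpropagation runs through the chain in reverse.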
Note
This API is in its infancy and therefore subject to rapid changes.