This is another thing I learned while implementing convolutions for a convolutional neural net. See part 1 for the motivation.
- Part 1 is an introduction to the problem and how I used
- Part 2 is about
With `as_strided`, I have a 4D representation of the image: a 2D grid of 3 x 3 windows. I need to multiply each entry of those 3 x 3 matrices by its corresponding value in the kernel, then sum over those 9 products to create the value in the feature map.
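To make the setup concrete, here is a minimal sketch of such a 4D view and the multiply-and-sum written as plain loops. The 5 x 5 image, the kernel values, and the variable names other than `expanded_inputs` are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Hypothetical 5 x 5 input and 3 x 3 kernel, chosen only for illustration.
image = np.arange(25, dtype=np.float64).reshape(5, 5)
kernel = np.arange(9, dtype=np.float64).reshape(3, 3)

out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1

# View the image as an (out_h, out_w, 3, 3) array of overlapping windows,
# so that expanded_inputs[x, y, i, j] == image[x + i, y + j].
s0, s1 = image.strides
expanded_inputs = as_strided(
    image,
    shape=(out_h, out_w, 3, 3),
    strides=(s0, s1, s0, s1),
    writeable=False,  # the windows share memory, so keep the view read-only
)

# Baseline: multiply each window by the kernel and sum the 9 products.
feature_map = np.empty((out_h, out_w))
for x in range(out_h):
    for y in range(out_w):
        feature_map[x, y] = np.sum(expanded_inputs[x, y] * kernel)
```

The loops are only a reference point; the rest of the post is about replacing them with a single call.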
One way is to use `numpy.tensordot`. But there’s another tool: `np.einsum`.

When I was looking at StackOverflow for direction, a lot of the answers used it. It took me a while to look into it because it looked cryptic.
`np.einsum` looks pretty scary. Matrix multiplication becomes:

`np.einsum('ij,jk->ik', A, B)`
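A quick way to get comfortable with that call is to compare it with the `@` operator. This is a small sketch; the shapes of `A` and `B` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))  # axes labeled i, j
B = rng.standard_normal((3, 4))  # axes labeled j, k

# j appears in both inputs and not in the output: multiply along j, then sum.
C = np.einsum('ij,jk->ik', A, B)

# Should agree with ordinary matrix multiplication.
matches_matmul = np.allclose(C, A @ B)
```

The output has shape `(2, 4)`, matching the `ik` on the right of the arrow.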
`ij,jk->ik`, or the “Einstein sum subscripts string,” tells `einsum` what it should do.

The groups of letters are the operands and represent the arrays it should act on. `ij,jk->ik` is defining a little function `array1, array2 -> output`.

Each letter labels an axis. `ij` is labeling the two axes of `A`. I can read `ij,jk->ik` as “takes a 2D matrix, another 2D matrix, and returns a third 2D matrix.”
Then there are the rules:
- repeating a letter in more than one operand on the left-hand side of the arrow means multiply along those axes.
- omitting a letter from the right-hand side means sum over that axis.
- the order of the letters in the output is the order of the axes in the output array, so I can transpose too.
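Each rule can be tried in isolation. The sketch below uses two small made-up arrays and compares every `einsum` call with the plain NumPy expression it should match.

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)
B = np.arange(6.0).reshape(2, 3) + 1

# Rule 1: repeated letters on the left multiply matching entries.
elementwise = np.einsum('ij,ij->ij', A, B)   # compare with A * B

# Rule 2: a letter missing from the right-hand side is summed over.
row_sums = np.einsum('ij->i', A)             # compare with A.sum(axis=1)
total = np.einsum('ij,ij->', A, B)           # compare with (A * B).sum()

# Rule 3: the output letters set the axis order, so this is a transpose.
transposed = np.einsum('ij->ji', A)          # compare with A.T
```

Combining rules 1 and 2 in one string, as in `total`, is the pattern the convolution needs: multiply matching entries, then sum them away.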
tbh, what ended up working best was not thinking too hard: labeling my input axes and my output axes, then following the rules to build the subscripts string.
In this case, I wanted to multiply the last two dimensions of `expanded_inputs` by the corresponding values in the kernel. So `xyij,ij` will do that because `ij` is in both operands. The result should be of size `xy`. This gives me the function `xyij,ij->xy`.
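As a quick check of the `xyij,ij->xy` form, here is a sketch with a randomly filled stand-in for `expanded_inputs` (the shapes are assumptions). It compares the `einsum` result with `numpy.tensordot` and with a broadcast multiply followed by a sum.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the strided view: axes are (x, y, i, j).
expanded_inputs = rng.standard_normal((4, 5, 3, 3))
kernel = rng.standard_normal((3, 3))  # axes are (i, j)

# Multiply along i and j (repeated letters), then sum them away (omitted on the right).
feature_map = np.einsum('xyij,ij->xy', expanded_inputs, kernel)

# Two equivalent formulations for comparison.
via_tensordot = np.tensordot(expanded_inputs, kernel, axes=([2, 3], [0, 1]))
via_broadcast = (expanded_inputs * kernel).sum(axis=(2, 3))
```

All three produce an array of shape `(4, 5)`; the `einsum` version is the only one where the axis bookkeeping is written out by name.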
Higher dimensional tensors are no big deal
I actually needed to do this with even larger tensors, with dimensions for the items in a minibatch, the number of input feature maps, and output feature maps. Labeling the dimensions and following the rules made this a little less of a headache. For example, getting gradients during back propagation looked something like this:
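A sketch of what this can look like, under assumed axis labels: `b` for items in the minibatch, `c` for input feature maps, `f` for output feature maps, `x`/`y` for output positions, and `i`/`j` for positions inside a window. The labels, shapes, and variable names here are illustrative assumptions, not the code from the original project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layouts (hypothetical sizes):
#   expanded_inputs: b c x y i j
#   kernels:         f c i j
expanded_inputs = rng.standard_normal((2, 3, 4, 5, 3, 3))
kernels = rng.standard_normal((6, 3, 3, 3))

# Forward pass: c, i, j are missing from the output, so they are summed over.
output = np.einsum('bcxyij,fcij->bfxy', expanded_inputs, kernels)

# Stand-in for the gradient arriving from the next layer (same shape as output).
grad_output = rng.standard_normal(output.shape)

# Gradient with respect to the kernels: b, x, y are summed over,
# leaving one value per kernel entry (f c i j).
grad_kernels = np.einsum('bfxy,bcxyij->fcij', grad_output, expanded_inputs)
```

Because the forward pass is linear in `kernels`, a cheap sanity check is that `np.sum(grad_kernels * kernels)` equals `np.sum(grad_output * output)`.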