I don't think you are comparing like with like.
In the left-most panel of the first image, you are seeing the weights in each kernel (one channel from one convolutional layer). These are yellow in the figure below. The size of the kernels is set by the hyperparameters of the network; they might be 3 × 3 or 31 × 31, for example. I'm not 100% certain about the other two panels; the right-most looks more like the output of a convolution (feature maps) than like filters.
In the second image, you are certainly looking at the activations produced by a particular input example. These images are the result of convolving the input with the kernels; this part is pink in the figure below. Their size depends on the input image size, the kernel size, and the convolution parameters (such as stride and padding).
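To make the size difference concrete, here is a minimal sketch of the standard output-size arithmetic for a convolution along one spatial dimension. The function name `conv_output_size` and the example input sizes (32 and 224) are my own illustrative choices, not taken from your images or the article.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial size of a convolution output along one dimension."""
    # A dilated kernel covers dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (input_size + 2 * padding - effective_kernel) // stride + 1


# The kernel size is a fixed hyperparameter (3 here, chosen for illustration).
kernel_size = 3

# The activation size changes with the input size (hypothetical inputs).
for input_size in (32, 224):
    out = conv_output_size(input_size, kernel_size)
    print(f"input {input_size}x{input_size}, "
          f"kernel {kernel_size}x{kernel_size}, "
          f"activation {out}x{out}")
```

The kernel stays 3 × 3 whatever you feed the network, while the activation map grows or shrinks with the input, which is why the two kinds of image look so different.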
From the article you linked to:
