
We were discussing universal approximation theorems for neural networks and showed that the triangular function

$$ h(x) = \begin{cases} x+1, & x \in [-1,0] \\ 1-x, & x \in [0,1] \\ 0, & \text{otherwise} \end{cases} $$

can be written using the ReLU function $p(x) = \max(0, x)$ as

$$ h(x) = p(x+1) - 2p(x) + p(x-1). $$
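For reference, this identity can be checked piece by piece over the four relevant regions:

$$ p(x+1) - 2p(x) + p(x-1) = \begin{cases} 0, & x \le -1 \\ (x+1) - 0 + 0 = x+1, & -1 \le x \le 0 \\ (x+1) - 2x + 0 = 1-x, & 0 \le x \le 1 \\ (x+1) - 2x + (x-1) = 0, & x \ge 1. \end{cases} $$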

The next step was to prove that any continuous function $f:[0,1] \to \mathbb{R}$ can be uniformly approximated by a shallow neural network with ReLU activation. Our instructor just referred to Faber–Schauder approximations and did not give a detailed proof, so I am left a bit confused as to how we would go about showing this.

My questions:

  1. Why can any piecewise linear (Faber–Schauder-type) approximation be represented as a sum of ReLU functions? (See the sketch after this list for the kind of representation I have in mind.)
  2. Why can any continuous function then be uniformly approximated by such a Faber–Schauder approximation?
  3. Is the restriction to $[0,1]$ essential, or just a convention?
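To make question 1 concrete, here is the kind of representation I have in mind (this is only my guess at the intended form, not something we proved in class): if $g:[0,1] \to \mathbb{R}$ is continuous and piecewise linear with knots $0 = x_0 < x_1 < \dots < x_n = 1$ and slope $s_k$ on $[x_k, x_{k+1}]$, then I would expect

$$ g(x) = g(0) + \sum_{k=0}^{n-1} c_k\, p(x - x_k), \qquad c_0 = s_0, \quad c_k = s_k - s_{k-1} \ \ (k \ge 1), $$

i.e. one hidden ReLU unit per knot, with coefficients given by the slope changes. I do not see how to prove this rigorously (nor the analogous statement with the triangle functions $h$ of the Faber–Schauder system).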

Any rigorous explanation or reference would be greatly appreciated.
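Not a proof, of course, but here is a minimal numerical sanity check of the construction I have in mind (the uniform grid $x_k = k/n$, the test function $\sin(2\pi x)$, and the slope-change coefficients are my own choices, not something from the lecture):

```python
# Sanity check: build the piecewise linear interpolant of a continuous f on [0,1]
# at the nodes k/n out of shifted ReLUs and watch the sup error shrink with n.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_interpolant(f, n, x):
    """Piecewise linear interpolant of f at nodes k/n, written as
    g(x) = f(0) + sum_k c_k * relu(x - k/n), where c_k are the slope changes."""
    nodes = np.arange(n + 1) / n
    vals = f(nodes)
    slopes = np.diff(vals) * n            # slope s_k on [k/n, (k+1)/n]
    c = np.diff(slopes, prepend=0.0)      # c_0 = s_0, c_k = s_k - s_{k-1}
    g = np.full_like(x, vals[0])
    for k in range(n):
        g += c[k] * relu(x - nodes[k])
    return g

f = lambda t: np.sin(2 * np.pi * t)       # any continuous test function on [0,1]
x = np.linspace(0.0, 1.0, 2001)
for n in (4, 16, 64):
    err = np.max(np.abs(f(x) - relu_interpolant(f, n, x)))
    print(f"n = {n:3d}, sup error = {err:.4f}")
```

The sup error appears to go to zero as $n$ grows, which is exactly the statement I would like to see proven rigorously.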

  • Comment: Proofs using approximations other than Faber–Schauder are also welcome!
