We know that optimization techniques search the space of all possible parameters for a parameter set that minimizes the model's cost function. The most well-known loss functions, such as MSE or categorical cross-entropy, have a global minimum value of zero in the ideal case.
For example, gradient descent, $\theta_j \leftarrow \theta_j - \alpha \frac{\partial}{\partial \theta_j}J(\theta)$, updates the parameters based on the derivative of the cost function value $J(\theta)$.
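
To make the update rule concrete, here is a minimal sketch (the data, learning rate, and one-parameter model $h(x) = \theta x$ are my own choices, not part of the question) of gradient descent on MSE:

```python
import numpy as np

# Toy data: the ideal fit is theta = 2, where the MSE reaches its global minimum of 0.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

theta = 0.0   # arbitrary initial parameter
alpha = 0.05  # learning rate

for step in range(100):
    residual = theta * x - y
    cost = np.mean(residual ** 2)         # J(theta) = MSE
    grad = 2 * np.mean(residual * x)      # dJ/dtheta
    theta = theta - alpha * grad          # the update rule above

print(theta, cost)  # theta approaches 2, and the cost approaches 0
```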
I was wondering what would happen if we designed a cost function whose global minimum is non-zero in the ideal case. Does it make a difference, e.g., to the convergence rate or other aspects of the optimization process, or not?
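
As a concrete instance of what I mean (the shift of $+1$ is just an arbitrary example I picked), here is the same MSE from the sketch above, offset so that its global minimum is 1 instead of 0:

```python
import numpy as np

# Same toy data and one-parameter model as in the sketch above.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

def shifted_cost(theta):
    # MSE plus a constant: the global minimum value is 1, not 0.
    return np.mean((theta * x - y) ** 2) + 1.0

def shifted_grad(theta):
    # Derivative of the shifted cost with respect to theta
    # (the added constant has derivative zero).
    return 2 * np.mean((theta * x - y) * x)

theta, alpha = 0.0, 0.05
for step in range(100):
    theta -= alpha * shifted_grad(theta)

print(theta, shifted_cost(theta))
```

Would training on `shifted_cost` behave any differently from training on the original MSE?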

