I am trying to understand the math behind logistic regression. Working through several websites, lectures, and books, I derived the cost function as the negative log-likelihood. My derivation matches the cost function shown on this Wikipedia page https://en.wikipedia.org/wiki/Logistic_regression and in other places.
If the inputs are $x^{(i)}$ and outputs are $y^{(i)}$, where $(i)$ refers to the $i$th data point, then the cost as a function of weights $w$ seems to be
$$-\sum_{i=1}^m \left[ y^{(i)} \log\left(\frac{1}{1 + e^{-w^Tx^{(i)}}}\right)+\left(1-y^{(i)} \right) \log \left( 1-\frac{1}{1 + e^{-w^Tx^{(i)}}} \right) \right] $$
I can simplify this further to $$\sum_{i=1}^m -y^{(i)} w^Tx^{(i)} +\log \left(1+e^{w^Tx^{(i)}}\right)$$ However, the expression shown in the scikit-learn user guide https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression is $$\sum_{i=1}^m \log \left( 1 + e^{-y^{(i)}w^Tx^{(i)}}\right) $$
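To make sure my simplification is right, I checked it numerically: with $z = w^Tx$ and labels in $\{0,1\}$, the cross-entropy sum and the form $\sum -y z + \log(1+e^{z})$ agree on random data (the random $z$ values stand in for $w^Tx^{(i)}$; I am not fitting an actual model here):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=100)          # z = w^T x for 100 hypothetical samples
y = rng.integers(0, 2, size=100)  # labels in {0, 1}

sigma = 1.0 / (1.0 + np.exp(-z))  # logistic function

# Cross-entropy cost, written out as in the Wikipedia formulation
cost = -np.sum(y * np.log(sigma) + (1 - y) * np.log(1 - sigma))

# Simplified form: sum of -y*z + log(1 + e^z)
cost_simplified = np.sum(-y * z + np.log1p(np.exp(z)))

assert np.allclose(cost, cost_simplified)
```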
I have tried some algebra but cannot derive their formulation from mine. Am I missing something? It is quite possible that I haven't tried all the tricks there are for simplifying the expression.
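For concreteness, plugging the same $\{0,1\}$ labels I used in my derivation into both sums gives different values on random data, which is consistent with my failure to equate them algebraically (again, random $z$ values stand in for $w^Tx^{(i)}$):

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(size=50)
y = rng.integers(0, 2, size=50)   # labels in {0, 1}, as in my derivation

# My simplified cost: sum of -y*z + log(1 + e^z)
my_cost = np.sum(-y * z + np.log1p(np.exp(z)))

# The expression from the scikit-learn guide: sum of log(1 + e^{-y*z})
sklearn_style = np.sum(np.log1p(np.exp(-y * z)))

# With 0/1 labels the two sums do not agree
assert not np.isclose(my_cost, sklearn_style)
```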