The sensor may be illuminated by many different light sources other than the intended one, such as the sun, glare from a window or artificial light flickering at 50/60Hz. If the receiver "sees" ambient light, it might confuse it with light from the intended source, an unbroken beam. To distinguish the intended source from ambient light, the beam is switched at 1kHz, a frequency very different from other natural or artificial sources of "noise". A "safe" condition can only be assumed if the signal frequency is near 1kHz.
The sensor source impedance \$R_i\$ was absent from your simulation, but it is essential because it forms a low-pass filter with the 10nF capacitor C1. The cut-off frequency of such a filter is:
$$ f_{LP} = \frac{1}{2\pi RC} $$
\$R_i\$ changes depending on signal state, but the two possible frequencies would be:
$$ \begin{aligned} f_{LP1} &= \frac{1}{2\pi R_iC_1} \\ \\ &= \frac{1}{2\pi \times 100 \times 10nF} \\ \\ &= 160kHz \\ \\ f_{LP2} &= \frac{1}{2\pi \times 560 \times 10nF} \\ \\ &= 28kHz \end{aligned} $$
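To make it easy to try other values of \$R_i\$, the arithmetic above can be reproduced with a few lines of Python. This is only a sketch; the helper name `rc_cutoff_hz` is mine, and the 100Ω and 560Ω figures are the two source impedances used above.

```python
import math

def rc_cutoff_hz(resistance_ohm, capacitance_f):
    """Cut-off (-3 dB) frequency of a first-order RC filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)

C1 = 10e-9  # 10 nF, from the schematic

# The two source impedances assumed in the text above
f_lp1 = rc_cutoff_hz(100, C1)
f_lp2 = rc_cutoff_hz(560, C1)

print(f"f_LP1 = {f_lp1 / 1e3:.0f} kHz, f_LP2 = {f_lp2 / 1e3:.0f} kHz")
# prints: f_LP1 = 159 kHz, f_LP2 = 28 kHz
```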
If the signal is indeed 1kHz, then it would make more sense to have a much lower cut-off frequency than 28kHz. These values tell me that this filter is probably intended to suppress general noise/pick-up from radio or local appliances like switching LED lighting supplies.
The second stage consists of \$C_2=100nF\$ and \$R_i+R_1+R_2=R_i + 10k\Omega + 4.7k\Omega\$, forming a high-pass filter. Noting that \$R_1\$ and \$R_2\$ are very large compared to \$R_i\$:
$$ \begin{aligned} f_{HP} &= \frac{1}{2\pi C_2(R_i+R_1+R_2)} \\ \\ &\approx \frac{1}{2\pi \times 100nF \times 15k\Omega} \\ \\ &\approx 100Hz \\ \\ \end{aligned} $$
Only signals at frequencies significantly greater than 100Hz will make it past this stage unattenuated, which includes our 1kHz pulses. Everything else is considered noise, from sources such as 50/60Hz mains pickup, light flickering, or power supply fluctuations. Being a high-pass filter, it removes any DC offset in the source signal, and the signal now oscillates symmetrically above and below 0V ground.
This satisfies the requirement that a steady or slow-changing light source will not make it through this stage, and you won't register a "safe" condition in error. The only dominant signal to get this far should be the 1kHz you expect.
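To put rough numbers on "significantly greater than 100Hz", here is a small Python sketch that evaluates the magnitude response of an ideal first-order high-pass filter at a few frequencies of interest. It assumes \$R_i\$ is negligible and ignores the clamps and the divider, so it describes only the linear filter, not the whole stage. The function name `highpass_gain` is my own.

```python
import math

def highpass_gain(f_hz, f_cutoff_hz):
    """Magnitude response of an ideal first-order RC high-pass filter."""
    ratio = f_hz / f_cutoff_hz
    return ratio / math.sqrt(1.0 + ratio * ratio)

C2 = 100e-9              # 100 nF
R_TOTAL = 10e3 + 4.7e3   # R1 + R2, with R_i neglected (assumption)
f_hp = 1.0 / (2.0 * math.pi * C2 * R_TOTAL)

# Mains frequencies, their second harmonics, and the 1 kHz beam
gains = {f: highpass_gain(f, f_hp) for f in (50, 60, 100, 120, 1000)}

print(f"cut-off = {f_hp:.0f} Hz")
for f, g in gains.items():
    print(f"{f:>5} Hz: gain {g:.2f}")
```

The 1kHz signal passes almost untouched, while 50Hz is reduced to roughly 0.4 of its amplitude. A first-order filter rolls off gently, so components close to the cut-off are only partly attenuated.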
The base-emitter junction of the transistor is a diode that clamps base potential \$V_B\$ to +0.7V or lower. Another diode is connected in anti-parallel to that junction, clamping \$V_B\$ to −0.7V or greater. This "lower clamp" is necessary for two reasons:
1. It keeps the waveform symmetrical, so that its average is zero, and it spends as much time above zero as below. Without a lower clamp, negative excursions would be very deep compared to the +0.7V maximum, shifting the signal's average potential, which ideally we want to be in the middle.
2. The transistor's base-emitter junction behaves like a zener diode, which can break down and become highly conductive when sufficiently reverse biased. By ensuring \$V_B>-0.7V\$, that can never happen.
\$V_B\$ oscillates symmetrically above and below 0V, constrained to \$-0.7V < V_B < +0.7V\$.
You'll notice that \$R_1\$ and \$R_2\$ form a potential divider, with attenuation \$\frac{R_2}{R_1+R_2}\approx \frac{1}{3}\$. The position of \$R_1=10k\Omega\$ is important; the diode clamps will become like a short-circuit to ±0.7V as the source signal extends outside the bounds −0.7V to +0.7V. \$R_1\$ limits current in such cases, relieving the sensor of such a heavy load, and permitting the sensor voltage to continue to oscillate unhindered over its full range.
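The divider ratio is quick to check numerically. This sketch assumes the divider is unloaded, meaning the clamps are not conducting and the base draws negligible current; `divider_ratio` is just an illustrative helper.

```python
def divider_ratio(r_series_ohm, r_shunt_ohm):
    """Unloaded voltage-divider ratio V_out / V_in."""
    return r_shunt_ohm / (r_series_ohm + r_shunt_ohm)

R1 = 10e3   # series resistor
R2 = 4.7e3  # shunt resistor

ratio = divider_ratio(R1, R2)
print(f"attenuation = {ratio:.2f}")  # prints: attenuation = 0.32
```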
The attenuation of signal potential by \$\frac{1}{3}\$ seems to be a necessary and unfortunate by-product of the need for 15kΩ total, while the larger resistance of the two must appear prior to the clamps. Your +3.8V pulse is attenuated to the point that the base reaches the +0.7V required to saturate the transistor, but quickly decays to below that threshold. As you noted, raising \$R_2\$ to 10kΩ improves the situation, by attenuating less. However, \$R_1+R_2\$ has increased to 20kΩ, which lowers the cut-off frequency. You should consider decreasing C2 to keep the cut-off near 100Hz. Personally I think you should increase cut-off to more like 200Hz, to better attenuate any 100Hz or 120Hz ripple from a linear power supply:
$$ \begin{aligned} 200Hz &= \frac{1}{2\pi C_2(R_i+R_1+R_2)} \\ \\ C_2 &\approx \frac{1}{2\pi \times 200Hz \times 20k\Omega} \\ \\ &\approx 39nF \\ \\ \end{aligned} $$
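If you want to experiment with other resistor values or target frequencies, the same calculation can be scripted. The sketch below neglects \$R_i\$, as above, and snaps the result to the nearest E12 preferred value; the helper names are hypothetical, and E12 is my assumption about which capacitor series you would buy from.

```python
import math

# E12 preferred-value mantissas
E12 = (1.0, 1.2, 1.5, 1.8, 2.2, 2.7, 3.3, 3.9, 4.7, 5.6, 6.8, 8.2)

def cutoff_hz(resistance_ohm, capacitance_f):
    """Cut-off frequency of a first-order RC filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)

def capacitor_for_cutoff(f_cutoff_hz, resistance_ohm):
    """Capacitance that gives the requested cut-off with this resistance."""
    return 1.0 / (2.0 * math.pi * f_cutoff_hz * resistance_ohm)

def nearest_e12(value):
    """Closest E12 value, searching the value's decade and the next one up."""
    decade = 10.0 ** math.floor(math.log10(value))
    candidates = [m * decade for m in E12] + [10.0 * decade]
    return min(candidates, key=lambda c: abs(c - value))

R_TOTAL = 10e3 + 10e3  # R1 + R2 after raising R2 to 10k, R_i neglected

f_unchanged = cutoff_hz(R_TOTAL, 100e-9)        # C2 left at 100 nF
c2_exact = capacitor_for_cutoff(200.0, R_TOTAL)
c2_standard = nearest_e12(c2_exact)
f_actual = cutoff_hz(R_TOTAL, c2_standard)

print(f"C2 unchanged: {f_unchanged:.0f} Hz")
print(f"C2 for 200 Hz: {c2_exact * 1e9:.1f} nF, nearest E12: {c2_standard * 1e9:.0f} nF")
print(f"resulting cut-off: {f_actual:.0f} Hz")
```

With C2 left at 100nF the cut-off drops to about 80Hz, which is why the capacitor needs to change along with \$R_2\$.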

Calculating base current is quite difficult, so a simulation will help:

It peaks at more than \$I_B=100\mu A\$, and falls rapidly, due to the high-pass filter. It will be useful to know what collector current \$I_C\$ would flow when the transistor is saturated:
$$ \begin{aligned} I_C &= \frac{3.3V}{R_3} \\ \\ &= 330\mu A \\ \\ \end{aligned} $$
Assuming transistor \$\beta=100\$, base current required for saturation would be:
$$ I_{B(SAT)} = \frac{I_{C(SAT)}}{\beta} = 3.3\mu A$$
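Here is the same saturation estimate as a short Python sketch, so you can vary \$\beta\$ or \$R_3\$. Note that \$R_3=10k\Omega\$ is inferred from the 330μA figure above rather than read from the schematic, and the 100μA peak is the approximate value from the simulation.

```python
V_SUPPLY = 3.3      # volts across R3 when the transistor is saturated
R3 = 10e3           # inferred from I_C = 330 uA (assumption)
BETA = 100          # assumed current gain
I_B_PEAK = 100e-6   # approximate simulated peak base current

i_c_sat = V_SUPPLY / R3        # collector current at saturation
i_b_sat = i_c_sat / BETA       # minimum base current to saturate
overdrive = I_B_PEAK / i_b_sat # how far the peak exceeds that minimum

print(f"I_C(sat) = {i_c_sat * 1e6:.0f} uA")
print(f"I_B(sat) = {i_b_sat * 1e6:.1f} uA")
print(f"overdrive factor at peak = {overdrive:.0f}")
```

The peak base current exceeds the saturation requirement by a factor of about 30.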
\$I_B=100\mu A\$ is clearly more than sufficient, and the transistor will remain saturated until \$I_B\$ has fallen to 3μA or so, explaining why the output remains planted near 0V for almost 50% of the cycle:

Having said all that, the original circuit is behaving (in your simulation) exactly as designed and expected (though you did not include \$R_i\$). It's not great though, as you saw. The two filters naturally retard signal rise and fall, and tend to soften sharp corners. The signals you show are perfectly acceptable digital signals, if a little sluggish. The biggest worry would be a signal that doesn't make it all the way to the top/bottom before returning.
The duty-cycle is not 50%:50% because the transistor only begins to switch on when its base rises beyond +0.6V or so. Since the base waveform is centered on 0V, oscillating between ±0.7V, the transistor spends most of its time off (output high). This is not a problem, because as I explained before, the characteristic of this signal which one interprets to indicate "safe" is the presence of changes. The duty cycle of those changes is irrelevant. Whatever hardware/software is used to interpret the output, it should detect changes, and as long as it sees at least 2 changes each millisecond, then there's nothing blocking the light beam.
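As an illustration of "detect changes, ignore duty cycle", here is a minimal Python sketch that counts level changes in each 1ms window of a sampled digital output. Everything in it is hypothetical: the function names, the 100kHz sample rate and the 30% low time are my assumptions, and a real detector would need some tolerance for jitter and for edges that land on a window boundary.

```python
def count_edges_per_window(samples, samples_per_window):
    """Count level changes inside each complete window of samples."""
    n_windows = len(samples) // samples_per_window
    counts = [0] * n_windows
    for i in range(1, n_windows * samples_per_window):
        if samples[i] != samples[i - 1]:
            # An edge between samples i-1 and i belongs to the window holding i
            counts[i // samples_per_window] += 1
    return counts

def beam_is_clear(samples, sample_rate_hz, min_edges_per_ms=2):
    """True only if every 1 ms window contains enough level changes."""
    samples_per_ms = sample_rate_hz // 1000
    counts = count_edges_per_window(samples, samples_per_ms)
    return bool(counts) and all(c >= min_edges_per_ms for c in counts)

SAMPLE_RATE_HZ = 100_000  # assumed sampling rate: 100 samples per millisecond

# 10 ms of a 1 kHz output that is low for 30% of each cycle (assumed duty),
# offset by 10 samples so that edges do not coincide with window boundaries
pulsing = [0 if (i + 10) % 100 < 30 else 1 for i in range(1000)]

# 10 ms with the beam blocked: the output just sits high
blocked = [1] * 1000

print(beam_is_clear(pulsing, SAMPLE_RATE_HZ))  # True
print(beam_is_clear(blocked, SAMPLE_RATE_HZ))  # False
```

The check depends only on the number of edges per window, so the asymmetric duty cycle has no effect on the result.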
automatically opening/closing doors to avoid hard collisions that may result in damage or injuries, are there any functional safety requirements to be met for the PWM logic level shift?