So my question is, how to prevent this problem? How to ensure that rogue messages sent from other users do not trigger my receiver and cause unwanted actions or taint the data I am collecting?
You identified the problem.
Luckily, depending on whether you use 3, 4 or 5 byte long addresses, there's 23·8 = 16.8 million, 24·8 = 4.29 billion, or 25·8 = 11.0 trillion addresses. So, if you picked your addresses equally likely from all possible addresses, the probability of an address conflict happening unintentionally is really damn near to 0: It's the birthday problem, i.e. when you have \$2^k\$ possible addresses, all addresses are equally likely chosen to be used by a device, and \$n\$ devices in your area, then the probability of all randomly chosen address being distinct is
$$\begin{align} P_{\text{no addr conflict}} &= \frac{2^k!}{{2^k}^n\cdot(2^k-n)!}\\ &=\frac{(2^k-n)\cdot(2^k-(n-1))\cdot(2^k-(n-2))\cdot\ldots\cdot (2^k-0) }{{2^k}^n}\\ &= \underbrace{\frac{2^k-n}{2^k}}_{\text{very close to }1} \cdot\underbrace{\frac{2^k-(n-1)}{2^k}}_{\text{ very very close to }1} \cdot\underbrace{\frac{2^k-(n-2)}{2^k}}_{\text{ damn near }1}\cdots \underbrace{\frac{2^k-(n-n)}{2^k}}_{=1}\\ &\approx 1\\ &\forall n \ll 2^k , \end{align}$$
which, for realistic values of \$n\$, and \$k=40\$, is really really practically "1". This is a very practical "practical": in fact, if you want to, say, press the probability of this happen to below 1 in 10000, you'd only need about 15000 active devices on the same channel to break that probabilistic bound. However, considering something else: at the point where any receiver can "hear" 15000 other devices, chances are the system you're trying to build suffers from something else: on-the-air contention, where simultaneously sent packets collide with each other¹.
That's why I have my doubts with your initial assertion that you can receive these devices for kilometers: it would be stupid for almost all users of them to operate them at maximum transmit power. Generally, if you want to build a many-devices low power wide area network (LPWAN), then these aren't the kind of devices you'll ever want to work with – they are meant for a computer mouse to communicate with a computer, not for 10000 street lights to be controlled by a city, or for 4000 electric scooters to be controlled by a venture-capital fueled loss leader company.
This means you either "production-side" program the addresses randomly chosen with a good random number generator (not UNIX rand, [but almost anything else]), or your devices have their own implementation of the random number generation (modern microcontrollers tend to even have built-in true random generators!), and a "pairing" address. That pairing address would be static, but your users would need to manually initiate the pairing process by pressing a button or something, then there would only be a short time window in which someone might interfere, and if you can, you'd usually set the sensitivity of your receiver low and reduce the transmit power, so that this really only works with closely co-located devices.
Now, that doesn't inhibit intentional meddling by someone else observing your packets and then "squatting" your address. There's no help against that but to authenticate your packets; an HMAC is the usual way to go about that.
¹ It's a common method to model this collision probability very similar to a address collision probability: say the average burst is 100 bit long, and you're using the 1 Mbps mode, then the transmitter is 100 µs active on air. Assuming every transmitter needs to send a packet every hour, we can "chunk up" this hour (= 3600s) into 100 µs slots, of which every burst occupies (because it's very unlikely that a packet "aligns" perfectly with the slot), thus effectively halving the available "slot space" from 3.6·10⁹ slots to 1.8·10⁹ slots. Now, as you'll notice, 1.8·10⁹ is already much less than 232 or 240; so, address collision is less of a concern than transmit collision. And hence, yeah, for all practical purposes, 32 bit addresses solve that for systems with this 1µs/3600s duty cycle, because it's much less likely that two transmitters collide on addresses than on transmit time, and when it's much more likely that they collide on transmit time, you need to care about that, because re-transmission can have cascading effects, bringing this slotted ALOHA with retransmissions system to its knees long before address collisions become likely.