$\begingroup$

I have a time series of events that pass through a black box over a period of time. Each event comes out either True or False, and I need to determine from these outcomes whether the black box exhibits seasonality. The black box may or may not go through periods in which it is more or less likely to make an event False, and the probability may differ from one period to another.

For example, how would I determine whether the black box is using the following equation, where $y$ is the probability that the event at time $x$ is True?

$$y = \frac{\sin\left(\frac{1}{x^{m}}\right)}{n}$$

How would I go about finding this using regular samples of $x$?

What I have so far:

I started with simpler cases, such as $y = nx$. There I can calculate the proportion of True events within a rolling window and see that this windowed probability begins to trend. But for waveforms or irregular functions I am not certain this works. For example, if I set my black box to $y = \sin(x)$, I may be able to find the pattern with a rolling average, but only if the window has exactly the right size.
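The rolling-window idea above can be sketched as follows. This is only a minimal illustration in plain Python: the function names `rolling_rate` and `window_noise_sd` and the short sample string are hypothetical, not part of any library or of the real data set. The second function gives the standard deviation a window rate would have if every event were an independent draw with the same probability, which is one rough yardstick for deciding whether a wiggle in the rolling rate is more than noise.

```python
import math

def rolling_rate(bits, window):
    """Fraction of 1s in each full window of length `window`."""
    if window <= 0 or window > len(bits):
        raise ValueError("window must be between 1 and len(bits)")
    total = sum(bits[:window])
    rates = [total / window]
    for i in range(window, len(bits)):
        # Slide the window by one: add the new bit, drop the oldest one.
        total += bits[i] - bits[i - window]
        rates.append(total / window)
    return rates

def window_noise_sd(bits, window):
    """Std. dev. of a window rate IF events were independent with constant p."""
    p = sum(bits) / len(bits)
    return math.sqrt(p * (1 - p) / window)

# Short illustrative sample only (not the full data set).
sample = [int(c) for c in "1111111011101010111000110101"]
rates = rolling_rate(sample, window=7)
noise = window_noise_sd(sample, window=7)
```

Neighbouring windows overlap, so consecutive values of `rates` are strongly correlated even for independent events; a smooth-looking curve is not evidence of a trend by itself.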

But when it comes to irregular waveforms like the example above, or a combination of several, I am not sure what would help. Something like SARIMA(X) adapted to Bayesian/binary data (does that even exist?) is a predictive model, and I don't know whether I can use it to determine if there is "seasonality" or an overall trend in this black box.

I'm aware of periodograms but don't know exactly how to use them in this case; maybe with some sort of moving average?
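One possible way to apply a periodogram to 0/1 data directly, without a moving average, is to subtract the mean and compute the squared magnitude of the discrete Fourier transform at each frequency. The sketch below uses only the standard library; the function name `periodogram` and the synthetic square-wave input are assumptions made for illustration, and the direct sum is slow for long series (an FFT routine would be the usual choice).

```python
import cmath
import math

def periodogram(bits):
    """Return [(k, power)] for Fourier frequencies k/n, k = 1..n//2."""
    n = len(bits)
    mean = sum(bits) / n
    centered = [b - mean for b in bits]  # remove the constant (zero-frequency) part
    result = []
    for k in range(1, n // 2 + 1):
        # Direct DFT coefficient at frequency k/n cycles per sample.
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        result.append((k, abs(coeff) ** 2 / n))
    return result

# Synthetic stand-in data (NOT the machine's data): a square wave of period 6.
demo_bits = [1 if t % 6 < 3 else 0 for t in range(120)]
spectrum = periodogram(demo_bits)
peak_k, peak_power = max(spectrum, key=lambda kp: kp[1])
peak_period = len(demo_bits) / peak_k  # period measured in samples
```

With this normalization, independent events with constant probability $p$ give ordinates that scatter around $p(1-p)$, so a peak is only interesting if it stands well above that level. Because many frequencies are inspected at once, the largest ordinate should be judged with that multiplicity in mind.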

To summarize: I have a theoretical black box that may be independent or may be trending/seasonal, and I have time-series data of True or False outcomes from it.

Is there anything I could use to analyze this data to determine whether it is independent?
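One standard check of this kind for a binary sequence is the Wald-Wolfowitz runs test, which compares the number of runs (maximal blocks of equal symbols) with what would be expected if the 1s and 0s were arranged at random. The sketch below is a minimal version using the normal approximation; the function name `runs_test` is a hypothetical helper, and the alternating demo sequence is synthetic.

```python
import math

def runs_test(bits):
    """Wald-Wolfowitz runs test; returns (runs, expected_runs, z, two_sided_p)."""
    n1 = sum(bits)
    n0 = len(bits) - n1
    n = n1 + n0
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one 0 and one 1")
    # A new run starts wherever two neighbouring symbols differ.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mu = 2 * n1 * n0 / n + 1
    var = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n ** 2 * (n - 1))
    if var <= 0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - mu) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return runs, mu, z, p_value

# Synthetic demo: a perfectly alternating sequence has far too many runs.
demo = [0, 1] * 5
runs, expected, z, p_value = runs_test(demo)
```

Too few runs suggests clumping (for example, slow drifts in the probability), and too many suggests alternation. A non-significant result does not prove independence; the test is only sensitive to some kinds of dependence.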

I am NOT a mathematician, nor a professional whose work involves this type of (or any) math, so please edit this question if I have used terms incorrectly or described things too verbosely.

EDIT: @EthanBolker asked me to provide some data and some details about its source. Below is an example of the (latest) data that I have, taken at even intervals: a little over 1000 points, each either 1 (True) or 0 (False). The data is the rate of "error" of a machine running in cycles. The machine has about 3 "phases" of operation per day; it is restarted for 5 minutes every 24? (unsure) cycles, runs 24/7, and completes one cycle every 1213500 ms (about 20 minutes). It has a single output that is clearly good or bad, and 14 constituent machines, each with its own rate of failure, that cannot be tested independently. The constituents are restarted at different intervals: some on the order of weeks, days, or hours, and others irregularly. Testing for periodicity or seasonality would let me narrow down the possible causes of these errors, but I am unable to think of how to even begin. This machine has been running for over 10 years and has run about 286887 events so far.

111111101110101011100011010111111011111010001010000000001110110001000100100100010101100101001000000000001001001001010101000010100011001011011010000100111011011011111110100100100000111110100101101010010101010110101011111111110000101010010000101010101010101000011100011010010010010111111100011001001000000010000000010100010001000001101011011011011110011010010000111100100101001001011111010101011011011101011110010000110100101000111100100101001001011001100101101101101101010011001111101001101011001100111111000101000110010010011111010001000101001000010111000111010011001100111111010010011001101111111000100001111011011101001001000111101010011010011000111100101000010111010011111111010101011011101001011011111000000111110011111101001000001111100101100100100000000011010010010010010001101101101011011011110110101001011101111101001011000111101101110001100101101010101001010010101001101000111010011100110100100100000000111011001011100101101011101101111111000100000111011101011001001010000000001111101101111101010010011111111000010000111011011011101010100100101001010010100001110101111010101101011011000110111001011010011010111000110001101000100101001000000111000111010011011010010010000010011010010000001101101011110111000101001010101001101101111010001000011001110000011000010000111010111111001011011010010110101111101001000110010001010100110011101101001111111000011011101000010010100010010000000010110111000001101111011001101011011111111000011011101111111101001001010000111111111011110010000000111110000111111101011111000000111111100110100000000000000011101011101011000110110101101111000100011011001011011011111101010000111111101111000000111010101100100000001011011011110111101010000101101010110011110110101101101111101001101001101011010000011011101100000001000111101001000011111010101100010100101010010000011100111101111001111001111011010000110101111010000011010010110101101110011110011010111011111010000110110100000111110000100000000101010101011101111110111000100000001010100000111111011010

$\endgroup$
  • $\begingroup$ If you have two seasons, for example, what would you say is a null hypothesis you might test? $\endgroup$ Commented Sep 25 at 1:38
  • $\begingroup$ This is much too general a question. You should start by plotting your data to see whether they seem to follow a seasonal pattern. If you edit the question to show us the plot and tell us something about the source of the data we may be able to help. $\endgroup$ Commented Sep 25 at 1:51
  • $\begingroup$ @Aruralreader I was thinking that if I took intervals from the two seasons, I could compute an odds ratio: for season A, the number of Trues over Falses, A(1)/A(0), divided by the same quantity for season B, B(1)/B(0), and then check whether the difference is statistically significant with some sort of t/z test. But then I'm confused about what to do after that. The null hypothesis would probably be that the probability is the same in both seasons, but with, e.g., square-root functions, how can I be confident that the difference in this ratio is significant? $\endgroup$ Commented Sep 25 at 1:53
  • $\begingroup$ There could be some slight pattern with period $6$. For example (counting from $1$), your data seems to have substantially more $1$s in positions $3,9,15,\ldots,1983$ than in positions $4,10,16,\ldots,1984$, namely $192$ against $138$, which you might describe as seasonality. $\endgroup$ Commented Oct 29 at 1:57
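
The period-$6$ comparison in the last comment can be turned into a test over all six phases at once: count the $1$s at each position modulo the candidate period and run a chi-square test of homogeneity on the resulting table. The sketch below is one way to do that in plain Python; `phase_counts` and `chi_square_homogeneity` are hypothetical helper names, and the demo input is synthetic rather than the posted data. Note that the code counts from $0$, so the comment's positions $3, 9, 15, \ldots$ correspond to phase index $2$.

```python
def phase_counts(bits, period):
    """Number of 1s and number of samples at each phase t % period (t from 0)."""
    ones = [0] * period
    totals = [0] * period
    for t, b in enumerate(bits):
        ones[t % period] += b
        totals[t % period] += 1
    return ones, totals

def chi_square_homogeneity(ones, totals):
    """Chi-square statistic for 'same P(1) at every phase'; returns (stat, df)."""
    n = sum(totals)
    p = sum(ones) / n
    if p == 0 or p == 1:
        raise ValueError("need both 0s and 1s in the data")
    stat = 0.0
    for o, m in zip(ones, totals):
        exp_ones, exp_zeros = m * p, m * (1 - p)
        stat += (o - exp_ones) ** 2 / exp_ones
        stat += ((m - o) - exp_zeros) ** 2 / exp_zeros
    return stat, len(totals) - 1

# Synthetic demo (NOT the posted data): an exact period-6 pattern.
demo_bits = [1, 1, 1, 0, 0, 0] * 20
ones, totals = phase_counts(demo_bits, period=6)
stat, df = chi_square_homogeneity(ones, totals)
```

For a period of $6$ there are $5$ degrees of freedom, and the usual $5\%$ critical value of the chi-square distribution is about $11.07$. If the period was picked after looking at the data, or many periods are tried, that threshold is too lenient and should be adjusted for the number of candidates examined.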
