It doesn't seem unusual to me. You write a test script that generates random numbers and saves them to foo.txt, then reads them back from foo.txt to run one kind of test. Then you wrap that in a loop for every kind of test. You end up testing everything, but foo.txt only has the numbers from the last test.
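Something like this minimal sketch in Python, where the test kinds and the run_test helper are made up for illustration:

    import random

    TEST_KINDS = ["roundtrip", "formatting", "dedup"]  # hypothetical test kinds

    def run_test(kind, numbers):
        # stand-in for whatever validation each test kind actually does
        assert all(isinstance(n, int) for n in numbers), kind

    for kind in TEST_KINDS:
        # fresh random inputs for this test kind...
        numbers = [random.randint(0, 10**9) for _ in range(1000)]
        # ...written to foo.txt, clobbering the previous cycle's data
        with open("foo.txt", "w") as f:
            f.write("\n".join(str(n) for n in numbers))
        # read back and test
        with open("foo.txt") as f:
            numbers_back = [int(line) for line in f]
        run_test(kind, numbers_back)

    # after the loop, foo.txt only holds the last test's numbers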
To be clear, I'm not saying it's good practice. But the context is that this was a one-off testing job, not a proper test framework, and for that kind of throwaway work a half-assed approach is reasonable.
True, I think I have something like that in multiple places for tests that are used to validate prod code. But our code won't lock people's devices down with no recourse.
I think what made me scratch my head is that they didn't consider that generating random phone numbers might lead to collisions. I've been hesitant to publish fake datasets containing randomly generated, validly formatted SIN/SSN numbers for exactly that reason.
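The birthday bound makes that risk concrete. A rough sketch, where the pool sizes are illustrative assumptions (~10^9 possible SSNs, ~10^7 phone numbers within a single area code):

    import math

    def collision_probability(draws, pool_size):
        # birthday-problem approximation: P(collision) ~ 1 - exp(-k(k-1) / 2n)
        return 1 - math.exp(-draws * (draws - 1) / (2 * pool_size))

    # a 50k-row fake SSN dataset already collides more often than not
    print(collision_probability(50_000, 10**9))  # ~0.71

    # phone numbers restricted to one area code collide even sooner
    print(collision_probability(5_000, 10**7))   # ~0.71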
Even if the collision risk is low, I'd still think keeping indefinite logs of the generated numbers would be a good idea.
Yes, the part about them overwriting the data from each cycle struck me as odd. I don't think that's good practice even when testing.
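And if you do want to keep the data, the fix is cheap: write each cycle to its own file instead of reusing one. A minimal sketch, with file naming that's just an illustration:

    import random
    from datetime import datetime, timezone

    for kind in ["roundtrip", "formatting", "dedup"]:  # hypothetical test kinds
        numbers = [random.randint(0, 10**9) for _ in range(1000)]
        # one file per test kind and run, so no cycle clobbers another
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
        with open(f"foo_{kind}_{stamp}.txt", "w") as f:
            f.write("\n".join(str(n) for n in numbers))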