Puzzling proportions | Science | EL PAÍS | EUROtoday

Last week’s puzzling random numbers prompted interesting comments from my readers. To the question “Why is it more likely that the number of inhabitants of a city begins with 1 than with 9?”, Jonathan Arnold answers (presumably from an English-speaking country, judging by his name and the absence of accents in his text):

“If we think about the population of an urban center of, for example, 800 or 900 inhabitants, it will be necessary to increase the population by 100 to 200 inhabitants to reach the figure of 1,000 inhabitants, a number that begins with a 1. From there, it will be necessary to increase the population by 1,000 inhabitants each time for the first digit to go from n to n+1.”
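One way to make this argument quantitative, offered as a sketch and not as the reader's own method, is to assume that a town grows by a fixed percentage per period (the 1% rate, the starting size of 100 and the function names below are all illustrative assumptions). Under that assumption, going from 1,000 to 2,000 means doubling, while going from 9,000 to 10,000 takes only about 11% growth, so the population spends far longer with a leading 1 than with a leading 9. The code compares the simulated share of time spent at each leading digit with the Benford-Newcomb value log10(1 + 1/d).

```python
import math
from collections import Counter


def benford(d: int) -> float:
    """Benford-Newcomb probability of leading digit d (1 to 9)."""
    return math.log10(1 + 1 / d)


def leading_digit_shares(start: float = 100.0, rate: float = 0.01,
                         steps: int = 20_000) -> dict:
    """Share of periods spent at each leading digit.

    Assumption (not from the article): the population grows by a
    constant percentage `rate` every period, starting from `start`.
    """
    counts = Counter()
    population = start
    for _ in range(steps):
        # Leading digit of the whole-number population.
        counts[int(str(int(population))[0])] += 1
        population *= 1 + rate
    return {d: counts[d] / steps for d in range(1, 10)}


if __name__ == "__main__":
    shares = leading_digit_shares()
    for d in range(1, 10):
        print(f"digit {d}: simulated {shares[d]:.3f}  Benford {benford(d):.3f}")
```

The simulation only illustrates the mechanism for a single, steadily growing town; real population data need not follow this model.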

And Xoaquín Fernández adds:

“If we apply Zipf’s law by population bands (from 10,000 to 99,999; from 100,000 to 999,999…) we could explain the result well. In each band, the number of towns in the first segment is greater than in the next. My explanation would emphasize agglomeration economies starting from a more balanced initial distribution of the population; it would be a cumulative process that selects some point, sometimes by chance, and reinforces it.”

Zipf’s law was formulated in the last century by the American linguist George K. Zipf, who applied statistical analysis to the study of various languages and found that the frequency with which words appear follows a pattern similar to that established by the Benford-Newcomb law (but that is another article).

For her part, Adelaida López brings up an interesting anecdote related to the topic at hand:

“There are simple statistical tricks to detect certain types of fraud. For example, Hill (the first to mathematically formalize Benford’s law) set his students the homework exercise of tossing a coin 200 times and recording when it came up heads and when it came up tails. The laziest, and the most inclined to cheat, did not bother to actually toss the coin 200 times and wrote down heads and tails at random in a fairly uniform manner, but it never occurred to any of them to write down heads or tails 6 times in a row, because intuitively they did not consider it likely that such long runs would occur, which is false when 200 real tosses are made. Because of that mistake, Hill detected the cheaters.”

In fact, the probability that 200 coin tosses will at some point produce 6 heads or 6 tails in a row is about 96% (can you calculate it?), so an overly even distribution of heads and tails was an (almost) sure sign of cheating.
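For readers who would rather check the figure with a computer than by hand, here is a minimal sketch of one possible calculation, assuming a fair coin and independent tosses. It tracks, toss by toss, the probability that no run of 6 has appeared yet and that the current run has a given length; the function name `prob_run_at_least` is only illustrative.

```python
from fractions import Fraction


def prob_run_at_least(n_flips: int, run_len: int) -> Fraction:
    """Exact probability that n_flips fair coin tosses contain a run of
    at least run_len identical results (all heads or all tails)."""
    if run_len <= 1:
        return Fraction(1) if n_flips >= 1 else Fraction(0)
    if n_flips < run_len:
        return Fraction(0)
    half = Fraction(1, 2)
    # alive[k] = probability that no qualifying run has occurred so far
    # and the current run has length k + 1 (k = 0 .. run_len - 2).
    alive = [Fraction(0)] * (run_len - 1)
    alive[0] = Fraction(1)  # after the first toss the run has length 1
    for _ in range(n_flips - 1):
        nxt = [Fraction(0)] * (run_len - 1)
        nxt[0] = half * sum(alive)  # next toss differs: run restarts at 1
        for k in range(run_len - 2):
            nxt[k + 1] = half * alive[k]  # next toss matches: run grows by 1
        # A match from the longest tracked state completes the run,
        # so that probability mass is dropped from `alive`.
        alive = nxt
    return 1 - sum(alive)


if __name__ == "__main__":
    p = prob_run_at_least(200, 6)
    print(f"P(run of 6 or more in 200 tosses) = {float(p):.4f}")
```

Using `Fraction` keeps the result exact; converting to `float` at the end gives a value that can be compared with the figure of about 96% quoted above.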

Counting words, surnames, people…

In future installments we will deal (this is not the royal we: I count, as usual, on the collaboration of my sagacious readers) with the aforementioned Zipf’s law (and the Pareto principle, to which it is closely related) and, as a warm-up, I propose the following exercise: choose any text of a certain length (a chapter of a book, a story, a long article…), write down the number of times the five most frequent words appear and try to establish a relationship between these frequencies. If onomastics attracts you more than linguistics, you can do the same with the five most frequent surnames. Or with any other set that lends itself to ordering its subsets by number of elements, such as the populations of the most populous cities in a country.
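Counting by hand is part of the fun, but the exercise can also be sketched in a few lines of code. The helper names `top_words` and `rank_table` are illustrative, and the tokenization (lowercasing and keeping only runs of letters) is a simplifying assumption. The last column, rank times count, is included because Zipf’s law in its usual form says frequency is roughly inversely proportional to rank, so that product should stay roughly constant if the text follows the law.

```python
import re
from collections import Counter


def top_words(text: str, k: int = 5) -> list:
    """Return the k most frequent words as (word, count) pairs."""
    # Simplifying assumption: a word is any run of letters, case ignored.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words).most_common(k)


def rank_table(text: str, k: int = 5) -> list:
    """Return (rank, word, count, rank * count) for the top k words."""
    return [(rank, word, count, rank * count)
            for rank, (word, count) in enumerate(top_words(text, k), start=1)]


if __name__ == "__main__":
    # Replace this placeholder with a chapter, a story or a long article.
    sample = "the cat and the dog and the bird saw the cat"
    for rank, word, count, product in rank_table(sample):
        print(f"{rank}. {word:<6} count={count}  rank*count={product}")
```

The same functions work for surnames or city names if the input is a list of names joined into one string, although a short sample like the placeholder above is far too small to show any regularity.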

https://elpais.com/ciencia/2024-09-27/proporciones-desconcertantes.html