Methodology, sources and coverage

We have estimated population for a total of 21,815 polity/year observations for 174 polities at their historical borders. We detail the sources and methods of estimation in Appendix I of Federico, G., Tena Junguito, A. (2023), How many people on earth? World population 1800-1938, Working Papers in Economic History 23-02; here we discuss only the general criteria.

First and foremost, the database includes all polities, even those for which there are no solid population data, only guesstimates or nothing at all. Some polities existed only for part of the period, and their series are correspondingly shorter (we report the start and end dates of these series in Appendix I). For instance, our database includes a series for Austria-Hungary until the dissolution of the empire in 1918 and then separate series for the successor states, Austria, Hungary, Poland, Czechoslovakia and the Kingdom of Serbs, Croats and Slovenes (later Yugoslavia), while Tyrol and Dalmatia are included in the Italian population. Referring to present-day (or 1995) borders would be ahistorical and would require additional information (and often guesses) on regional population, increasing the risk of errors.

Our estimates refer to the whole population, including native people, whom official sources often counted separately, as in Australia (Smith 1980), or omitted altogether, as in the United States censuses before 1890. Their omission would severely bias the results, because the share of natives in total population collapsed under the joint effect of the decrease in their absolute number and the immigration of white settlers (e.g. natives accounted for over 90% of the Australian population until 1825, a third in 1850 and a mere 2% on the eve of World War One). Series for white settlers only are therefore bound to underestimate total population at any specific moment in time and to overestimate its growth.

Following most historical and present-day censuses (Thorvaldsen 2018, p. 156), we prefer to estimate present (de facto) rather than resident (de jure) population because it minimizes the distortions from imperfect registration of domestic and international migrations. When possible, we compute mid-year population as the average of two consecutive end-year figures.
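
The mid-year calculation can be illustrated with a minimal sketch. It assumes that the mid-year figure for year t is the mean of the end-of-year figures for t-1 and t; the function name, the data layout and the numbers are hypothetical and are not taken from the database.

```python
def mid_year_population(end_year: dict[int, float]) -> dict[int, float]:
    """Return mid-year estimates from end-of-year figures.

    Assumption: the mid-year population of year t is the mean of the
    end-of-year figures for t-1 and t. Years lacking either figure
    are skipped rather than interpolated.
    """
    mid = {}
    for year in sorted(end_year):
        previous = year - 1
        if previous in end_year:
            mid[year] = (end_year[previous] + end_year[year]) / 2
    return mid


# Hypothetical end-of-year figures (thousands), for illustration only.
example = {1900: 1000.0, 1901: 1020.0, 1903: 1060.0}
result = mid_year_population(example)
print(result)  # {1901: 1010.0}
```

In this example no mid-year value is produced for 1903, because the end-of-year figure for 1902 is missing; this mirrors the "when possible" qualification above.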