
  • Published: 16 December 2021

MACROECOLOGY

Drivers of language loss

  • Claire Bowern

Nature Ecology & Evolution volume 6, pages 132–133 (2022)


  • Language and linguistics
  • Macroecology

A macroecological view suggests some global drivers of language endangerment and continuity, but a focus on individual languages will be important to stem the tide of language loss.



Author information

Authors and affiliations.

Department of Linguistics, Yale University, New Haven, CT, USA

Claire Bowern


Corresponding author

Correspondence to Claire Bowern.

Ethics declarations

Competing interests.

The author declares no competing interests.


About this article

Cite this article.

Bowern, C. Drivers of language loss. Nat Ecol Evol 6, 132–133 (2022). https://doi.org/10.1038/s41559-021-01621-x


Published: 16 December 2021

Issue Date: February 2022

DOI: https://doi.org/10.1038/s41559-021-01621-x


Open Access

Peer-reviewed

Research Article

Digital Language Death


Affiliation Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary

  • András Kornai

PLOS

  • Published: October 22, 2013
  • https://doi.org/10.1371/journal.pone.0077056

Of the approximately 7,000 languages spoken today, some 2,500 are generally considered endangered. Here we argue that this consensus figure vastly underestimates the danger of digital language death, in that less than 5% of all languages can still ascend to the digital realm. We present evidence of a massive die-off caused by the digital divide.

Citation: Kornai A (2013) Digital Language Death. PLoS ONE 8(10): e77056. https://doi.org/10.1371/journal.pone.0077056

Editor: Eduardo G. Altmann, Max Planck Institute for the Physics of Complex Systems, Germany

Received: February 13, 2013; Accepted: August 30, 2013; Published: October 22, 2013

Copyright: © 2013 András Kornai. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Work supported by OTKA grant #82333. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The author has declared that no competing interests exist.

Introduction

The biological metaphor of viewing languages as long-lived organisms goes back at least to Herder [1] , and has been clearly stated in The Descent of Man [2] :

The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel. (…) We find in distinct languages striking homologies due to community of descent, and analogies due to a similar process of formation. The manner in which certain letters or sounds change when others change is very like correlated growth. (…) Languages, like organic beings, can be classed in groups under groups; and they can be classed either naturally according to descent, or artificially by other characters. Dominant languages and dialects spread widely, and lead to the gradual extinction of other tongues.

While not without its detractors [3] , the biological metaphor has been widely accepted both in research concerning language death [4] , [5] and in guiding political action (see e.g. the United Nations Environment Programme Convention on Biological Diversity, [6] ). Here we investigate the phenomenon of digital ascent whereby languages enter the space of digitally mediated communication. We could extend the metaphor and talk about the digital hatching, pupation, or metamorphosis of languages, but would gain little by doing so, since we can only speculate about further, post-digital stages in the life cycle of languages.

In this paper, we bring the traditional methods of language vitality assessment to the digital realm. First we transfer the criteria themselves: instead of speaker population we look at the online population, instead of vigorous oral use we look at vigorous online use, and so forth, see Background (i)–(v). Second, we collect data from online sources that reveal the relevant variables or at least provide acceptable proxies for these, see Materials. Third, we introduce a four-way classification into digitally thriving (T), vital (V), heritage (H), and still (S) languages, roughly corresponding to the amount of digital communication that takes place in the language, and manually select prototypical seeds for these classes, see Methods. Finally, multinomial logistic classifiers are built on the seeds and are applied to the rest of the data, see Results. This four-stage method is shown to be robust, and remarkably independent of the manual choice of seeds, see Discussion. The Conclusions section interprets our main result, that the vast majority of the language population, over 8,000 languages, are digitally still, that is, no longer capable of digital ascent.

Background

A language may not be completely dead until the death of its last speaker, but there are three clear signs of imminent death observable well in advance. First, there is loss of function, seen whenever other languages take over entire functional areas such as commerce. Next, there is loss of prestige, especially clearly reflected in the attitudes of the younger generation. Finally, there is loss of competence, manifested by the emergence of ‘semi-speakers’ who still understand the older generation, but adopt a drastically simplified (reanalyzed) version of the grammar. The phenomenon has been extensively documented, e.g. in Menomini [7], Gaelic [8], and Dyirbal [9].

In the digital age, these signs of incipient language death take on the following characteristics. Loss of function performed digitally increasingly touches every functional area from day to day communication (texting, email) to commerce, official business, and so on. Loss of prestige is clearly seen in the adage If it’s not on the web, it does not exist , and loss of competence boils down to the ability of raising digital natives [10] in your own language. Digital ascent is the opposite process, whereby a language increasingly acquires digital functions and prestige as its speakers increasingly acquire digital skills.

Language endangerment and language death, in the traditional sense, are widely investigated and actively combated phenomena. The modern EGIDS classification [11] extends the Graded Intergenerational Disruption Scale (GIDS) of Fishman [12] to the following 13 categories: 0 International; 1 National; 2 Provincial; 3 Wider communication; 4 Educational; 5 Developing; 6a Vigorous; 6b Threatened; 7 Shifting; 8a Moribund; 8b Nearly Extinct; 9 Dormant; 10 Extinct. Categories 7–8b are considered endangered in the UNESCO Atlas of the World’s Languages in Danger [13], and categories 9–10 are considered extinct. Since these comprise only 17% of the world’s languages, with another 20% (category 6b) vulnerable, one may get the impression that the remaining 63% (these numbers are from [14]) of the world’s languages are more or less in good shape. While this may be true in the traditional sense, the main finding of our paper will be that the vast majority (over 95%) of languages have already lost the capacity to ascend digitally.
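The grouping of EGIDS levels described here can be written down as a small lookup. The following is only a sketch: the level labels and the 17% and 20% shares come from the text, while the function name `traditional_status` and the status strings are illustrative assumptions.

```python
# EGIDS levels as listed in the text, ordered from strongest to weakest.
EGIDS_LEVELS = {
    "0": "International", "1": "National", "2": "Provincial",
    "3": "Wider communication", "4": "Educational", "5": "Developing",
    "6a": "Vigorous", "6b": "Threatened", "7": "Shifting",
    "8a": "Moribund", "8b": "Nearly Extinct", "9": "Dormant", "10": "Extinct",
}

def traditional_status(level: str) -> str:
    """Coarse status of an EGIDS level (hypothetical helper, not from the paper)."""
    if level not in EGIDS_LEVELS:
        raise ValueError(f"unknown EGIDS level: {level!r}")
    if level in ("9", "10"):
        return "extinct"
    if level in ("7", "8a", "8b"):
        return "endangered"
    if level == "6b":
        return "vulnerable"
    return "not endangered"

# Shares quoted in the text: 17% endangered or extinct, 20% vulnerable.
remaining_share = 100 - 17 - 20
print(remaining_share)  # 63
```

The remaining 63% is exactly the group whose apparent good health the digital classification calls into question.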

Since digital(ized) data persists long after the last speaker is gone, we cannot simply equate failure to ascend with lack of online data. We will make a distinction between digital heritage status, where material is available for research and documentation purposes, but the language is not used by native speakers (L1) for communication in the digital world, and digitally still status, characterized by lack of even foreign user (L2) digital presence. It is of course very important to move languages from the still to the heritage stage, and there are significant efforts under way to bring data and metadata about languages online and to make both lexical resources and primary texts web-accessible, see the Materials section for an introduction to these. In the Results section we will see that such efforts, laudable as they are, actually contribute very little to the digital vitality of endangered languages. Just as the dodo is no less extinct for skeleta, drawings, or fossils being preserved in museums of natural history, online audio files of an elder tribesman reciting folk poetry will not facilitate digital ascent, and both still and heritage languages are digitally dead in the obvious sense of not serving the communication needs of a language community.

Digital ascent is a relatively new phenomenon, especially on the hundred-year timescale common in studies of language death. Digital communication was not an important arena of language functionality until the spread of electronic document creation in the 1970s; the internet and email in the 1980s; the web and blogging in the 1990s; wikis and text messaging (SMS) in the 2000s. Our approach will nonetheless be conservative inasmuch as we simply adapt the standard conceptual framework, and the standard yardsticks, to the digital domain. We will also try to be maximally conservative in the sense that we will interpret the evidence favorably wherever we can, so as to minimize false alarms. There are five confluent factors we consider: (i) the size and demographic composition of the language community; (ii) the prestige of the language; (iii) the identity function of the language; (iv) the level of software support; and (v) wikipedia. The last two may superficially look peculiar to the digital domain, but as we shall see, they are just convenient proxies for assessing a traditional yardstick, the functional spread of the language.

(i) Community size

The primary traditional measure of vitality is the size and generational composition of the language community. In the digital realm, what we are interested in is the number of digital natives in the language. Since the phenomenon is new, the demographics are highly favorable: once the language community starts creating content by sending text messages, writing blogs, and building wikis, we can reasonably expect that the younger generation will follow suit, especially as digital fora like Facebook are increasingly becoming a means for parents and grandparents to stay in touch with their children. Therefore, we need to assess only the size of the wired community separately, and can assume its demographic composition to be uniformly good.

State censuses generally address the question of linguistic and national identity, and tribe sizes are well known within the community, so it is generally not hard to get at least a rough order of magnitude estimate on the number of speakers. However, in and of itself a large and sustainable population cannot guarantee digital ascent – what we need to consider is population actively engaged in digitally mediated interaction . Passive consumption of digital material, especially digital material in an encroaching language, is irrelevant, if not actively harmful to the survival of a threatened language. Michael Krauss’ famous remark “Television is a cultural nerve gas…odorless, painless, tasteless. And deadly.” [15] applies to the web just as well.

Since neither the size of the digitally enabled population nor the digital suitability/prestige of the language is measured by censuses or other regular surveys, we must resort to proxies in assessing digital vitality. The real issue is the amount of digitally mediated communication that takes place in the language. Ideally, we should capture all videoconference (Skype), cellphone, Twitter, Facebook, etc. communication and measure the proportion of material in the language in question. Modern language technology has already solved the problem of language identification; the Crúbadán Project [16] actually builds such software for each language. As this technology in no way relies on understanding the contents, privacy concerns are minimized and the barriers to the direct measurement of digital language vitality are primarily organizational: we need to put safeguards in place to make sure that the data will be anonymized, that the people whose communications are monitored give their permission, and so forth. Until such a comprehensive study is conducted, we must use the publicly available textual material as our proxy – this has the advantage that all such material was put there knowingly by its authors, so concerns of privacy are resolved in advance. The size of online holdings (excluding wikipedia, see (v) below) was assessed by web crawling. Our methods are described in [17], and some of the results are made available for public download at http://hlt.sztaki.hu/resources/webcorpora.html .

(ii) Prestige

The second most important measure of vitality is prestige. Since digital communication is universally viewed as more prestigious than communication by traditional means, the intergenerational disruption actually acts in favor of digital ascent, provided the new generation has both the digital means and the interest in language use. In digitally vital languages this happens quite effortlessly and automatically, but languages the new generation no longer considers cool are caught in a pincer movement, with the old generation unable and unwilling to enter the digital world and the younger generation no longer considering the old language relevant. They may not be semi-speakers in the technical sense, as they retain full control over the grammar and vocabulary, but at the same time they may consider the language inappropriate for dealing with the digital realm. An almost laboratory pure example is provided by the two officially recognized varieties of Norwegian, Bokmål and Nynorsk. For many years, the two wikipedias were of roughly equal size, and the best estimates [18] put the proportion of language users at 7:1. By now, the Bokmål wikipedia is four times the size of the Nynorsk wikipedia, but Nynorsk is still in the top 50. With a sizeable population of speakers that enjoy a high standard of living, a nearly saturated personal computer market, and good access to broadband networks, based solely on census data and wikipedia statistics Nynorsk would appear a prime candidate for digital ascent. Yet crawling the .no domain demonstrates a striking disparity: we could find 1,620 m words (tokens) of Bokmål but only 26 m words in Nynorsk. Considering that official (government and local government) pages are published in both varieties, the actual proportion of user-generated Nynorsk content is well under 1%. In spite of a finely balanced official language policy propping up Nynorsk, the Norwegian population has already voted with their blogs and tweets to take only Bokmål with them to the digital age.
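The raw share behind this comparison is a one-line calculation. The sketch below uses only the two token counts quoted in the text; the variable names are illustrative. It gives the share of all crawled tokens, not the user-generated share, since the text gives no figure for the volume of official pages.

```python
bokmal_tokens_m = 1620   # million tokens of Bokmål found in the .no crawl
nynorsk_tokens_m = 26    # million tokens of Nynorsk found in the same crawl

# Nynorsk as a fraction of all Norwegian tokens found in the crawl.
nynorsk_share = nynorsk_tokens_m / (bokmal_tokens_m + nynorsk_tokens_m)
print(f"Nynorsk share of crawled tokens: {nynorsk_share:.2%}")  # 1.58%
```

Because official pages appear in both varieties and therefore inflate the Nynorsk count, the user-generated share must be lower than this 1.58%, which is consistent with the "well under 1%" estimate.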

The same phenomenon can be seen at the other side of the digital divide. As an example consider Mandinka, which is, besides Swahili, perhaps the single best known African language for the larger American audience, thanks to Alex Haley’s Roots. With 1.35 m speakers, and official status in two countries (Senegal and The Gambia), Mandinka is neither endangered nor threatened in the traditional sense – SIL puts its EGIDS rating at 5 (developing) and notes the positive attitude speakers of all ages have toward the language. However, its failure to digitally ascend appears a foregone conclusion: literacy in the language is below 1%, and the wikipedia incubator [19] has not attracted a single native speaker.

(iii) Identity function

As we will primarily rely on written material, particular care needs to be taken to distinguish passive (read only) web presence such as lexicons, classical literature, or news services, from active use in a broad variety of two-way contexts such as social networks, business/commerce, live literature, etc. Language is for communication, and passive presence indicates only efforts at preservation, often by scholars actually outside the language community, not digital vitality. As an example consider Classical Chinese, a language with a sizeable wikipedia, nearly 3,000 articles, and a remarkable user community of over 30,000 L2 users. There are also significant text holdings elsewhere (see in particular http://ctext.org ). At the same time, the top-level question in [11] , which probes the identity function of a language clearly puts Classical Chinese in the Historical/Heritage category, there defined as follows:

Historical.

The language has no remaining speakers and no community which associates itself with the language as a language of identity. There are no remaining functions assigned to the language by any group (…).

Heritage.

There are no remaining L1 speakers, but there may be some emerging L2 speakers or the language may be used for symbolic and ceremonial purposes only.

(iv) Functional domains

Initially, digital word processing was restricted to large organizations and printing presses, but with the spread of PCs, desktop publishing became available at the household level. Similarly, the function of making public announcements, until recently restricted to the village worthy, became available to individuals, who can post on bulletin boards or (micro)blog. Altogether, the digital age ushered in, or made more accessible, many forms of communication hitherto restricted to small elites, and this is undoubtedly one of its main attractions. But for a language to spread to these new or newly democratized functional areas, one generally needs a bit of software. (The main exception is cellphone usage, which we had to ignore in this study for lack of data.) To quantify software support we use a simple three-stage hierarchy, roughly analogous to the questions probing literacy status in EGIDS, see the Methods section.

(v) Wikipedia

Since digital ascent means active use of the language in the digital realm, we need to identify at least one active online community that relies on the language as its primary means of communication. There may be small bulletin boards, mailing lists, Yahoo, or Google groups scattered around, but experience shows that Wikipedia is always among the very first active digital language communities, and can be safely used as an early indicator of some language actually crossing the digital divide. The reason is that children, as soon as they start using computers for anything beyond gaming, become aware of Wikipedia, which offers a highly supportive environment of like-minded users, and lets everyone pursue a goal, summarizing human knowledge, that many find not just attractive, but in fact instrumental for establishing their language and culture in the digital realm. To summarize a key result of this study in advance: No wikipedia, no ascent.

The need for creating a wikipedia is quite keenly felt in all digitally ascending languages. This is clearly demonstrated by the fact that currently there are 533 proposals in incubator stage, more than twice the number of actual wikipedias. In fact, the desire to get a working wikipedia off the ground is so strong as to incite efforts at gaming the ranking system used by wikipedia, which sorts the various language editions at http://meta.wikimedia.org/wiki/List_of_Wikipedias simply by number of articles. The most blatant of these Potemkin wikipedias is #37, Volapük, which is based almost entirely on machine-generated geographic entries such as Kitsemetsa Kitsemetsa binon vilag in grafän: Lääne-Viru, in Lestiyän. Kitsemetsa topon videtü 58°55′ N e lunetü 26°19′ L. ‘Kitsemetsa is a village in Lääne-Viru County, in Estonia. It is at latitude 58°55′ N and longitude 26°19′ E.’ The Methods section discusses how the effects of such gaming can be removed.

Materials

All our data come from public repositories accessed between June 2012 and March 2013. A consolidated version of our main data table, 8,426 rows by 92 columns, is available as File S1. Here we provide only a brief overview of the main data sources, see File S2 for further details. The data are intended to cover the entire population of the world’s languages – some lacunae may remain, but internal consistency checks suggest that our coverage is over 95%.

The primary registry of data about the world’s languages, now charged with maintaining the ISO 639 standard for language codes, is the Ethnologue database of the Summer Institute of Linguistics (SIL International), see http://www.ethnologue.com . The latest (2012/02/28) publicly available version of the database distinguishes 7,776 languages, among them 376 that died since 1950 when SIL started to maintain the list.

We consulted several other sources, and our own dataset is larger by about 10% for the following reasons. First, we didn’t discard ancient/reconstructed languages such as Classical Chinese or Proto-Indo-European and artificial/constructed languages like Peano’s Interlingua (Latin Sine Flexione), which are by design out of scope for the Ethnologue. Second, our sources cover several languages that have only been recently discovered and have not yet completed the registry process: an example would be Bagata, a language spoken by one of the Scheduled Tribes in Andhra Pradesh. Third, we considered language groupings with online activity like Akan and Bihari irrespective of whether they meet the SIL criteria for ‘macrolanguage’. Whenever we encountered languages with no ISO code, and no code on the Linguist List (see http://linguistlist.org ), we generated a non-authoritative internal code that begins with xx so as to maintain unique identifiers suitable for joining rows from different sources. For less commonly taught languages, we generally mention the ISO code (three lowercase letters) because the language names themselves are often subject to considerable spelling variation. Altogether, we have 7,879 ISO codes (the number is larger than the size of the February 2012 dump because the site now provides codes for many newly registered languages), with the balance coming from other sources, to which we now turn.
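The role of the internal codes as join keys can be illustrated as follows. This is a sketch under stated assumptions: only the `xx` prefix comes from the text, while the numeric suffix format, the function names `assign_codes` and `join_sources`, and the example rows are hypothetical placeholders rather than values from the actual data table.

```python
from itertools import count

def assign_codes(names, known_codes, prefix="xx"):
    """Map each language name to its registered code or to a generated internal code.

    The 'xx' prefix follows the text; the three-digit suffix is an assumption.
    """
    counter = count(1)
    codes = {}
    for name in names:
        if name in known_codes:
            codes[name] = known_codes[name]
        else:
            codes[name] = f"{prefix}{next(counter):03d}"
    return codes

def join_sources(*sources):
    """Merge per-source rows (dicts keyed by language code) into one row per code."""
    merged = {}
    for source in sources:
        for code, row in source.items():
            merged.setdefault(code, {}).update(row)
    return merged

# Illustrative use: Akan has the ISO 639 code 'aka'; Bagata has no code yet.
codes = assign_codes(["Akan", "Bagata"], {"Akan": "aka"})
table = join_sources(
    {codes["Akan"]: {"has_wikipedia": True}},        # placeholder row from one source
    {codes["Akan"]: {"in_olac": True},               # placeholder rows from another
     codes["Bagata"]: {"in_olac": False}},
)
print(codes)
print(table)
```

Keying every source by a unique code, rather than by language name, avoids mismatches caused by the spelling variation mentioned above.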

Perhaps the best organized of these is the Open Language Archives Community, ‘an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources’, see http://www.language-archives.org . OLAC has some data for 7,478 of the 7,776 languages with ISO codes. Neither OLAC nor Wikipedia will consider languages without ISO code, so the lack of ISO status could in principle be a handicap for digital ascent. In practice, however, our conclusions can only be strengthened by the inclusion of these unregistered languages since they are already at the margin, with EGIDS level 6b or worse, while failure to ascend affects many languages at EGIDS level 4 or even better.

The last source aiming at encyclopedic completeness is the Endangered Languages Project hosted at http://www.endangeredlanguages.com which consolidates data from the Catalogue of Endangered Languages (ELCat), produced by the University of Hawai’i at Manoa, and The Institute for Language Information and Technology (The Linguist List) at Eastern Michigan University. We accessed the database on 2013/03/15, when it contained data for 3,175 languages. ELP uses a different scale of vitality, with categories critically endangered; severely endangered; endangered; threatened; and vulnerable, which correlate well with the higher EGIDS categories but are independently assessed. Since ELP considers vital languages (which are generally EGIDS 6a or less) out of scope, the fact that a language has no ELP page is generally a good sign.

Less encyclopedic, but very relevant to our purposes, is the website of the Crúbadán Project, see http://borel.slu.edu/crubadan , which collects language data for endangered languages on the web. Version 2 covered 1,322 languages on 2013/03/15, when we accessed the data; Version 1 started with 1,003 in 2006. The Crúbadán Project, quite independent from us, but consistent with our methodology, chose not to harvest material from closed archives such as the Rosetta Project (see http://rosettaproject.org ) or metainformation such as the grammatical features collected in The World Atlas of Language Structures (see http://wals.info ), since these are in no way indicative of digital use by native speakers.

Another highly relevant website is Omniglot, ‘the online encyclopedia of writing systems and languages’, see http://www.omniglot.com . Literacy in the traditional sense is a clear prerequisite of digital literacy, and languages without mature writing systems are unlikely to digitally ascend. Note that there are only 696 languages listed in Omniglot, and many of these are ancient or constructed languages without a live community. Even more relevant to our purposes is the level of support for computer-mediated activity in a given language. Here our basic data comes from inspecting Microsoft and Apple products for two levels of language support: input and OS. Input-level support means the availability of some specific method, such as Kotoeri for Japanese, to enter text in the writing system used for the language. Without an input method, digital ascent is impossible, but the converse unfortunately does not hold: the existence of some input method by no means guarantees an easy way to create text in the language, let alone vigorous digital language use. OS-level support means that all interaction conveyed by the operating system, such as text in dropdown menus or error messages, is provided in the language in question.

There are many languages with standard input methods but no standardized orthography, and the next step up the digital ladder is a spellchecker. The Crúbadán Project also considers this a relevant factor, and lists explicitly whether a Free/Libre Open Source Software (FLOSS) spellchecker exists. We also looked at HunSpell (the largest family of FLOSS spellcheckers, see [20] ) for each language, and assessed its coverage by computing the percentage of words it recognizes in the wikipedia dump. Any number below 50% indicates the spellchecker is not mature.
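The coverage measure can be sketched as follows. Note the assumption: a real HunSpell dictionary applies affix rules, whereas this illustration uses a plain set lookup as a stand-in. The function names and the toy token list are hypothetical; only the 50% threshold comes from the text.

```python
def spellchecker_coverage(tokens, dictionary):
    """Return the percentage of corpus tokens that the word list recognizes."""
    if not tokens:
        raise ValueError("empty corpus")
    known = sum(1 for token in tokens if token.lower() in dictionary)
    return 100.0 * known / len(tokens)

def is_mature(coverage_percent, threshold=50.0):
    """Apply the 50% rule of thumb: anything below the threshold is not mature."""
    return coverage_percent >= threshold

# Toy example: three of the four tokens are in the word list.
toy_tokens = ["The", "cat", "zzq", "sat"]
toy_dictionary = {"the", "cat", "sat"}
coverage = spellchecker_coverage(toy_tokens, toy_dictionary)
print(coverage, is_mature(coverage))  # 75.0 True
```

In practice the tokens would come from the wikipedia dump of the language in question, and the lookup would be delegated to the spellchecker itself.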

Standardized orthography enables not just collective works like Wikipedia, itself an important indicator of digital vitality, but also the creation of larger documents. Again, the Crúbadán Project considers this a relevant factor, and lists whether the Bible and the Universal Declaration of Human Rights (UDHR) are available online. Collecting larger corpora, the lifeblood of modern language technology efforts, also requires standardized spelling. The relationship of digital language vitality and more sophisticated tools of modern computational linguistics such as parsers, speech and optical character recognition software, information extraction, and machine translation tools will be discussed in the next section.

Methods

The EGIDS scale already comes with a clear notion of ascent, from oral use only (category 6) to acquiring literacy (5) and ‘vigorous oral use (…) reinforced by sustainable literacy’ (4). Further steps up the traditional scale are predicated on the level of (official) use: ‘used in work and mass media without official status to transcend language differences across a region’ (3); ‘used in education, work, mass media, and government within officially recognized regions of a nation’ (2); ‘used in education, work, mass media, and government at the nationwide level’ (1); and ‘widely used between nations in trade, knowledge exchange, and international policy’ (0) [21]. In the digital realm, it is also literacy that provides the pivotal step, and we begin by describing the main stages of acquiring it.

Stage one is some kind of locale or i18n (computer shorthand for ‘internationalization’) support that enables the input (writing) and output (reading) of native characters. On the whole the Unicode standard, already covering more than a hundred scripts and with a well-established mechanism for adding new ones, provides a solid basis for bringing any language to the digital age, as long as it is written (signed languages will be discussed separately). When a language is listed in Omniglot, we can assume it is past stage one. A weaker condition is the availability of online text in OLAC, a stronger condition would be the availability of an input method.

For the second stage we need a variety of word-level tools such as dictionaries, stemmers, and spellcheckers. Here support is more spotty – even the most broadly used tool, HunSpell [20], is available for only 129 languages (see http://hlt.sztaki.hu/resources/hunspell ). In spite of the uneven coverage and quality of these tools, they already represent a level of maturity that is very hard to match by an underresourced language. This is because spellcheckers enforce the unified literary standard of a koiné, with significant suppression of individual and dialectal variation. This stage was reached by English only in the 15th century (primarily as a result of the efforts of William Caxton), and many of the languages discussed here have neither undergone the painful process of koiné formation driven by internal needs nor want it to be imposed on them externally [22].

The third stage requires phrase- and sentence-level tools that can only be built on some preexisting character- and word-level standard, such as part-of-speech taggers, named entity recognizers, chunkers, speech recognition, and machine translation. In the tables presented at http://www.meta-net.eu/whitepapers/key-results-and-cross-language-comparison not even English has ‘Excellent’ support in these higher areas, which are key to avoiding long-term function loss. We surveyed Google Translate to probe this increasingly important area of functionality, but we emphasize here that stage three has more to do with the line between our top two categories, thriving (T) and vital (V), while our primary concern is with the gap between vital and still (S) languages. We have not surveyed speech and character recognition software, not because they are any less important, but because their quality still improves at a fast pace, and languages that lack these today may well acquire them in a hundred years.

Let us now describe the resolution of the classification system proposed here. In contrast to the 8 categories used in GIDS and the 13 used in EGIDS, we will identify only four classes of languages, which we call digitally Thriving (T), Vital (V), Heritage (H), and Still (S), roughly corresponding to the volume of active language use in the digital realm. Accordingly, the decision tree presented in Fig. 1 of [11] will be drastically simplified: we will have a major decision, whether a language is actively used in the digital realm, and two supplementary distinctions. The primary goal of our work is to investigate the dead/alive distinction in the digital domain, with the finer distinctions between degrees of ascent (vital versus thriving) and degrees of death (still versus heritage) seen as secondary.


https://doi.org/10.1371/journal.pone.0077056.g001


The method we follow here allows for discovery: we take some clear, prototypical examples from each class, and use a standard machine learning technique, maximum entropy classification (multinomial logistic regression) [23], [24], to create a classifier that reproduces these seeds. Once the model is trained, we use it to classify the rest of the population. This way, not only the thresholds themselves, but the intrinsic error of threshold-based classification can be investigated based on the data. Further, we can check the effectiveness of the method both by internal criteria, such as the quality of the resulting classifier and its robustness under perturbation of the seeds, and by external criteria, such as comparison with other classification/clustering techniques.
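The seed-and-classify procedure can be illustrated with a minimal sketch, assuming a pure-Python multinomial logistic regression trained by batch gradient descent. The two features, their scaled values, the class names, and the training settings below are hypothetical illustrations, not the data or the software used in this study.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, x):
    """Linear score of feature vector x under each class (last weight is the bias)."""
    row = list(x) + [1.0]
    return [sum(w * v for w, v in zip(wc, row)) for wc in W]

def train_maxent(X, y, n_classes, epochs=2000, lr=0.5):
    """Fit multinomial logistic regression by plain batch gradient descent."""
    width = len(X[0]) + 1
    W = [[0.0] * width for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * width for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            probs = softmax(class_scores(W, xi))
            row = list(xi) + [1.0]
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == yi else 0.0)
                for j, v in enumerate(row):
                    grad[c][j] += err * v
        for c in range(n_classes):
            for j in range(width):
                W[c][j] -= lr * grad[c][j] / len(X)
    return W

def predict(W, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, x)
    return max(range(len(scores)), key=lambda c: scores[c])

# Hypothetical seeds: (scaled speaker count, scaled wikipedia size), both in [0, 1].
STILL, VITAL = 0, 1
seed_X = [(0.40, 0.00), (0.50, 0.05), (0.55, 0.02),
          (0.65, 0.60), (0.70, 0.70), (0.60, 0.55)]
seed_y = [STILL, STILL, STILL, VITAL, VITAL, VITAL]

W = train_maxent(seed_X, seed_y, n_classes=2)
# Once trained on the seeds, the same model labels the rest of the population.
unlabelled = [(0.52, 0.03), (0.65, 0.625)]
labels = [predict(W, x) for x in unlabelled]
```

Checking that `predict` reproduces `seed_y` on `seed_X` is the internal criterion mentioned above; perturbing the seed list and retraining is one way to probe robustness.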

Part of the simplification relative to EGIDS comes from the favorable demographics discussed above. For the traditional case, EGIDS makes an important distinction based on the last generation that has some proficient speakers: if these are the children, the language is threatened (category 6b); if the parents, the language is shifting (7); if the grandparents, it is moribund (8a); and if the great-grandparents, it is nearly extinct (8b). In the digital case, once some speakers transition to the digital realm, their children and grandchildren automatically do so, and we feel justified in collapsing the higher numbers in EGIDS into a single category S. We also feel justified in collapsing the lowest numbers, 0 to 3, into a single category T, in that the questions EGIDS probes, whether a language has international, national, or regional scope, and whether it is official, make less sense in the digital realm that is by design international and unofficial.

As the examples of Classical Chinese, Sanskrit, or Latin show, even extinct languages can be digitally better resourced than many languages that are thriving in the traditional sense but digitally impoverished. We will use the H category to account for those languages that are digitally archived, but not used for communication by native speakers. Their digital presence is read-only, maintained by scholars. Wikipedia is supportive of heritage maintenance, but newly created wikipedias of extinct languages go to Wikia (the old ones are grandfathered and stay on Wikipedia proper). Since digital archives are here to stay, once a language has acquired heritage status it cannot lose it, and the global tide of digitization will hopefully move many languages from the still (lacking detectable digital presence) to the heritage (detectable but read-only digital presence) category. This movement, however, should not be mistaken for actual vitalization: as far as actual two-way communication in the language is concerned, both categories are digitally dead. The classical studies of language death lay down one absolutely unbreakable rule: no community, no survival. As Darwin, quoting Lyell, already notes, “A language, like a species, when once extinct, never (…) reappears.” Modern Hebrew, a language viable both in the standard and in the digital sense, does not constitute a counterexample, inasmuch as neither its vocabulary nor its structure comes close to that of medieval Hebrew. As a matter of fact, new languages can be produced by children from unstructured input in a single generation [25], [26], but Modern Hebrew is best viewed as a representative of the main path of new language emergence, creole formation [27], [28].


Other than converting the nominal classifications to numerical values (e.g. EGIDS class 6a ‘vigorous’ to 6.0; 6b ‘threatened’ to 6.5; and 7 ‘shifting’ to 7.0) and applying a log transform to those fields (such as number of speakers or wikipedia size) that cover many orders of magnitude, we performed only two nontrivial data transformations. First, to control for the fact that the same number of (multibyte) characters will contain different amounts of information depending on writing system, we computed the character entropy of the language, and used it as a normalizing factor: for example, one Chinese character corresponds to about four Dutch characters, an effect quite visible if one compares the character counts of the same document, such as the UDHR or the Bible, in different languages. Second, in order to remove the effects of machine-generated wikipedia entries, we only considered those wikipedia pages to be ‘real’ that contain at least one paragraph with the equivalent of 450 German characters; pages that had less information were declared ‘fake’.
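The entropy normalization can be sketched as follows. The text does not say which entropy estimator was used, so this sketch assumes the simplest one, the unigram (single-character frequency) entropy; the function names are illustrative only.

```python
import math
from collections import Counter

def char_entropy(text):
    """Unigram character entropy of a text sample, in bits per character.

    Assumption: a plain frequency-based estimate; the estimator actually
    used in the study is not specified in the text.
    """
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def normalized_size(char_count, entropy, baseline_entropy):
    """Express a raw character count in baseline-language character equivalents."""
    return char_count * entropy / baseline_entropy

# A script whose characters carry four times the information of the baseline
# counts four baseline characters per character.
equivalent = normalized_size(100, entropy=8.0, baseline_entropy=2.0)
```

In practice the entropy would be estimated from a sizeable sample of the same document in each language, such as the UDHR mentioned above, so that the comparison is not distorted by differences in content.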

German was chosen as a baseline both because the German wikipedia is known to be high quality, and because before the adjustment it had the highest real ratio, defined as the number of ‘real’ pages divided by the total page count. After the adjustment it became clear that several wikipedias, such as Gujarati and Hebrew, have higher real ratios, but this does not affect our argument in that the same threshold could be expressed in Gujarati or Hebrew characters just as well. We define adjusted wikipedia size as the entropy-normalized total character count of real pages. The adjustment in most cases shrinks the wikipedia by less than a third, and in some cases such as Czech (real ratio 0.53) actually increases the size. Volapük, ranked 37th by article count, is ranked 163rd by adjusted wikipedia size.
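A minimal sketch of the real ratio and the adjusted wikipedia size, under stated assumptions: each page is represented as a list of paragraph lengths in characters, and the German-equivalent length of a paragraph is its length scaled by the ratio of the language's character entropy to that of German. The data layout and function names are hypothetical.

```python
def is_real(paragraph_lengths, entropy, german_entropy, threshold=450):
    """A page is 'real' if some paragraph reaches the German-equivalent threshold."""
    scale = entropy / german_entropy
    return any(n * scale >= threshold for n in paragraph_lengths)

def real_ratio(pages, entropy, german_entropy):
    """Number of 'real' pages divided by the total page count."""
    real = [p for p in pages if is_real(p, entropy, german_entropy)]
    return len(real) / len(pages)

def adjusted_size(pages, entropy, german_entropy):
    """Entropy-normalized total character count of the 'real' pages only."""
    scale = entropy / german_entropy
    return sum(sum(p) * scale
               for p in pages if is_real(p, entropy, german_entropy))

# Four toy pages in a language with the same entropy as German (scale 1.0).
pages = [[500, 20], [100], [449, 449], [450]]
ratio = real_ratio(pages, entropy=4.0, german_entropy=4.0)
size = adjusted_size(pages, entropy=4.0, german_entropy=4.0)
```

Here two of the four pages qualify, so `ratio` is 0.5 and `size` counts only the 970 characters on those two pages. With a scale above 1, the same arithmetic shows how the adjustment can increase the size of a wikipedia, as noted for Czech.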


Preliminary results of the classification were disappointing: only about 40% correct, as tested by 10-fold crossvalidation. However, as soon as we realized that some parameters like L1 and L2 span many orders of magnitude, and switched to logarithms for these as discussed in Methods (for a complete list, see File S2), classification performance improved markedly, with results now in the 85–100% range (see Table 1). Since random performance would be about 50% in a 2-way classification task, the fact that the 2-way results are in the 95–100% range already shows that the classes were established in a coherent fashion. It is evident from Table 1 that the 3-way task obtained by merging the live languages is easier than the 3-way task obtained by merging the dead languages.
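The two ingredients of this paragraph, the log transform and 10-fold crossvalidation, can be sketched as below. The use of log10(1 + x), the strided fold assignment, and the `fit`/`predict` callables are assumptions made for illustration; the text only states that logarithms and 10-fold crossvalidation were used.

```python
import math

def log_feature(value):
    """Compress a count spanning many orders of magnitude.

    Assumption: log10(1 + x), so that a count of zero stays finite.
    """
    return math.log10(1 + value)

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold i holds out items i, i+k, i+2k, ..."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test

def cross_validate(X, y, fit, predict, k=10):
    """Fraction of items labelled correctly when each is held out exactly once."""
    correct = 0
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[j] for j in train], [y[j] for j in train])
        correct += sum(predict(model, X[j]) == y[j] for j in test)
    return correct / len(X)

# Toy baseline: always predict the majority class seen in training.
def fit_majority(X, y):
    return max(set(y), key=y.count)

def predict_majority(model, x):
    return model

speakers = [10, 1_000, 100_000, 10_000_000]
X = [[log_feature(s)] for s in speakers * 5]
y = [0] * 15 + [1] * 5
baseline_accuracy = cross_validate(X, y, fit_majority, predict_majority)
```

Any classifier exposing the same `fit` and `predict` shape could be dropped into `cross_validate` in place of the majority baseline.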


https://doi.org/10.1371/journal.pone.0077056.t001

Maxent models are defined by feature weights. Those features that contribute little to the classification have small weights (in absolute value); those that contribute a lot have greater values. Remarkably, the performance of our classifiers, originally built on 33 features (for a complete list see File S2), improves markedly if we drop those features that contribute little and retrain on the rest. Automated feature selection is a standard technique in machine learning, where it is used mostly to improve training speeds and generalization [29]. Here it has the further advantage of defending the system from a charge of arbitrariness: why did we use the Crúbadán definition of FLOSS spellchecker rather than the HunSpell list? The answer is that it doesn’t matter, since feature selection will automatically decide which, if any, of these will be used.
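One simple way to drop low-contribution features is to rank them by the largest absolute weight they receive in any class and keep the top few before retraining. This is a sketch of that idea only; the exact selection procedure is described in [29], and the feature names and weights below are invented for illustration.

```python
def select_features(weights, names, keep):
    """Keep the `keep` features with the largest absolute weight in any class.

    weights: one list of per-feature weights for each class.
    names:   feature names, in the same order as the weights.
    Returns the surviving names in their original order.
    """
    importance = [max(abs(w[j]) for w in weights) for j in range(len(names))]
    ranked = sorted(range(len(names)), key=lambda j: importance[j], reverse=True)
    return [names[j] for j in sorted(ranked[:keep])]

# Hypothetical weights for a 2-class model over three features.
weights = [[2.0, -0.01, 0.5],
           [-1.5, 0.02, -0.7]]
names = ["egids", "olac_texts", "wiki_size"]
kept = select_features(weights, names, keep=2)
```

After selection the model would be retrained on the surviving columns only, and its crossvalidated accuracy compared with that of the full model.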

Unsurprisingly, the best predictor of digital status was the traditional status. The feature encoding the EGIDS assessments by SIL experts was selected in all models; the feature encoding the Endangered Languages Project assessments was selected in all but one. The next best set of features indicated the quality of the wikipedia, followed by the number of L1 speakers, the size of the Crúbadán crawl, the existence of FLOSS spellcheckers, and the number of online texts listed in OLAC. This last feature, currently our best proxy for the intensity of the heritage conservation effort, has been selected in fewer than 5% of the cases, and when selected, has only 20% of the weight of the leading feature on average, clearly demonstrating that conservation has negligible impact on digital ascent.


The distribution is sharply bimodal, with only 1.7% of the data in the middle, but this is to be expected from votes obtained from classifiers built to detect the same classes. The classifiers, both individually and collectively, identify a vast class of digitally dead languages that subsumes over 96% of our entire data.
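How votes from several classifiers might be pooled into a three-way outcome can be sketched as follows. The cut-offs of 0.25 and 0.75 and the status labels are hypothetical; the text does not give the thresholds that separate the two modes from the middle of the distribution.

```python
def vote_share(votes):
    """Fraction of classifiers that voted 'alive' (1) rather than 'dead' (0)."""
    return sum(votes) / len(votes)

def status_from_votes(votes, low=0.25, high=0.75):
    """Pool binary votes; the `low` and `high` cut-offs are illustrative assumptions."""
    share = vote_share(votes)
    if share >= high:
        return "ascending"
    if share <= low:
        return "dead"
    return "borderline"

# Hypothetical votes from four classifiers for three languages.
examples = {
    "language_a": [1, 1, 1, 1],
    "language_b": [0, 0, 0, 1],
    "language_c": [1, 1, 0, 0],
}
statuses = {name: status_from_votes(v) for name, v in examples.items()}
```

Because the classifiers are built to detect the same classes, most languages receive near-unanimous votes, which is what produces the bimodal shape described above.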

We emphasize that this massive die-off is not some future event that could, by some clever policies, be avoided or significantly mitigated – the deed is already done. We have identified a small group of about 170 languages (2%) that are ascending, or have already ascended, to the digital realm, and perhaps there is some hope for the 140 ‘borderline’ languages (1.7%) in the middle, a matter we shall discuss in the concluding section.

While the sheer magnitude of the failure to ascend is clear from the preceding, it would make no sense to declare some borderline language vital or still based on the result of any single classifier. Such individual judgment could only be made based on specific facts about the language in question, facts that need not be encoded in our dataset, and we see many examples of languages whose digital future is unwritten. That said, we can still demonstrate that the overall picture is remarkably robust under changes to the details of our method.

Because vital languages already have their survival assured, while heritage preservation is still very much an uphill battle, we looked more closely at 3-way classifiers that distinguish heritage from still, but not thriving from vital. The best S-H-VT models discussed so far utilize 6–8 features, and have a precision of 97.1–100% based on 10-fold crossvalidation. To test robustness we randomized seed selection in the following manner.


Dot size shows real ratio, color shows status: Thriving (T) dark green; Vital (V) light green; Heritage (H) blue; Still (S) black; Borderline (B) red. See main text for definitions, File S1 for underlying data.

https://doi.org/10.1371/journal.pone.0077056.g002


We emphasize that the 162 borderline languages, plotted in red, are not classified ‘borderline’ but rather indicate the uncertainties inherent in the classification. The statistical summaries in Table 2 include these as well for the sake of completeness, but are not explained here, as these pertain to the margins rather than to true class averages.


https://doi.org/10.1371/journal.pone.0077056.t002


There are 307 still languages, plotted in black, where no digital natives can be raised. The average number of speakers is 0.7 m, still quite sizeable, but the wikipedias are mostly incubators, essentially empty after adjustment. A typical example is Kanuri (kau), with main dialects Tumari (krt), Manga (kby), and Beriberi (knc), with EGIDS status 6a, 5, and 3 respectively. With vigorous language use, radio and TV broadcasts in the language, and a total of 3.76 m speakers, the language, at least the Central (Beriberi) dialect, is not on anybody’s radar as endangered: to the contrary, there are only 337 languages with EGIDS 3 or better. Yet the wikipedia was closed for lack of native language content and community, and the Crúbadán crawl lists three documents for less than 5,000 words total. The average EGIDS rating is 6.04, and the majority of the world’s languages are within one sigma of this value, consistent with our assessment that the majority of the world’s languages are digitally still.

Conclusions

We have machine classified the world’s languages as digitally ascending (including all vital, thriving, and borderline cases) or not, and concluded, optimistically, that the former class is at best 5% of the latter. Broken down to individual languages and language groups the situation is quite complex and does not lend itself to a straightforward summary. In our subjective estimate, no more than a third of the incubator languages will make the transition to the digital age. As the example of the erstwhile Klingon wikipedia (now hosted on Wikia) shows, a group of enthusiasts can do wonders, but it cannot create a genuine community. The wikipedia language policy (https://meta.wikimedia.org/wiki/Language_proposal_policy), demanding that “at least five active users must edit that language regularly before a test project will be considered successful”, can hardly be more lenient, but the actual bar is much higher. Wikipedia is a good place for digitally-minded speakers to congregate, but the natural outcome of these efforts is a heritage project, not a live community.

A community of wikipedia editors that work together to anchor to the web the culture carried by the language is a necessary but insufficient condition of true survival. By definition, digital ascent requires use in a broad variety of digital contexts. This is not to deny the value of heritage preservation, for the importance of such projects can hardly be overstated, but language survival in the digital age is essentially closed off to local language varieties whose speakers have at the time of the Industrial Revolution already ceded both prestige and core areas of functionality to the leading standard koinés, the varieties we call, without qualification, French, German, and Italian today.

A typical example is Piedmontese, still spoken by some 2–3 m people in the Torino region, and even recognized as having official status by the regional administration of Piedmont, but without any significant digital presence. More closed communities perhaps have a better chance: Faroese, with fewer than 50 k speakers, but with a high quality wikipedia, could be an example. There are glimmers of hope, for example [31] reported 40,000 downloads for a smartphone app to learn West Flemish dialect words and expressions, but on the whole, the chances of digital survival for those languages that participate in widespread bilingualism with a thriving alternative, in particular the chances of any minority language of the British Isles, are rather slim.

In rare cases, such as that of Kurdish, we may see the emergence of a digital koiné in a situation where today separate Northern (Kurmanji), Central (Sorani), and Southern (Kermanshahi) versions are maintained (the latter as an incubator). But there is no royal road to the digital age. While our study is synchronic only, the diachronic path to literacy and digital literacy is well understood: it takes a Caxton, or at any rate a significant publishing infrastructure, to enforce a standard, and it takes many years of formal education and a concentrated effort on the part of the community to train computational linguists who can develop the necessary tools, from transliterators (such as already powering the Chinese wikipedia) to spellcheckers and machine translation for their language. Perhaps the most remarkable example of this is Basque, which enjoys the benefits of a far-sighted EU language policy, but such success stories are hardly, if at all, relevant to economically more blighted regions with greater language diversity.

The machine translation services offered by Google are an increasingly important driver of cross-language communication. As expected, the first several releases stayed entirely in the thriving zone, and to this day all language pairs are across vital and thriving languages, with the exception of French–Haitian Creole. Were it not for the special attention DARPA, one of the main sponsors of machine translation, devoted to Haitian Creole, it is doubtful that we would have any MT aimed at this language. There is no reason whatsoever to suppose the Haitian government would have, or even could have, sponsored a similar effort [32]. Be that as it may, Google Translate for any language pair currently likes to have gigaword corpora in the source and target languages and about a million words of parallel text. For vital languages this is not a hard barrier to cross. We can generally put together a gigaword corpus just by crawling the web, and the standardly translated texts form a solid basis for putting together a parallel corpus [33]. But for borderline languages this is a real problem, because online material is so thinly spread over the web that we need techniques specifically designed to find it [16], and even these techniques yield only a drop in the bucket: instead of the gigaword monolingual corpora that we would need, the average language has only a few thousand words in the Crúbadán crawl. To make matters worse, the results of this crawl are not available to the public for fear of copyright infringement, yet in the digital age what cannot be downloaded does not exist.

The digital situation is far worse than the consensus figure of 2,500 to 3,000 endangered languages would suggest. Even the most pessimistic survey [34] assumed that as many as 600 languages, 10% of the population, were safe, but reports from the field increasingly contradict this. For British Columbia, [35] writes:

Here in BC, for example, the prospect of the survival of the native languages is nil for all of the languages other than Slave and Cree, which are somewhat more viable because they are still being learned by children in a few remote communities outside of BC. The native-language-as-second-language programs are so bad that I have NEVER encountered a child who has acquired any sort of functional command (and I don’t mean fluency - I mean even simple conversational ability or the ability to read and understand a fairly simple paragraph or non-ritual bit of conversation) through such a program. I have said this publicly on several occasions, at meetings of native language teachers and so forth, and have never been contradicted. Even if these programs were greatly improved, we know, from e.g. the results of French instruction, to which oodles of resources are devoted, that we could not expect to produce speakers sufficiently fluent to marry each other, make babies, and bring them up speaking the languages. It is perfectly clear that the only hope of revitalizing these languages is true immersion, but there are only two such programs in the province and there is little prospect of any more. The upshot is that the only reasonable policy is: (a) to document the languages thoroughly, both for scientific purposes and in the hope that perhaps, at some future time, conditions will have changed and if the communities are still interested, they can perhaps be revived then; (b) to focus school programs on the written language as vehicle of culture, like Latin, Hebrew, Sanskrit, etc. and on language appreciation. Nonetheless, there is no systematic program of documentation and instructional efforts are aimed almost entirely at conversation.

Cree, with a population of 117,400 (2006), actually has a wikipedia at http://cr.wikipedia.org but the real ratio is only 0.02, suggestive of a hobbyist project rather than a true community, an impression further supported by the fact that the Cree wikipedia has gathered fewer than 60 articles in the past six years. Slave (3,500 speakers in 2006) is not even in the incubator stage. This is to be compared to the over 30 languages listed by the Summer Institute of Linguistics for BC. In reality, there are currently fewer than 250 digitally ascending languages worldwide, and about half of the borderline cases are like Moroccan Arabic (ary), low prestige spoken dialects of major languages whose signs of vitality really originate with the high prestige acrolect. This suggests that in the long run no more than a third of the borderline cases will become vital. One group of languages that is particularly hard hit is the 120+ signed languages currently in use. Aside from American Sign Language, which is slowly but steadily acquiring digital dictionary data and search algorithms [36], it is perhaps the emerging International Sign [37] that has the best chances of survival.

There could be another 20 spoken languages still in the wikipedia incubator stage or even before that stage that may make it, but every one of these will be an uphill struggle. Of the 7,000 languages still alive, perhaps 2,500 will survive, in the classical sense, for another century. With only 250 digital survivors, all others must inevitably drift towards digital heritage status (Nynorsk) or digital extinction (Mandinka). This makes language preservation projects such as http://www.endangeredlanguages.com even more important. To quote from [6] :

Each language reflects a unique world-view and culture complex, mirroring the manner in which a speech community has resolved its problems in dealing with the world, and has formulated its thinking, its system of philosophy and understanding of the world around it. In this, each language is the means of expression of the intangible cultural heritage of people, and it remains a reflection of this culture for some time even after the culture which underlies it decays and crumbles, often under the impact of an intrusive, powerful, usually metropolitan, different culture. However, with the death and disappearance of such a language, an irreplaceable unit in our knowledge and understanding of human thought and world-view is lost forever.

Unfortunately, at a practical level heritage projects (including wikipedia incubators) are haphazard, with no systematic programs of documentation. Resources are often squandered, both in the EU and outside, on feel-good revitalization efforts that make no sense in light of the preexisting functional loss and economic incentives that work against language diversity [38] .


What must be kept in mind is that the scenario described for Komi is optimistic. There are several hundred thousand speakers, still amounting to about a quarter of the local population. There is a university. There are strong economic incentives (oil, timber) to develop the region further. But for the 95% of the world’s languages where one or more of these drivers are missing, there is very little hope of crossing the digital divide.

Supporting Information

Main data table.

https://doi.org/10.1371/journal.pone.0077056.s001

Details on data sources and encoding in S1.

https://doi.org/10.1371/journal.pone.0077056.s002

Acknowledgments

The author was greatly helped in the data gathering by the Human Language Technology group at the Hungarian Academy of Sciences Computer and Automation Research Institute, in particular by Attila Zséder who unified the data coming from many sources, and Katalin Pajkossy who ran the maxent. We thank Tamás Váradi (Hungarian Academy of Sciences), Hans Uszkoreit (Saarland University at Saarbrücken) and Georg Rehm (DFKI Berlin) for the opportunity to present an earlier version of this material at the Multilingual Europe Technology Alliance (META) Forum 2012 in Brussels. Comments by Onno Crasborn (Radboud University Nijmegen) and Bill Poser (Yinka Dene Language Institute) have led to significant improvements. Comments by the Editor and anonymous referees have led to very significant improvements. We thank Judit Ács, Márton Makrai, Gábor Recski (HAS Computer and Automation Research Institute) and Taha Yasseri (Oxford) for research assistance.

Author Contributions

Conceived and designed the experiments: AK. Performed the experiments: AK. Analyzed the data: AK. Contributed reagents/materials/analysis tools: AK. Wrote the paper: AK.

  • 1. Morpurgo-Davies A (1998) History of Linguistics. Vol. IV: Nineteenth-Century Linguistics. London and New York: Longman, 464 pp.
  • 2. Darwin C (1871) The Descent of Man, and Selection in Relation to Sex. London: John Murray, 423 pp.
  • 3. Frank RM (2008) The language-organism-species analogy: A complex adaptive systems approach to shifting perspective on ‘language’. In: Frank RM, Dirven R, Ziemke T, Bernárdez E, editors, Body, Language and Mind. Vol. 2. Sociocultural Situatedness, Berlin: Mouton de Gruyter. 215–262.
  • 4. Nettle D, Romaine S (2000) Vanishing voices: The extinction of the world’s languages. Oxford: Oxford University Press, 243 pp.
  • 5. Crystal D (2002) Language death. Cambridge University Press, 210 pp.
  • 6. Ad hoc technical expert group on indicators for assessing progress towards the 2010 biodiversity target UNEPCoBD (2004). Indicators for assessing progress towards the 2010 target: status and trends of linguistic diversity and numbers of speakers of indigenous languages.
  • 8. Dorian NC (1981) Language death: The life cycle of a Scottish Gaelic dialect. University of Pennsylvania Press Philadelphia, 202 pp.
  • 9. Schmidt A (1985) The fate of ergativity in dying Dyirbal. Language : 378–396.
  • 12. Fishman JA (1991) Reversing language shift: Theoretical and empirical foundations of assistance to threatened language, volume 76. Multilingual Matters Ltd, 431 pp.
  • 13. Mosley C (2010) Atlas of the World’s Languages in Danger. UNESCO Publishing, http://www.unesco.org/culture/languages-atlas . [accessed 17-July-2013].
  • 14. Simons GF, Lewis MP (2013). A global profile of language development versus language endangerment. http://www-01.sil.org/~simonsg/presentation/Simons%20and%20Lewis%20ICLDC%202013.pdf . [Online, accessed 27-June-2013].
  • 15. Cazden CB (2003) Sustaining indigenous languages in cyberspace. Nurturing Native Languages : 53–57.
  • 16. Scannell KP (2007) The Crúbadán Project: Corpus building for under-resourced languages. In: Building and Exploring Web Corpora: Proceedings of the 3rd Web as Corpus Workshop. volume 4, pp. 5–15.
  • 18. Rehm G, de Smedt K (2012) The Norwegian Language in the Digital Age: Norsk i Den Digitale Tidsalderen. Springer, 81 pp.
  • 19. Requests for new languages/Wikipedia Mandinka. http://meta.wikimedia.org/wiki/Requests _for _new _languag[14-September-2008, accessed 17-July-2013].
  • 20. Németh L, Trón V, Halácsy P, Kornai A, Rung A, et al. (2004) Leveraging the open source ispell codebase for minority language analysis. In: Carson-Berndsen J, editor, Proc. SALTMIL. pp. 56–59.
  • 21. Quakenbush JS, Simons GF (2012) Looking at Austronesian language vitality through EGIDS and SUM. In: Proc. 12-ICAL.
  • 22. Mapuche indians to Bill Gates: hands off our language. http://www.smh.com.au/news/biztech/mapuche-indians-to-bill-gates-hands-off-our-language/2006/11/ [24-November-2006, accessed 27-August-2012].
  • 23. Hosmer D, Lemeshow S (1989) Applied logistic regression. Wiley, 392 pp.
  • 24. Menard S (2002) Applied logistic regression analysis. Sage Publications, 128 pp.
  • 27. Bickerton D (1981) The Roots of Language. Ann Arbor: Karoma, 351 pp.
  • 28. Izreel S (2003) The emergence of spoken Israeli Hebrew. In: Hary B, editor, Corpus Linguistics and Modern Hebrew: Towards the Compilation of the Corpus of Spoken Israeli Hebrew, Tel Aviv University Press. pp. 85–104.
  • 29. Pajkossy K (2013) Studying feature selection methods applied to classification tasks in natural language processing. MSc thesis, Eötvös Loránd University.
  • 31. The week in figures. http://www.flanderstoday.eu/content/week-figures-0 . [12-June-2012, accessed 27-August-2012].
  • 32. Spice B (2012). Carnegie Mellon releases data on Haitian Creole to hasten development of translation tools. http://www.eurekalert.org/pub_releases/2010-01/cmu-cmr012710.php. [Online, accessed 27-August-2012].
  • 33. Varga D, Halacsy P, Kornai A, Nagy V, Nemeth L, et al. (2007) Parallel corpora for medium density languages. In: Nicolov N, Bontcheva K, Angelova G, Mitkov R, editors, Recent Advances in Natural Language Processing IV. Selected papers from RANLP-05, Amsterdam: Benjamins. 247–258.
  • 35. Poser W (2012). personal communication.
  • 36. Thangali A, Nash JP, Sclaroff S, Neidle C (2011) Exploiting phonological constraints for handshape inference in ASL video. In: Proc. 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 521–528.
  • 38. Ginsburgh V, Weber S (2011) How many languages do we need?: The economics of linguistic diversity. Princeton University Press, 248 pp.
  • 39. Prószéky G, Novák A (2005) Computational morphologies for small Uralic languages. Inquiries into Words, Constraints and Contexts Festschrift in the Honour of Kimmo Koskenniemi on his 60th Birthday : 116–125.

8-hour time-restricted eating linked to a 91% higher risk of cardiovascular death

03/19/24 Editor’s note:  

  • The research authors have shared their full poster presentation for updated details about their research abstract. Please see the digital file attached, under additional resources below, for these details.
  • The most current statistics, reviewed and confirmed by the research authors, are in the poster (please see the digital file attached, under additional resources below) and the news release. 
  • As with any new science development, patients should always consult with their doctor prior to making changes to their health regimens.

As noted in all American Heart Association scientific meetings news releases, research abstracts are considered preliminary until published in a peer-reviewed scientific journal.

Research Highlights:

  • A study of over 20,000 adults found that those who followed an 8-hour time-restricted eating schedule, a type of intermittent fasting, had a 91% higher risk of death from cardiovascular disease.
  • People with heart disease or cancer also had an increased risk of cardiovascular death.
  • Compared with a standard schedule of eating across 12-16 hours per day, limiting food intake to less than 8 hours per day was not associated with living longer.

Embargoed until 3 p.m. CT/4 p.m. ET, Monday, March 18, 2024

CHICAGO, March 18, 2024: An analysis of over 20,000 U.S. adults found that people who limited their eating across less than 8 hours per day, a time-restricted eating plan, were more likely to die from cardiovascular disease compared to people who ate across 12-16 hours per day, according to preliminary research presented at the American Heart Association’s Epidemiology and Prevention│Lifestyle and Cardiometabolic Scientific Sessions 2024, March 18-21, in Chicago. The meeting offers the latest science on population-based health and wellness and implications for lifestyle.

Time-restricted eating, a type of intermittent fasting, involves limiting the hours for eating to a specific number of hours each day, which may range from a 4- to 12-hour time window in 24 hours. Many people who follow a time-restricted eating diet follow a 16:8 eating schedule, where they eat all their foods in an 8-hour window and fast for the remaining 16 hours each day, the researchers noted. Previous research has found that time-restricted eating improves several cardiometabolic health measures, such as blood pressure, blood glucose and cholesterol levels.

“Restricting daily eating time to a short period, such as 8 hours per day, has gained popularity in recent years as a way to lose weight and improve heart health,” said senior study author Victor Wenze Zhong, Ph.D., a professor and chair of the department of epidemiology and biostatistics at the Shanghai Jiao Tong University School of Medicine in Shanghai, China. “However, the long-term health effects of time-restricted eating, including risk of death from any cause or cardiovascular disease, are unknown.”

In this study, researchers investigated the potential long-term health impact of following an 8-hour time-restricted eating plan. They reviewed information about dietary patterns for participants in the annual 2003-2018 National Health and Nutrition Examination Surveys (NHANES) in comparison to data about people who died in the U.S., from 2003 through December 2019, from the Centers for Disease Control and Prevention’s National Death Index database.

The analysis found:

  • People who followed a pattern of eating all of their food across less than 8 hours per day had a 91% higher risk of death due to cardiovascular disease.
  • The increased risk of cardiovascular death was also seen in people living with heart disease or cancer.
  • Among people with existing cardiovascular disease, an eating duration of no less than 8 but less than 10 hours per day was also associated with a 66% higher risk of death from heart disease or stroke.
  • Time-restricted eating did not reduce the overall risk of death from any cause.
  • An eating duration of more than 16 hours per day was associated with a lower risk of cancer mortality among people with cancer.
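
The percentages above are relative figures, which a short Python sketch can make concrete. It assumes that "91% higher risk" can be read as a ratio of about 1.91 against the 12-16 hour comparison group (the release does not name the exact statistic), and the 2% baseline risk is a purely hypothetical input chosen for illustration, not a number from the study.

```python
def ratio_from_percent_higher(pct_higher: float) -> float:
    """Convert 'X% higher risk' into a multiplicative ratio (91 -> 1.91)."""
    return 1.0 + pct_higher / 100.0


def risk_under_ratio(baseline_risk: float, pct_higher: float) -> float:
    """Apply a relative increase to a baseline risk given as a fraction."""
    return baseline_risk * ratio_from_percent_higher(pct_higher)


# HYPOTHETICAL baseline: 2% of the comparison group dies of cardiovascular
# disease during follow-up. This value is an assumption, not study data.
hypothetical_baseline = 0.02

# Percentages quoted in the release for two different groups.
reported = [("<8 h eating window", 91), ("8 to <10 h, existing CVD", 66)]

for label, pct in reported:
    ratio = ratio_from_percent_higher(pct)
    risk = risk_under_ratio(hypothetical_baseline, pct)
    print(f"{label}: ratio {ratio:.2f}, illustrative risk {risk:.2%}")
```

The point of the sketch is only that a relative increase says nothing about absolute risk until a baseline is known, and the release does not report one.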

“We were surprised to find that people who followed an 8-hour, time-restricted eating schedule were more likely to die from cardiovascular disease. Even though this type of diet has been popular due to its potential short-term benefits, our research clearly shows that, compared with a typical eating time range of 12-16 hours per day, a shorter eating duration was not associated with living longer,” Zhong said.

“It’s crucial for patients, particularly those with existing heart conditions or cancer, to be aware of the association between an 8-hour eating window and increased risk of cardiovascular death. Our study’s findings encourage a more cautious, personalized approach to dietary recommendations, ensuring that they are aligned with an individual’s health status and the latest scientific evidence,” he continued. “Although the study identified an association between an 8-hour eating window and cardiovascular death, this does not mean that time-restricted eating caused cardiovascular death.”

Study details and background:

  • The study included approximately 20,000 adults in the U.S. with an average age of 49 years.
  • Study participants were followed for a median length of 8 years and maximum length of 17 years.
  • The study included data for NHANES participants who were at least 20 years old at enrollment, between 2003-2018, and had completed two 24-hour dietary recall questionnaires within the first year of enrollment.
  • Approximately half of the participants self-identified as men, and half self-identified as women. 73.3% of the participants self-identified as non-Hispanic white adults, 11% self-identified as Hispanic adults, 8% self-identified as non-Hispanic Black adults and 6.9% of adults self-identified as another racial category, including mixed-race adults and adults of other non-Hispanic races.

The study’s limitations included its reliance on self-reported dietary information, which may be affected by participants’ memory or recall and may not accurately assess typical eating patterns. Factors that may also play a role in health, outside of daily duration of eating and cause of death, were not included in the analysis.

Future research may examine the biological mechanisms that underlie the associations between a time-restricted eating schedule and adverse cardiovascular outcomes, and whether these findings are similar for people who live in other parts of the world, the authors noted.

“Overall, this study suggests that time-restricted eating may have short-term benefits but long-term adverse effects. When the study is presented in its entirety, it will be interesting and helpful to learn more of the details of the analysis,” said Christopher D. Gardner, Ph.D., FAHA, the Rehnborg Farquhar Professor of Medicine at Stanford University in Stanford, California, and chair of the writing committee for the Association’s 2023 scientific statement, Popular Dietary Patterns: Alignment with American Heart Association 2021 Dietary Guidance.

“One of those details involves the nutrient quality of the diets typical of the different subsets of participants. Without this information, it cannot be determined if nutrient density might be an alternate explanation to the findings that currently focus on the window of time for eating. Second, it needs to be emphasized that categorization into the different windows of time-restricted eating was determined on the basis of just two days of dietary intake,” he said.

“It will also be critical to see a comparison of demographics and baseline characteristics across the groups that were classified into the different time-restricted eating windows – for example, was the group with the shortest time-restricted eating window unique compared to people who followed other eating schedules, in terms of weight, stress, traditional cardiometabolic risk factors or other factors associated with adverse cardiovascular outcomes? This additional information will help to better understand the potential independent contribution of the short time-restricted eating pattern reported in this interesting and provocative abstract.”

Co-authors, their disclosures and funding sources are listed in the abstract.

Statements and conclusions of studies that are presented at the American Heart Association’s scientific meetings are solely those of the study authors and do not necessarily reflect the Association’s policy or position. The Association makes no representation or guarantee as to their accuracy or reliability. Abstracts presented at the Association’s scientific meetings are not peer-reviewed; rather, they are curated by independent review panels and are considered based on the potential to add to the diversity of scientific issues and views discussed at the meeting. The findings are considered preliminary until published as a full manuscript in a peer-reviewed scientific journal.

The Association receives funding primarily from individuals; foundations and corporations (including pharmaceutical, device manufacturers and other companies) also make donations and fund specific Association programs and events. The Association has strict policies to prevent these relationships from influencing the science content. Revenues from pharmaceutical and biotech companies, device manufacturers and health insurance providers and the Association’s overall financial information are available here.

Additional Resources:

  • Poster at EPI-Lifestyle 2024 (PDF): Association of 8-Hour Time-Restricted Eating with All-Cause and Cause-Specific Mortality
  • Multimedia is available on the right column of the release link https://newsroom.heart.org/news/8-hour-time-restricted-eating-linked-to-a-91-higher-risk-of-cardiovascular-death?preview=cdac59c3c907975eecaef517166f08f8
  • After March 18, 2024, view abstract P192 in the EPI|Lifestyle Scientific Sessions 2024 Online Program Planner.
  • AHA news release: Reducing total calories may be more effective for weight loss than intermittent fasting (January 2023)
  • AHA news release: 10 popular diets scored for heart-healthy elements; some need improvement (April 2023)
  • AHA news release: New look at nutrition research identifies 10 features of a heart-healthy eating pattern (November 2021)
  • AHA healthy eating tips: Eat Smart
  • For more news from AHA EPI|Lifestyle Scientific Sessions 2024, follow us on X (formerly Twitter) @HeartNews, #EPILifestyle24.

The American Heart Association’s EPI|Lifestyle Scientific Sessions 2024 is the world’s premier meeting dedicated to the latest advances in population-based science. The 2024 meeting is in-person only, Monday through Thursday, March 18-21 at the Hilton Chicago. The primary goal of the meeting is to promote the development and application of translational and population science to prevent heart disease and stroke and foster cardiovascular health. The sessions focus on risk factors, obesity, nutrition, physical activity, genetics, metabolism, biomarkers, subclinical disease, clinical disease, healthy populations, global health and prevention-oriented clinical trials. The Councils on Epidemiology and Prevention and Lifestyle and Cardiometabolic Health (Lifestyle) jointly planned the EPI|Lifestyle Scientific Sessions 2024. Follow the conference on Twitter at #EPILifestyle24.

About the American Heart Association

The American Heart Association is a relentless force for a world of longer, healthier lives. We are dedicated to ensuring equitable health in all communities. Through collaboration with numerous organizations, and powered by millions of volunteers, we fund innovative research, advocate for the public’s health and share lifesaving resources. The Dallas-based organization has been a leading source of health information for a century. During 2024 - our Centennial year - we celebrate our rich 100-year history and accomplishments. As we forge ahead into our second century of bold discovery and impact, our vision is to advance health and hope for everyone, everywhere. Connect with us on heart.org, Facebook, X or by calling 1-800-AHA-USA1.

For Media Inquiries and AHA Expert Perspective:

AHA Communications & Media Relations in Dallas: 214-706-1173; [email protected]

John Arnst: [email protected], 214-706-1060

For Public Inquiries: 1-800-AHA-USA1 (242-8721)

heart.org and stroke.org


What the data says about abortion in the U.S.

Pew Research Center has conducted many surveys about abortion over the years, providing a lens into Americans’ views on whether the procedure should be legal, among a host of other questions.

In a Center survey conducted nearly a year after the Supreme Court’s June 2022 decision that ended the constitutional right to abortion, 62% of U.S. adults said the practice should be legal in all or most cases, while 36% said it should be illegal in all or most cases. Another survey conducted a few months before the decision showed that relatively few Americans take an absolutist view on the issue.

Find answers to common questions about abortion in America, based on data from the Centers for Disease Control and Prevention (CDC) and the Guttmacher Institute, which have tracked these patterns for several decades:

  • How many abortions are there in the U.S. each year?
  • How has the number of abortions in the U.S. changed over time?
  • What is the abortion rate among women in the U.S.? How has it changed over time?
  • What are the most common types of abortion?
  • How many abortion providers are there in the U.S., and how has that number changed?
  • What percentage of abortions are for women who live in a different state from the abortion provider?
  • What are the demographics of women who have had abortions?
  • When during pregnancy do most abortions occur?
  • How often are there medical complications from abortion?

This compilation of data on abortion in the United States draws mainly from two sources: the Centers for Disease Control and Prevention (CDC) and the Guttmacher Institute, both of which have regularly compiled national abortion data for approximately half a century, and which collect their data in different ways.

The CDC data that is highlighted in this post comes from the agency’s “abortion surveillance” reports, which have been published annually since 1974 (and which have included data from 1969). Its figures from 1973 through 1996 include data from all 50 states, the District of Columbia and New York City – 52 “reporting areas” in all. Since 1997, the CDC’s totals have lacked data from some states (most notably California) for the years that those states did not report data to the agency. The four reporting areas that did not submit data to the CDC in 2021 – California, Maryland, New Hampshire and New Jersey – accounted for approximately 25% of all legal induced abortions in the U.S. in 2020, according to Guttmacher’s data. Most states, though, do have data in the reports, and the figures for the vast majority of them came from each state’s central health agency, while for some states, the figures came from hospitals and other medical facilities.

Discussion of CDC abortion data involving women’s state of residence, marital status, race, ethnicity, age, abortion history and the number of previous live births excludes the low share of abortions where that information was not supplied. Read the methodology for the CDC’s latest abortion surveillance report, which includes data from 2021, for more details. Previous reports can be found at stacks.cdc.gov by entering “abortion surveillance” into the search box.

For the numbers of deaths caused by induced abortions in 1963 and 1965, this analysis looks at reports by the then-U.S. Department of Health, Education and Welfare, a precursor to the Department of Health and Human Services. In computing those figures, we excluded abortions listed in the report under the categories “spontaneous or unspecified” or as “other.” (“Spontaneous abortion” is another way of referring to miscarriages.)

Guttmacher data in this post comes from national surveys of abortion providers that Guttmacher has conducted 19 times since 1973. Guttmacher compiles its figures after contacting every known provider of abortions – clinics, hospitals and physicians’ offices – in the country. It uses questionnaires and health department data, and it provides estimates for abortion providers that don’t respond to its inquiries. (In 2020, the last year for which it has released data on the number of abortions in the U.S., it used estimates for 12% of abortions.) For most of the 2000s, Guttmacher has conducted these national surveys every three years, each time getting abortion data for the prior two years. For each interim year, Guttmacher has calculated estimates based on trends from its own figures and from other data.

The latest full summary of Guttmacher data came in the institute’s report titled “Abortion Incidence and Service Availability in the United States, 2020.” It includes figures for 2020 and 2019 and estimates for 2018. The report includes a methods section.

In addition, this post uses data from StatPearls, an online health care resource, on complications from abortion.

How many abortions are there in the U.S. each year?

An exact answer is hard to come by. The CDC and the Guttmacher Institute have each tried to measure this for around half a century, but they use different methods and publish different figures.

The last year for which the CDC reported a yearly national total for abortions is 2021. It found there were 625,978 abortions in the District of Columbia and the 46 states with available data that year, up from 597,355 in those states and D.C. in 2020. The corresponding figure for 2019 was 607,720.

The last year for which Guttmacher reported a yearly national total was 2020. It said there were 930,160 abortions that year in all 50 states and the District of Columbia, compared with 916,460 in 2019.

  • How the CDC gets its data: It compiles figures that are voluntarily reported by states’ central health agencies, including separate figures for New York City and the District of Columbia. Its latest totals do not include figures from California, Maryland, New Hampshire or New Jersey, which did not report data to the CDC. (Read the methodology from the latest CDC report.)
  • How Guttmacher gets its data: It compiles its figures after contacting every known abortion provider – clinics, hospitals and physicians’ offices – in the country. It uses questionnaires and health department data, then provides estimates for abortion providers that don’t respond. Guttmacher’s figures are higher than the CDC’s in part because they include data (and in some instances, estimates) from all 50 states. (Read the institute’s latest full report and methodology.)

While the Guttmacher Institute supports abortion rights, its empirical data on abortions in the U.S. has been widely cited by groups and publications across the political spectrum, including by a number of those that disagree with its positions.

These estimates from Guttmacher and the CDC are results of multiyear efforts to collect data on abortion across the U.S. Last year, Guttmacher also began publishing less precise estimates every few months, based on a much smaller sample of providers.

The figures reported by these organizations include only legal induced abortions conducted by clinics, hospitals or physicians’ offices, or those that make use of abortion pills dispensed from certified facilities such as clinics or physicians’ offices. They do not account for the use of abortion pills that were obtained outside of clinical settings.

How has the number of abortions in the U.S. changed over time?

A line chart showing the changing number of legal abortions in the U.S. since the 1970s.

The annual number of U.S. abortions rose for years after Roe v. Wade legalized the procedure in 1973, reaching its highest levels around the late 1980s and early 1990s, according to both the CDC and Guttmacher. Since then, abortions have generally decreased at what a CDC analysis called “a slow yet steady pace.”

Guttmacher says the number of abortions occurring in the U.S. in 2020 was 40% lower than it was in 1991. According to the CDC, the number was 36% lower in 2021 than in 1991, looking just at the District of Columbia and the 46 states that reported both of those years.

(The corresponding line graph shows the long-term trend in the number of legal abortions reported by both organizations. To allow for consistent comparisons over time, the CDC figures in the chart have been adjusted to ensure that the same states are counted from one year to the next. Using that approach, the CDC figure for 2021 is 622,108 legal abortions.)

There have been occasional breaks in this long-term pattern of decline – during the middle of the first decade of the 2000s, and then again in the late 2010s. The CDC reported modest 1% and 2% increases in abortions in 2018 and 2019, and then, after a 2% decrease in 2020, a 5% increase in 2021. Guttmacher reported an 8% increase over the three-year period from 2017 to 2020.
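
The rounded year-over-year changes quoted above can be checked against the yearly totals given earlier in this post. The following is a minimal Python sketch; the function and variable names are illustrative choices, and the only data used are the CDC totals for 2019 to 2021 and the Guttmacher totals for 2019 and 2020 stated above.

```python
def percent_change(old: float, new: float) -> float:
    """Percentage change from an earlier value to a later one."""
    return (new - old) / old * 100.0


def yearly_changes(totals: dict) -> dict:
    """Map each year (after the first) to its change from the prior year."""
    years = sorted(totals)
    return {
        later: percent_change(totals[earlier], totals[later])
        for earlier, later in zip(years, years[1:])
    }


# CDC totals for D.C. and the 46 states with available data, as quoted above.
cdc_totals = {2019: 607_720, 2020: 597_355, 2021: 625_978}
# Guttmacher national totals, as quoted above.
guttmacher_totals = {2019: 916_460, 2020: 930_160}

cdc_changes = yearly_changes(cdc_totals)
guttmacher_changes = yearly_changes(guttmacher_totals)

for source, changes in [("CDC", cdc_changes), ("Guttmacher", guttmacher_changes)]:
    for year, change in changes.items():
        print(f"{source} {year}: {change:+.1f}% vs. prior year")
```

Rounded to whole percentages, the CDC figures reproduce the 2% decrease in 2020 and the 5% increase in 2021. Guttmacher's 8% figure covers 2017 to 2020 and cannot be recomputed here because the 2017 total is not given in this post.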

As noted above, these figures do not include abortions that use pills obtained outside of clinical settings.

What is the abortion rate among women in the U.S.? How has it changed over time?

Guttmacher says that in 2020 there were 14.4 abortions in the U.S. per 1,000 women ages 15 to 44. Its data shows that the rate of abortions among women has generally been declining in the U.S. since 1981, when it reported there were 29.3 abortions per 1,000 women in that age range.

The CDC says that in 2021, there were 11.6 abortions in the U.S. per 1,000 women ages 15 to 44. (That figure excludes data from California, the District of Columbia, Maryland, New Hampshire and New Jersey.) Like Guttmacher’s data, the CDC’s figures also suggest a general decline in the abortion rate over time. In 1980, when the CDC reported on all 50 states and D.C., it said there were 25 abortions per 1,000 women ages 15 to 44.

That said, both Guttmacher and the CDC say there were slight increases in the rate of abortions during the late 2010s and early 2020s. Guttmacher says the abortion rate per 1,000 women ages 15 to 44 rose from 13.5 in 2017 to 14.4 in 2020. The CDC says it rose from 11.2 per 1,000 in 2017 to 11.4 in 2019, before falling back to 11.1 in 2020 and then rising again to 11.6 in 2021. (The CDC’s figures for those years exclude data from California, D.C., Maryland, New Hampshire and New Jersey.)
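
A rate "per 1,000 women ages 15 to 44" is the number of abortions divided by the number of women in that age range, multiplied by 1,000. The Python sketch below shows that arithmetic and, as a back-of-the-envelope check, inverts it to see what population Guttmacher's 2020 total and rate jointly imply. The implied population is a derived illustration, not a figure reported by Guttmacher or the CDC, and the function names are hypothetical.

```python
def rate_per_1000(abortions: float, women_15_to_44: float) -> float:
    """Abortions per 1,000 women in the reference age range."""
    return abortions / women_15_to_44 * 1000.0


def implied_population(abortions: float, rate: float) -> float:
    """Invert the rate formula: the population a total and a rate imply."""
    return abortions / rate * 1000.0


# Guttmacher's 2020 total and rate, both quoted in this post.
guttmacher_2020_abortions = 930_160
guttmacher_2020_rate = 14.4

# Derived for illustration only; not a number published by either source.
implied_women = implied_population(guttmacher_2020_abortions, guttmacher_2020_rate)
print(f"Implied women ages 15 to 44: {implied_women:,.0f}")

# Round trip: feeding the implied population back in recovers the rate.
print(f"Recomputed rate: {rate_per_1000(guttmacher_2020_abortions, implied_women):.1f}")
```

The same inversion would not be meaningful for the CDC total quoted earlier, because the CDC's rate excludes D.C. while that total includes it.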

What are the most common types of abortion?

The CDC broadly divides abortions into two categories: surgical abortions and medication abortions, which involve pills. Since the Food and Drug Administration first approved abortion pills in 2000, their use has increased over time as a share of abortions nationally, according to both the CDC and Guttmacher.

The majority of abortions in the U.S. now involve pills, according to both the CDC and Guttmacher. The CDC says 56% of U.S. abortions in 2021 involved pills, up from 53% in 2020 and 44% in 2019. Its figures for 2021 include the District of Columbia and 44 states that provided this data; its figures for 2020 include D.C. and 44 states (though not all of the same states as in 2021), and its figures for 2019 include D.C. and 45 states.

Guttmacher, which measures this every three years, says 53% of U.S. abortions involved pills in 2020, up from 39% in 2017.

Two pills commonly used together for medication abortions are mifepristone, which, taken first, blocks hormones that support a pregnancy, and misoprostol, which then causes the uterus to empty. According to the FDA, medication abortions are safe until 10 weeks into pregnancy.

Surgical abortions conducted during the first trimester of pregnancy typically use a suction process, while the relatively few surgical abortions that occur during the second trimester of a pregnancy typically use a process called dilation and evacuation, according to the UCLA School of Medicine.

How many abortion providers are there in the U.S., and how has that number changed?

In 2020, there were 1,603 facilities in the U.S. that provided abortions, according to Guttmacher. This included 807 clinics, 530 hospitals and 266 physicians’ offices.

A horizontal stacked bar chart showing the total number of abortion providers down since 1982.

While clinics make up half of the facilities that provide abortions, they are the sites where the vast majority (96%) of abortions are administered, either through procedures or the distribution of pills, according to Guttmacher’s 2020 data. (This includes 54% of abortions that are administered at specialized abortion clinics and 43% at nonspecialized clinics.) Hospitals made up 33% of the facilities that provided abortions in 2020 but accounted for only 3% of abortions that year, while just 1% of abortions were conducted by physicians’ offices.

Looking just at clinics – that is, the total number of specialized abortion clinics and nonspecialized clinics in the U.S. – Guttmacher found the total virtually unchanged between 2017 (808 clinics) and 2020 (807 clinics). However, there were regional differences. In the Midwest, the number of clinics that provide abortions increased by 11% during those years, and in the West by 6%. The number of clinics decreased during those years by 9% in the Northeast and 3% in the South.

The total number of abortion providers has declined dramatically since the 1980s. In 1982, according to Guttmacher, there were 2,908 facilities providing abortions in the U.S., including 789 clinics, 1,405 hospitals and 714 physicians’ offices.
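
The facility shares and the long-term decline described above follow directly from the Guttmacher counts for 1982 and 2020. A minimal Python sketch of that arithmetic, using only the counts quoted in this post (the names are illustrative):

```python
def shares(counts: dict) -> dict:
    """Each facility type as a percentage of all facilities."""
    total = sum(counts.values())
    return {kind: n / total * 100.0 for kind, n in counts.items()}


# Guttmacher facility counts quoted above.
facilities_1982 = {"clinics": 789, "hospitals": 1_405, "physicians' offices": 714}
facilities_2020 = {"clinics": 807, "hospitals": 530, "physicians' offices": 266}

total_1982 = sum(facilities_1982.values())
total_2020 = sum(facilities_2020.values())

# Overall change in the number of facilities between the two years.
overall_change = (total_2020 - total_1982) / total_1982 * 100.0
print(f"Facilities: {total_1982:,} in 1982, {total_2020:,} in 2020 ({overall_change:+.1f}%)")

shares_2020 = shares(facilities_2020)
for kind, share in shares_2020.items():
    print(f"{kind}: {share:.1f}% of facilities in 2020")
```

The component counts sum to the totals of 2,908 and 1,603 given in the text, and the 2020 shares round to the "half" (clinics) and 33% (hospitals) cited above. The shares of abortions performed at each facility type (96%, 3%, 1%) come from separate Guttmacher data and cannot be derived from these facility counts.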

The CDC does not track the number of abortion providers.

What percentage of abortions are for women who live in a different state from the abortion provider?

In the District of Columbia and the 46 states that provided abortion and residency information to the CDC in 2021, 10.9% of all abortions were performed on women known to live outside the state where the abortion occurred – slightly higher than the percentage in 2020 (9.7%). That year, D.C. and 46 states (though not the same ones as in 2021) reported abortion and residency data. (The total number of abortions used in these calculations included figures for women with both known and unknown residential status.)

The share of reported abortions performed on women outside their state of residence was much higher before the 1973 Roe decision that stopped states from banning abortion. In 1972, 41% of all abortions in D.C. and the 20 states that provided this information to the CDC that year were performed on women outside their state of residence. In 1973, the corresponding figure was 21% in the District of Columbia and the 41 states that provided this information, and in 1974 it was 11% in D.C. and the 43 states that provided data.

What are the demographics of women who have had abortions?

In the District of Columbia and the 46 states that reported age data to the CDC in 2021, the majority of women who had abortions (57%) were in their 20s, while about three-in-ten (31%) were in their 30s. Teens ages 13 to 19 accounted for 8% of those who had abortions, while women ages 40 to 44 accounted for about 4%.

The vast majority of women who had abortions in 2021 were unmarried (87%), while married women accounted for 13%, according to the CDC, which had data on this from 37 states.

A pie chart showing that, in 2021, majority of abortions were for women who had never had one before.

In the District of Columbia, New York City (but not the rest of New York) and the 31 states that reported racial and ethnic data on abortion to the CDC, 42% of all women who had abortions in 2021 were non-Hispanic Black, while 30% were non-Hispanic White, 22% were Hispanic and 6% were of other races.

Looking at abortion rates among those ages 15 to 44, there were 28.6 abortions per 1,000 non-Hispanic Black women in 2021; 12.3 abortions per 1,000 Hispanic women; 6.4 abortions per 1,000 non-Hispanic White women; and 9.2 abortions per 1,000 women of other races, the CDC reported from those same 31 states, D.C. and New York City.

For 57% of U.S. women who had induced abortions in 2021, it was the first time they had ever had one, according to the CDC. For nearly a quarter (24%), it was their second abortion. For 11% of women who had an abortion that year, it was their third, and for 8% it was their fourth or more. These CDC figures include data from 41 states and New York City, but not the rest of New York.

A bar chart showing that most U.S. abortions in 2021 were for women who had previously given birth.

Nearly four-in-ten women who had abortions in 2021 (39%) had no previous live births at the time they had an abortion, according to the CDC. Almost a quarter (24%) of women who had abortions in 2021 had one previous live birth, 20% had two previous live births, 10% had three, and 7% had four or more previous live births. These CDC figures include data from 41 states and New York City, but not the rest of New York.

When during pregnancy do most abortions occur?

The vast majority of abortions occur during the first trimester of a pregnancy. In 2021, 93% of abortions occurred during the first trimester – that is, at or before 13 weeks of gestation, according to the CDC. An additional 6% occurred between 14 and 20 weeks of pregnancy, and about 1% were performed at 21 weeks or more of gestation. These CDC figures include data from 40 states and New York City, but not the rest of New York.

How often are there medical complications from abortion?

About 2% of all abortions in the U.S. involve some type of complication for the woman, according to an article in StatPearls, an online health care resource. “Most complications are considered minor such as pain, bleeding, infection and post-anesthesia complications,” according to the article.

The CDC calculates case-fatality rates for women from induced abortions – that is, how many women die from abortion-related complications, for every 100,000 legal abortions that occur in the U.S. The rate was lowest during the most recent period examined by the agency (2013 to 2020), when there were 0.45 deaths to women per 100,000 legal induced abortions. The case-fatality rate reported by the CDC was highest during the first period examined by the agency (1973 to 1977), when it was 2.09 deaths to women per 100,000 legal induced abortions. During the five-year periods in between, the figure ranged from 0.52 (from 1993 to 1997) to 0.78 (from 1978 to 1982).

The CDC calculates death rates by five-year and seven-year periods because of year-to-year fluctuation in the numbers and due to the relatively low number of women who die from legal induced abortions.
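
As a rough illustration of how a pooled case-fatality rate of this kind works, the Python sketch below divides total deaths by total legal abortions across a multi-year period and scales the result to 100,000. The yearly counts in the example are hypothetical placeholders chosen to make the arithmetic visible; they are not CDC data, and the CDC's exact procedure may differ. Only the period rates in `reported_rates` come from the text above.

```python
def case_fatality_rate(deaths: float, legal_abortions: float, per: int = 100_000) -> float:
    """Deaths per `per` legal induced abortions."""
    return deaths / legal_abortions * per


def pooled_rate(deaths_by_year: list, abortions_by_year: list) -> float:
    """Pool several years so that small yearly counts fluctuate less."""
    return case_fatality_rate(sum(deaths_by_year), sum(abortions_by_year))


# HYPOTHETICAL yearly counts, for illustration only (not CDC figures).
hypothetical_deaths = [2, 4, 3]
hypothetical_abortions = [650_000, 700_000, 650_000]

example_rate = pooled_rate(hypothetical_deaths, hypothetical_abortions)
print(f"Pooled rate (hypothetical inputs): {example_rate:.2f} per 100,000")

# Period rates as reported in the text above.
reported_rates = {"1973-1977": 2.09, "1978-1982": 0.78, "1993-1997": 0.52, "2013-2020": 0.45}
fold_decline = reported_rates["1973-1977"] / reported_rates["2013-2020"]
print(f"Earliest period's rate is {fold_decline:.1f} times the latest period's rate")
```

With the hypothetical inputs, single-year rates would swing between roughly 0.3 and 0.6 per 100,000, while the pooled figure is 0.45, which is the kind of smoothing that multi-year periods provide.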

In 2020, the last year for which the CDC has information, six women in the U.S. died due to complications from induced abortions. Four women died in this way in 2019, two in 2018, and three in 2017. (These deaths all followed legal abortions.) Since 1990, the annual number of deaths among women due to legal induced abortion has ranged from two to 12.

The annual number of reported deaths from induced abortions (legal and illegal) tended to be higher in the 1980s, when it ranged from nine to 16, and from 1972 to 1979, when it ranged from 13 to 63. One driver of the decline was the drop in deaths from illegal abortions. There were 39 deaths from illegal abortions in 1972, the last full year before Roe v. Wade. The total fell to 19 in 1973 and to single digits or zero every year after that. (The number of deaths from legal abortions has also declined since then, though with some slight variation over time.)

The number of deaths from induced abortions was considerably higher in the 1960s than afterward. For instance, there were 119 deaths from induced abortions in 1963 and 99 in 1965, according to reports by the then-U.S. Department of Health, Education and Welfare, a precursor to the Department of Health and Human Services. The CDC is a division of Health and Human Services.

Note: This is an update of a post originally published May 27, 2022, and first updated June 24, 2022.


About Pew Research Center

Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts.

Intermittent fasting linked to higher risk of cardiovascular death, research suggests

Intermittent fasting, a diet pattern that involves alternating between periods of fasting and eating, can lower blood pressure and help some people lose weight, past research has indicated.

But an analysis presented Monday at the American Heart Association’s scientific sessions in Chicago challenges the notion that intermittent fasting is good for heart health. Instead, researchers from Shanghai Jiao Tong University School of Medicine in China found that people who restricted food consumption to less than eight hours per day had a 91% higher risk of dying from cardiovascular disease over a median period of eight years, relative to people who ate across 12 to 16 hours.

It’s some of the first research investigating the association between time-restricted eating (a type of intermittent fasting) and the risk of death from cardiovascular disease.

The analysis — which has not yet been peer-reviewed or published in an academic journal — is based on data from the Centers for Disease Control and Prevention’s National Health and Nutrition Examination Survey collected between 2003 and 2018. The researchers analyzed responses from around 20,000 adults who recorded what they ate for at least two days, then looked at who had died from cardiovascular disease after a median follow-up period of eight years.

However, Victor Wenze Zhong, a co-author of the analysis, said it’s too early to make specific recommendations about intermittent fasting based on his research alone.

“Practicing intermittent fasting for a short period such as 3 months may likely lead to benefits on reducing weight and improving cardiometabolic health,” Zhong said via email. But he added that people “should be extremely cautious” about intermittent fasting for longer periods of time, such as years.

Intermittent fasting regimens vary widely. A common schedule is to restrict eating to a period of six to eight hours per day, which can lead people to consume fewer calories, though some eat the same amount in a shorter time. Another popular schedule is the "5:2 diet," which involves eating 500 to 600 calories on two nonconsecutive days of the week but eating normally for the other five.

Zhong said it’s not clear why his research found an association between time-restricted eating and a risk of death from cardiovascular disease. He offered an observation, though: People who limited their eating to fewer than eight hours per day had less lean muscle mass than those who ate for 12 to 16 hours. Low lean muscle mass has been linked to a higher risk of cardiovascular death.

Cardiovascular and nutrition experts who were not involved in the analysis offered several theories about what might explain the results.

Dr. Benjamin Horne, a research professor at Intermountain Health in Salt Lake City, said fasting can increase stress hormones such as cortisol and adrenaline, since the body doesn’t know when to expect food next and goes into survival mode. That added stress may raise the short-term risk of heart problems among vulnerable groups, he said, particularly elderly people or those with chronic health conditions.

Horne’s research has shown that fasting twice a week for four weeks, then once a week for 22 weeks may increase a person’s risk of dying after one year but decrease their 10-year risk of chronic disease.

“In the long term, what it does is reduces those risk factors for heart disease and reduces the risk factors for diabetes and so forth — but in the short term, while you’re actually doing it, your body is in a state where it’s at a higher risk of having problems,” he said.

Even so, Horne added, the analysis “doesn’t change my perspective that there are definite benefits from fasting, but it’s a cautionary tale that we need to be aware that there are definite, potentially major, adverse effects.” 

Intermittent fasting gained popularity about a decade ago, when the 5:2 diet was touted as a weight loss strategy in the U.K. In the years to follow, several celebrities espoused the benefits of an eight-hour eating window for weight loss, while some Silicon Valley tech workers believed that extreme periods of fasting boosted productivity. Some studies have also suggested that intermittent fasting might help extend people’s lifespans by warding off disease.

However, a lot of early research on intermittent fasting involved animals. In the last seven years or so, various clinical trials have investigated potential benefits for humans, including for heart health.

“The purpose of intermittent fasting is to cut calories, lose weight,” said Penny Kris-Etherton, emeritus professor of nutritional sciences at Penn State University and a member of the American Heart Association nutrition committee. “It’s really how intermittent fasting is implemented that’s going to explain a lot of the benefits or adverse associations.”

Dr. Francisco Lopez-Jimenez, a cardiologist at Mayo Clinic, said the timing of when people eat may influence the effects they see. 

“I haven’t met a single person or patient that has been practicing intermittent fasting by skipping dinner,” he said, noting that people more often skip breakfast, a schedule associated with an increased risk of heart disease and death.

The new research comes with limitations: It relies on people’s memories of what they consumed over a 24-hour period and doesn’t consider the nutritional quality of the food they ate or how many calories they consumed during an eating window.

So some experts found the analysis too narrow.

“It’s a retrospective study looking at two days’ worth of data, and drawing some very big conclusions from a very limited snapshot into a person’s lifestyle habits,” said Dr. Pam Taub, a cardiologist at UC San Diego Health.

Taub said her patients have seen “incredible benefits” from fasting regimens.

“I would continue doing it,” she said. “For people that do intermittent fasting, their individual results speak for themselves. Most people that do intermittent fasting, the reason they continue it is they see a decrease in their weight. They see a decrease in blood pressure. They see an improvement in their LDL cholesterol.” 

Kris-Etherton, however, urged caution: “Maybe consider a pause in intermittent fasting until we have more information or until the results of the study can be better explained,” she said.

Aria Bendix is the breaking health reporter for NBC News Digital.

IMAGES

  1. (PDF) Exploring the Causes of Language Death: A Review Paper Teaching

  2. (PDF) Language Birth and Death

  3. (PDF) Language Death and Language Maintenance: Problems and Prospects

  4. SOLUTION: Causes of language death

  5. (PDF) Language Death and Endangered Languages

  6. (PDF) Language Death: The Case of Gubatnon Dialect

VIDEO

  1. Language Death in Sociolinguistics

  2. Cervical cancer || Hpv vaccine || Poonampandey death news || gynaecologist || cancer prevention

  3. A Level English Language (9093) Paper 4- Section B: Language and the Self (Part 2)

  4. A History of Near Death Experiences: A Comparison Between Modern and Ancient NDEs

  5. NDE TV Presents Bob, retired central supervisor of banks & insurance companies/NDE Researcher/Author

  6. When Languages Die

COMMENTS

  1. Exploring the Causes of Language Death: A Review Paper

    This review paper is aimed at exploring the causes of language death. To be more specific, this paper seeks to describe language death and the major reasons behind the death of a language ...

  2. PDF Language death

    The paper's editor made it the keynote of his summary, and most of the published letters which followed focused on the issue of language death. It was good to see ... surrounding the topic of language death. Whether a dramatic as opposed to a scholarly encounter with the topic is likely to have greater impact I cannot say. All I know is that

  3. PDF Language maintenance, shift and death, and the implications for

    with little consideration of language trends and needs of a community. This paper is an attempt to tie together research on language maintenance/shift and that on bilingual education, showing that the success of an education program can be enhanced by the two supporting each other. 1. Language maintenance, shift and death 1.1.

  4. PDF How Languages Die

    Not being able to speak the language has to do with a form of "atrophy," i.e., the loss of competence in the language due to lack of practice. When the process is experienced by all the speakers of a language and this can no longer be learned by their children, it can be characterized as dying or dead.

  5. Drivers of language loss

    Fig. 1: Languages at risk. This map shows locations of 3,000 spoken languages around the world, rated as awakening, dormant, endangered, severely endangered, threatened or unknown. Bromham et al ...

  6. Language extinction and linguistic fronts

    Language birth and death are natural ongoing processes worldwide, but, in recent times, the processes of language extinction have accelerated, partly owing to improved communications and globalization processes [15, 16]. Currently, about 4% of the languages are spoken by 96% of the population, whereas 25% of the languages have fewer than 1000 ...

  7. Language Death and Dying

    This chapter contains sections titled: Types of Language Death. Causes of Language Death. Models of Language Loss. Structural Levels in Language Death

  8. Language Death and Disappearance: Causes and Circumstances

    A Grammar of the Kiwai Language, Fly Delta, Papua, with a Kiwai Vocabulary by E. Baxter Riley. Port Moresby, Government Printer. Vakhtin, Nikolaj Borisovich and Golovko, Jevgenin Vasiljevich (1987), ' Ob odnom neordinarnom sledstvii jazykovykh kontaktov: jazyk ostrova Mednyj' [About an unusual consequence of language contacts: the language of ...

  9. PDF LANGUAGE DEATH

    pursuit of education, learning and research at the highest international ... The paper's editor made it the keynote of his summary, and most of ... 978-1-107-43181-2 - Language Death David Crystal Frontmatter More information. Created Date: 12/1/2014 5:42:16 PM ...

  10. PDF Language vitality: Theorizing language loss, shift, and reclamation

    ... Indigenous minority language endangerment is worth considering as a distinct category of LEL. I begin this response in §2 with a brief summary of the state of knowledge of causes of language endangerment and loss, and the underpinnings of current theorizing on LEL. I then discuss in §3 several points that emerge from Mufwene's position paper:

  11. INTRODUCTION LANGUAGE DEATH AND LANGUAGE MAINTENANCE

    A language is dead when there are no speakers left at all. The factors determining language death are typically "non-linguistic" (Swadesh 1948:235). A long list of such factors can be found in Campbell (1994:1963). The most commonly cited are socioeconomic and sociopolitical.

  12. Language death (Chapter 11)

    Introduction. Language death occurs in unstable bilingual or multilingual speech communities as a result of language shift from a regressive minority language to a dominant majority language. Dressler's definition immediately allows us to link the special social context of language death with its linguistic consequences: as we shall see below ...

  13. PDF Language death

    passed away. If you are the last speaker of a language, your language - viewed as a tool of communication - is already dead. For a language is really alive only as long as there is someone to speak it to. When you are the only one left, your knowledge of your language is like a repository, or archive, of your people's spoken linguistic past.

  14. Digital Language Death

    Of the approximately 7,000 languages spoken today, some 2,500 are generally considered endangered. Here we argue that this consensus figure vastly underestimates the danger of digital language death, in that less than 5% of all languages can still ascend to the digital realm. We present evidence of a massive die-off caused by the digital divide.

  15. [PDF] Language Decline and Death in Africa: Causes, Consequences and

    Language Decline and Death in Africa: Causes, Consequences and Challenges. H. Batibo. Published 2005. Linguistics, Sociology. Preface 1: The languages of Africa 2: Patterns of language use in Africa 3: African languages as a resource 4: The minority languages of Africa 5: The endangered languages of Africa 6: Language shift and death in Africa ...

  16. Language Decline and Death in Africa: Causes, Consequences and

    SUBMIT PAPER. Teachers College Record: The Voice of Scholarship in Education. ... Language Decline and Death in Africa: Causes, Consequences and Challenges. Kathleen Kimpel View all authors and affiliations. Volume 108, ... Sage Research Methods Supercharging research opens in new tab;

  17. 8-hour time-restricted eating linked to a 91% higher risk of

    Research Highlights: A study of over 20,000 adults found that those who followed an 8-hour time-restricted eating schedule, a type of intermittent fasting, had a 91% higher risk of death from cardiovascular disease. ... Factors that may also play a role in health, outside of daily duration of eating and cause of death, were not included in the ...

  18. What the data says about abortion in the U.S.

    The CDC calculates death rates by five-year and seven-year periods because of year-to-year fluctuation in the numbers and due to the relatively low number of women who die from legal induced abortions. In 2020, the last year for which the CDC has information, six women in the U.S. died due to complications from induced abortions. Four women ...

  19. Language Death and Disappearance: Causes and Circumstances

    Semantic Scholar extracted view of "Language Death and Disappearance: Causes and Circumstances" by S. Wurm. ... This research aims to describe the vitality of the Lauje language as an enclave of ethnic minority languages in Tolitoli Regency, Central Sulawesi. ... Abstract This paper considers two different Indigenous-led initiatives, the ...

  20. Intermittent fasting linked to risk of cardiovascular death

    Horne's research has shown that fasting twice a week for four weeks, then once a week for 22 weeks may increase a person's risk of dying after one year but decrease their 10-year risk of ...