
Language segmentation

This phrase is most concisely described in this work by David Alfter:

Language segmentation consists in finding the boundaries where one language ends and another language begins in a text written in more than one language. This is important for all natural language processing tasks.

[…]

One important point that has to be borne in mind is the difference between language identification and language segmentation. Language identification is concerned with recognizing the language at hand. It is possible to use language identification for language segmentation. Indeed, by identifying the languages in a text, the segmentation is implicitly obtained. Language segmentation on the other hand is only concerned with identifying language boundaries. No claims about the languages involved are made.
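
To make the distinction concrete, here is a minimal sketch of how segmentation can fall out of identification: label each token with a language, then place a boundary wherever the label changes. Everything in it is an assumption for illustration, not Alfter's method: the `TOY_LEXICON` word lists, the `identify` helper, and the whitespace tokenization are hypothetical stand-ins for a real language identifier.

```python
# Hypothetical toy lexicons; a real system would use a trained language identifier.
TOY_LEXICON = {
    "en": {"the", "cat", "is", "on", "table"},
    "de": {"die", "katze", "ist", "auf", "dem", "tisch"},
}


def identify(token):
    """Toy language identification: return the first lexicon that contains the token."""
    word = token.lower()
    for lang, words in TOY_LEXICON.items():
        if word in words:
            return lang
    return "unknown"


def boundaries_from_labels(labels):
    """Return every index i at which the label differs from the previous one."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def segment(tokens):
    """Split tokens into runs of the same language, then discard the labels."""
    if not tokens:
        return []
    labels = [identify(t) for t in tokens]
    cuts = boundaries_from_labels(labels)
    starts = [0] + cuts
    ends = cuts + [len(tokens)]
    return [tokens[s:e] for s, e in zip(starts, ends)]


tokens = "the cat is on die Katze ist auf dem Tisch".split()
print(boundaries_from_labels([identify(t) for t in tokens]))
print(segment(tokens))
```

Note that `segment` returns only the runs of tokens, not the language names. That matches the definition quoted above: the output marks where the language changes without claiming which languages are involved. A pure segmentation approach would aim for the same output without needing the per-token labels at all.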