The preferred formats are plain text (.txt) and Extensible Markup Language (.xml), either alone or together. Where available, metadata describing the text should also be acquired. Depending on anticipated use, other formats, markup, and metadata may also be appropriate to collect.