NCBI represents the human proteome with two overlapping* sets of Reference Sequence (RefSeq) protein sequences.
Set 1: Sequences currently annotated on the latest human reference genome; accessible from the Assembly database:
As of October 2019, the latest human reference assembly release is GRCh38.p13, and the most recent full annotation of that assembly is updated annotation release 109.20190905. Your download will include predicted protein models (the XP_ accessions) in addition to the known RefSeq proteins (the NP_ accessions). If you want data for (1) the interim annotation release that followed release 108 and excludes predicted models, or (2) earlier assemblies and/or annotations, see the article on accessing such data on the Genomes FTP site.
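To check how many known and predicted proteins a download contains, the accession prefix on each FASTA header can be inspected. The following is a minimal sketch, not an NCBI tool: the function name classify_accessions is hypothetical, the example accessions are made-up placeholders, and it assumes each header line starts with ">" followed directly by the accession.

```python
def classify_accessions(fasta_lines):
    """Group protein accessions from FASTA header lines by RefSeq prefix.

    Assumes the accession is the first whitespace-delimited token
    after the leading '>' on each header line.
    """
    groups = {"known": [], "predicted": [], "other": []}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            continue  # header with no accession
        accession = fields[0]
        if accession.startswith("NP_"):
            groups["known"].append(accession)      # known RefSeq protein
        elif accession.startswith("XP_"):
            groups["predicted"].append(accession)  # predicted model
        else:
            groups["other"].append(accession)      # any other prefix
    return groups


# Placeholder records for illustration only; not real RefSeq entries.
example = [
    ">NP_000001.1 example known protein [Homo sapiens]",
    "MSEQ",
    ">XP_000002.1 example predicted protein [Homo sapiens]",
    "MSEQ",
]
result = classify_accessions(example)
```

Because the function accepts any iterable of lines, an open file handle for a downloaded protein FASTA file can be passed in directly. Keeping only the "known" group gives a subset that excludes the predicted models.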
Set 2: Cumulative sequence data, updated weekly, including sequences that are not annotated on the reference genome assembly:
See the article on obtaining various subsets of the proteome, such as the subset that excludes the predicted models.
*A protein is not included in set 1 if it is not annotated on the reference genome. Further differences between the two sets stem from their different update and release dates. RefSeq staff update transcript and protein records daily and combine them into weekly releases for human (and some other organisms) and into releases that occur every two months for all organisms included in the RefSeq project. Annotation of eukaryotic genomes follows a different schedule. RefSeq staff do not archive data from previous weekly or bimonthly (every two months) RefSeq releases.
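The overlap between the two sets can be examined by comparing their accession lists. The sketch below is illustrative only: compare_accession_sets is a hypothetical helper, the accessions are made-up placeholders, and it assumes the accessions have already been extracted from each download. Accessions are compared with their version suffix, so a record updated between release dates appears as a difference.

```python
def compare_accession_sets(annotated, cumulative):
    """Compare accessions from set 1 (annotated) and set 2 (cumulative).

    Accessions are compared as full strings, version suffix included.
    """
    annotated = set(annotated)
    cumulative = set(cumulative)
    return {
        "shared": annotated & cumulative,
        "only_annotated": annotated - cumulative,
        "only_cumulative": cumulative - annotated,
    }


# Placeholder accessions for illustration only; not real RefSeq entries.
set1 = ["NP_000001.1", "XP_000002.1"]
set2 = ["NP_000001.1", "NP_000003.2"]
overlap = compare_accession_sets(set1, set2)
```

In this toy input, the entry found only in the cumulative list stands in for a protein that is not annotated on the reference genome, and the entry found only in the annotated list stands in for a predicted model absent from the other list.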