annotate sra_fetch.py @ 0:ffdd41766195 draft

Initial version - still need to test if datatype works correctly, and implement scripted download of SRA binaries.
author matt-shirley <mdshw5@gmail.com>
date Tue, 27 Nov 2012 13:44:28 -0500
parents
children 45031bbf6b27
from ftplib import FTP
import sys

# Get accession number and output path from arguments
accession = sys.argv[1]
outfile = sys.argv[2]
prefix = accession[0:3]
middle = accession[3:6]
# Keep all remaining characters so accessions longer than nine characters are not truncated
suffix = accession[6:]

# NCBI SRA FTP site
ftp = FTP('ftp-trace.ncbi.nih.gov')

# Open the output file, log in, then change to the directory for this run.
# Retry the directory change a limited number of times (5 is an arbitrary cap)
# rather than looping forever when the accession does not exist.
sra = open(outfile, 'wb')
ftp.login('ftp')
connected = False
attempts = 0
while not connected and attempts < 5:
    attempts += 1
    try:
        ftp.cwd('/sra/sra-instant/reads/ByRun/sra/' +
                prefix + '/' +
                prefix + middle + '/' +
                prefix + middle + suffix + '/')
        connected = True
    except Exception:
        # Catch Exception rather than using a bare except so Ctrl-C still works
        pass
if not connected:
    sra.close()
    sys.exit('Could not change to the directory for ' + prefix + middle + suffix)

# Download the .sra file, then close the output file and the connection
ftp.retrbinary('RETR ' + prefix + middle + suffix + '.sra', sra.write)
sra.close()
ftp.quit()
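
The directory layout the script walks (three-letter prefix, first six characters, full accession) can be isolated in a small helper so it is testable without a network connection. This is only a sketch: the function name `sra_run_path`, the `root` parameter, and the format check are assumptions made for illustration and are not part of the original script.

```python
def sra_run_path(accession, root='/sra/sra-instant/reads/ByRun/sra'):
    """Return (directory, filename) for a run accession such as 'SRR000001'.

    Hypothetical helper: mirrors the path the script builds from
    prefix, prefix + middle, and the full accession.
    """
    # Assumed format check: three letters followed by at least six digits.
    if (len(accession) < 9 or not accession[:3].isalpha()
            or not accession[3:].isdigit()):
        raise ValueError('unexpected accession format: %r' % accession)
    prefix = accession[:3]        # e.g. 'SRR'
    group = accession[:6]         # e.g. 'SRR000'
    directory = '/'.join([root, prefix, group, accession]) + '/'
    return directory, accession + '.sra'
```

The returned directory could be passed to `ftp.cwd()` and the filename appended to `'RETR '` for `ftp.retrbinary()`, as the script already does inline.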