# HG changeset patch # User rlegendre # Date 1438076688 14400 # Node ID 0cec66d7f6379d885d64f9092a2fbaf9e6694e8b # Parent 5ed41b948030513f978b2d4d2b33fc7e8e3acc42 Uploaded diff -r 5ed41b948030 -r 0cec66d7f637 COPYING.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/COPYING.txt Tue Jul 28 05:44:48 2015 -0400 @@ -0,0 +1,674 @@ + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. + + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. + + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. + + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. + + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. + + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. + + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + Copyright (C) + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +. diff -r 5ed41b948030 -r 0cec66d7f637 README.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/README.txt Tue Jul 28 05:44:48 2015 -0400 @@ -0,0 +1,152 @@ +Authors' notes and preface +------------------------ +The script in this package was designed by Marc Descrimes. +The current implementation was written by Marc Descrimes and Yousra Ben Zouari. + +Software: Ving +Title: Representation and manipulation of RNA seq data. +Description: Tool for efficient visualization and analysis + of high-throughput sequencing data (a.k.a. NGS data). + +Version: beta 1.0 + +Prerequisites +------------- +The script runs with R(>= 3.0.2). +R packages GenomicRanges and Rsamtools should be installed. + +Samtools should be up and running in order to create bam file indexes. +Index files must be in the same directory as bam files, with the default +file name given by samtools. +Samtools are in the repositories of many distros. Go here for the freshest version: + + http://samtools.sourceforge.net/ + +If everything goes well you can now get started using real data! :) + +How to use the Ving.R script +---------------------------- + +To get more help, just run: Rscript ving.R + +Visualization requires at least a bam file and coordinates of a genome region to visualize: chromosome name, start and end. + Rscript ving.R [options] [ ...] +For example: + Rscript ving.R WT1.bam -c chr15 -S 210400 -E 224400 + +This produces an "output.png" image with the default parameters. + +You can also chmod +x ving.R and run ving like this: ./ving.R +Just check the shebang in the first line of ving.R, it has to match the PATH of your Rscript. + +Options +------- + +-o/--output + Sets file name for Ving output. Default is "output.png" + +-v/--typeVisu + Changes visualization type. Default is "classic". Can be "heatmap" or "lines". + +-t/--description-data + Describes each data set with a name. Default is the filename. + Example for multiple bam files : + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam -t "wild type rep1" WT2.bam -t "wild type rep2" + If two (or more) bam files have exactly the same name, ving will pool these bam files + to generate only one signal. + Example : + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam -t "wild type" WT2.bam -t "wild type" + +-n/--normalization-coefficient + Each bam file is multiplied by a coefficient. + By default no normalization is performed. + You have to specify as many coefficient as there are bam files. + Example for multiple bam files : + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam WT2.bam -n 0.5,1.5 + +-i/--inverseStrand + Use this if library type is fr-first-strand. + All bam files are treated the same way. + +-u/--unstranded + Use for an non-strand-specific visualization + Default is strand-specific. + +-l/--scale-log + Use this if you want a logarithmic scale. + Don't use this if you don't want a logarithmic scale. + +-y/--symetric-scale + This option is for a symetric scale with the same maximum and minimum + for both strand in the "classic" and "lines" visualization. + +-a/--annotation + Specifies an annotation file (GFF file). + Annotation features will be displayed on output image. + More than one file can be provided: + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam -a saccharomyces_cerevisiae.gff -a cryptics.gff + +-r/--typeTranscript + Use this option to select annotation features for visualization. + Separate multiple transcripts by commas. Example : + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam -a saccharomyces_cerevisiae.gff -r gene,XUT + +-C/--annotation-colors + Sets the different colors for annoatation features. Any color name accepted by R. + Hexadecimal color code is allowed, but do not use "#" character. + Use comma-separated list. Number of colors and number of features should be the same. + Example : + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam -a saccharomyces_cerevisiae.gff -a cryptics.gff -r gene,XUT,SUT -C blue,salmon,649B88 + +--annotation-color-by-strand + Use this if you want the transcripts to be colored according to the strand. + +-s/--annotation-shapes + Sets the shape used for annotation (Comma-separated list). + Four shapes are available: box,rectangle,arrow,line. Default is box. + Numbers of shapes and features should be the same. + For organisms with few or no introns (e.g. yeast), we suggest to use "box" for every type of feature. + Otherwise, we suggest to use "rectangle" for exons and "arrow" for introns. + +--classic-plus-color + Sets the color of plus strand for the classic visualization. + +--classic-minus-color + Sets the color of minus strand for the classic visualization. + Example : + Rscript ving.R -c chr15 -S 210400 -E 224400 WT1.bam WT2.bam -n 0.5,1.5 --classic-plus-color=darkred --classic-minus-color=gold + +--heatmap-max-color + Sets the maximum color of the palette for heatmap visualization. + +--heatmap-min-color + Sets the minimum color of the palette for heatmap visualization. + +--heatmap-palette-method + Sets the method of the palette for the heatmap visualization. + Two settings : + - hsv : varies the hues + - rgb : varies the shades + Example : + Rscript ving.R -v heatmap -c chr15 -S 210400 -E 224400 WT1.bam WT2.bam --heatmap-min-color=lavenderblush --heatmap-max-color=darksalmon + Rscript ving.R -v heatmap -c chr15 -S 210400 -E 224400 WT1.bam WT2.bam --heatmap-min-color=white --heatmap-max-color=black --heatmap-palette-method=rgb + +--lines-samples-colors + Sets the colors for lines visualization (comma-separated list). + You have to specify as many colors as there are bam files. + +--lines-samples-type-line + Sets the type of lines for lines visualization (comma-separated list). + You have to specify as many line types as there are bam files. + Example :axs + Rscript ving.R -v lines -c chr15 -S 210400 -E 224400 WT1.bam WT2.bam mut1.bam mut2.bam -l --lines-samples-colors=1,1,2,2 --lines-samples-type-line=1,2,1,2 + +-L/--smoothLength + Sets the length of the sliding window smoothing. The default is NA and allows ving to compute + an optimal length, according to the length of the asked genomic region. + If you don't want any smoothing, set this to zero. + +Examples : + Rscript ving.R -o classic.png -c chr15 -S 210400 -E 224400 WT1.bam -t "wild type rep1" WT2.bam -t "wild type rep2" mut1.bam -t "mutant rep1" mut2.bam -t "mutant rep2" -a saccharomyces_cerevisiae.gff -a cryptics.gff -r gene,XUT,SUT -C blue,salmon,649B88 + Rscript ving.R -v heatmap -o heatmap.png -c chr15 -S 210400 -E 224400 WT1.bam -t "wild type rep1" WT2.bam -t "wild type rep2" mut1.bam -t "mutant rep1" mut2.bam -t "mutant rep2" -l -a saccharomyces_cerevisiae.gff -a cryptics.gff -r gene,XUT,SUT -C blue,salmon,649B88 + Rscript ving.R -v lines -o lines.png -c chr15 -S 210400 -E 224400 WT1.bam -t "wild type rep1" WT2.bam -t "wild type rep2" mut1.bam -t "mutant rep1" mut2.bam -t "mutant rep2" -l -a saccharomyces_cerevisiae.gff -a cryptics.gff -r gene,XUT,SUT -C blue,salmon,649B88 --lines-samples-colors=black,black,green,green --lines-samples-type-line=1,2,1,2 diff -r 5ed41b948030 -r 0cec66d7f637 ving.R --- a/ving.R Tue Jul 29 09:47:47 2014 -0400 +++ b/ving.R Tue Jul 28 05:44:48 2015 -0400 @@ -1,211 +1,186 @@ #! /usr/bin/Rscript -RetreiveAndDestroy=function(opt,root,stem,regexp,SearchNames,Out,isValue,NbFound,StockNames){ - Bool=lapply(paste(root,SearchNames,stem,sep=""),grepl,opt) - names(Bool)=StockNames - Pos=lapply(Bool,which) - names(Pos)=StockNames - disable=c() - for (i in StockNames){ - nbmatch=length(Pos[[i]]) - if(nbmatch>0){ - NbFound[[i]]=NbFound[[i]]+nbmatch - disable=c(disable,-1*Pos[[i]]) - if(is.null(Out[[i]])){ - if(isValue[[i]]!=0){ - if(regexp=="next"){ - Out[[i]]=opt[Pos[[i]]+1] - disable=c(disable,-1*(Pos[[i]]+1)) - }else{ - Out[[i]]=sub(regexp,"\\1",opt[Pos[[i]]]) - } - }else{ - Out[[i]]=TRUE - } - }else{ - if(isValue[[i]]!=0){ - if(regexp=="next"){ - Out[[i]]=c(Out[[i]],opt[Pos[[i]]+1]) - disable=c(disable,-1*(Pos[[i]]+1)) - }else{ - Out[[i]]=c(Out[[i]],sub(regexp,"\\1",opt[Pos[[i]]])) - } - }else{ - Out[[i]]=c(Out[[i]],TRUE) - } - } - } +RetreiveAndDestroy=function(opt,root,stem,regexp,SearchNames,Out,isValue,NbFound,StockNames) { + Bool=lapply(paste(root,SearchNames,stem,sep=""),grepl,opt) + names(Bool)=StockNames + Pos=lapply(Bool,which) + names(Pos)=StockNames + disable=c() + for(i in StockNames) { + nbmatch=length(Pos[[i]]) + if(nbmatch>0) { + NbFound[[i]]=NbFound[[i]]+nbmatch + disable=c(disable,-1*Pos[[i]]) + if(is.null(Out[[i]])) { + if(isValue[[i]]!=0) { + if(regexp=="next") { + Out[[i]]=opt[Pos[[i]]+1] + disable=c(disable,-1*(Pos[[i]]+1)) + } + else { + Out[[i]]=sub(regexp,"\\1",opt[Pos[[i]]]) + } + } + else { + Out[[i]]=TRUE } - if(length(disable)>0){ - opt=opt[disable] + } + else { + if(isValue[[i]]!=0) { + if(regexp=="next") { + Out[[i]]=c(Out[[i]],opt[Pos[[i]]+1]) + disable=c(disable,-1*(Pos[[i]]+1)) + } + else { + Out[[i]]=c(Out[[i]],sub(regexp,"\\1",opt[Pos[[i]]])) + } } - Out[["ARGUMENT"]]=list() - Out[["ARGUMENT"]][["opt"]]=opt - Out[["ARGUMENT"]][["NbFound"]]=NbFound - return(Out) + else { + Out[[i]]=c(Out[[i]],TRUE) + } + } + } + } + if(length(disable)>0) { + opt=opt[disable] + } + Out[["ARGUMENT"]]=list() + Out[["ARGUMENT"]][["opt"]]=opt + Out[["ARGUMENT"]][["NbFound"]]=NbFound + return(Out) } -getopt = function (spec=NULL,opt=commandArgs()) { - - FindArgs=which(opt=="--args") - if(length(FindArgs)!=1){ - stop(length(FindArgs)," --args found where 1 expected.",call.=F) - } - ExecName=sub("--file=","",opt[FindArgs-1]) - - if(FindArgsmin.columns){ - colNames=c(colNames,"Description") - } - - if(is.null(spec) | !is.matrix(spec) | (DimSpec[2]max.columns)){ - stop('argument "spec" is required and must be a matrix with 4|5 columns.',call.=F) - } - colnames(spec)=colNames - - spec=as.data.frame(spec,stringsAsFactors=F) - #spec validation - if(length(unique(c(spec$ShortName,"ARGUMENT","args")))!=DimSpec[1]+2 | length(unique(spec$LongName))!=DimSpec[1]){ - stop('Long|Short names for flags must be unique (Long name : "ARGUMENT" and "args" forbidden).', - "\n","List of duplicated :", - "\n","Short: ",paste(spec$ShortName[duplicated(c(spec$ShortName,"ARGUMENT","args"))],collapse=" "), - "\n","Long: ",paste(spec$ShortName[duplicated(spec$LongName)],collapse=" "),call.=F) - } - if(length(which(nchar(spec$ShortName)>1))!=0){ - stop('Short names flags can\'t be longer than 1 character.') - } - - - #initialize - Out=list() - Short2Long=list() - NbFound=list() - isValue=list() - for (i in 1:DimSpec[1]){ - Short2Long[[spec$ShortName[i]]]=spec$LongName[i] - NbFound[[spec$LongName[i]]]=0 - isValue[[spec$LongName[i]]]=spec$Flag[i] - } - - #Map, retreive and suppress ARGUMENTs and arguments - #Value ARGUMENT --example=value - Out=RetreiveAndDestroy(opt,"^--","=.+$",".+=(.+)$",spec$LongName,Out,isValue,NbFound,spec$LongName) - opt=Out[["ARGUMENT"]][["opt"]] - NbFound=Out[["ARGUMENT"]][["NbFound"]] - Out[["ARGUMENT"]]=NULL - #boolean ARGUMENT --example - Out=RetreiveAndDestroy(opt,"^--","$","$",spec$LongName,Out,isValue,NbFound,spec$LongName) - opt=Out[["ARGUMENT"]][["opt"]] - NbFound=Out[["ARGUMENT"]][["NbFound"]] - Out[["ARGUMENT"]]=NULL - #short name ARGUMENT -t value OR boolean -t - Out=RetreiveAndDestroy(opt,"^-","$","next",spec$ShortName,Out,isValue,NbFound,spec$LongName) - opt=Out[["ARGUMENT"]][["opt"]] - NbFound=Out[["ARGUMENT"]][["NbFound"]] - Out[["ARGUMENT"]]=NULL - #Warn about non mapped ARGUMENTs - if(length(opt)>0){ - PosUnkArg=which(grepl("^-",opt)) - if(length(PosUnkArg)){ - message("Error, argument unreconized :","\n",paste(opt[PosUnkArg],collapse="\n"),"\n\n") - } - if(length(PosUnkArg)>0){ - opt=opt[PosUnkArg*-1] - } - } - #Arguments - Out[["ARGUMENT"]]=opt - - #Validation of ARGUMENTs - for(i in 1:DimSpec[1]){ - if(spec$Flag[i]=="0"){#verify boolean arguments - NbValue=length(Out[[spec$LongName[i]]]) - if(NbValue>1){ - message("Warning : ",spec$LongName[i]," found ",NbValue," times") - } - } - if(length(Out[[spec$LongName[i]]])==0){ - Out[[spec$LongName[i]]]=spec$Default[i] - } - library("methods") - Out[[spec$LongName[i]]]=as(Out[[spec$LongName[i]]],spec$Mod[i]) - } - - return(Out) +getopt=function (spec=NULL,opt=commandArgs()) { + FindArgs=which(opt=="--args") + if(length(FindArgs)!=1) { + stop(length(FindArgs)," --args found where 1 expected.",call.=F) + } + ExecName=sub("--file=","",opt[FindArgs-1]) + + if(FindArgsmin.columns) { + colNames=c(colNames,"Description") + } + + if(is.null(spec) | !is.matrix(spec) | (DimSpec[2]max.columns)) { + stop('argument "spec" is required and must be a matrix with 4|5 columns.',call.=F) + } + colnames(spec)=colNames + + spec=as.data.frame(spec,stringsAsFactors=F) + #spec validation + if(length(unique(c(spec$ShortName,"ARGUMENT","args")))!=DimSpec[1]+2 | length(unique(spec$LongName))!=DimSpec[1]) { + stop('Long|Short names for flags must be unique (Long name : "ARGUMENT" and "args" forbidden).', + "\n","List of duplicated :", + "\n","Short: ",paste(spec$ShortName[duplicated(c(spec$ShortName,"ARGUMENT","args"))],collapse=" "), + "\n","Long: ",paste(spec$ShortName[duplicated(spec$LongName)],collapse=" "),call.=F) + } + if(length(which(nchar(spec$ShortName)>1))!=0) { + stop('Short names flags can\'t be longer than 1 character.') + } + + #initialize + Out=list() + Short2Long=list() + NbFound=list() + isValue=list() + for(i in 1:DimSpec[1]) { + Short2Long[[spec$ShortName[i]]]=spec$LongName[i] + NbFound[[spec$LongName[i]]]=0 + isValue[[spec$LongName[i]]]=spec$Flag[i] + } + + #Map, retreive and suppress ARGUMENTs and arguments + #Value ARGUMENT --example=value + Out=RetreiveAndDestroy(opt,"^--","=.+$",".+=(.+)$",spec$LongName,Out,isValue,NbFound,spec$LongName) + opt=Out[["ARGUMENT"]][["opt"]] + NbFound=Out[["ARGUMENT"]][["NbFound"]] + Out[["ARGUMENT"]]=NULL + #boolean ARGUMENT --example + Out=RetreiveAndDestroy(opt,"^--","$","$",spec$LongName,Out,isValue,NbFound,spec$LongName) + opt=Out[["ARGUMENT"]][["opt"]] + NbFound=Out[["ARGUMENT"]][["NbFound"]] + Out[["ARGUMENT"]]=NULL + #short name ARGUMENT -t value OR boolean -t + Out=RetreiveAndDestroy(opt,"^-","$","next",spec$ShortName,Out,isValue,NbFound,spec$LongName) + opt=Out[["ARGUMENT"]][["opt"]] + NbFound=Out[["ARGUMENT"]][["NbFound"]] + Out[["ARGUMENT"]]=NULL + #Warn about non mapped ARGUMENTs + if(length(opt)>0) { + PosUnkArg=which(grepl("^-",opt)) + if(length(PosUnkArg)) { + message("Error, argument unreconized :","\n",paste(opt[PosUnkArg],collapse="\n"),"\n\n") + } + if(length(PosUnkArg)>0) { + opt=opt[PosUnkArg*-1] + } + } + #Arguments + Out[["ARGUMENT"]]=opt + + #Validation of ARGUMENTs + for(i in 1:DimSpec[1]) { + if(spec$Flag[i]=="0") {#verify boolean arguments + NbValue=length(Out[[spec$LongName[i]]]) + if(NbValue>1) { + message("Warning : ",spec$LongName[i]," found ",NbValue," times") + } + } + if(length(Out[[spec$LongName[i]]])==0) { + Out[[spec$LongName[i]]]=spec$Default[i] + } + library("methods") + Out[[spec$LongName[i]]]=as(Out[[spec$LongName[i]]],spec$Mod[i]) + } + + return(Out) } -### ========================================================================= -### Ving's methods -### ------------------------------------------------------------------------- -### - -##Converts the flag numbers into binary -integer.base.b = function(x, b=2){ - xi <- as.integer(x) +## Converts the flag numbers into binary +integer.base.b=function(x, b=2) { + xi=as.integer(x) if(any(is.na(xi) | ((x-xi)!=0))) print(list(ERROR="x not integer", x=x)) - N <- length(x) - xMax <- max(c(x,1)) - ndigits <- 11 - Base.b <- array(NA, dim=c(N, ndigits)) - for(i in 1:ndigits){#i <- 1 - Base.b[, ndigits-i+1] <- (x %% b) - x <- (x %/% b) + N=length(x) + xMax=max(c(x,1)) + ndigits=11 + Base.b=array(NA, dim=c(N, ndigits)) + for(i in 1:ndigits) {#i=1 + Base.b[, ndigits-i+1]=(x %% b) + x=(x %/% b) } Base.b } -##Checks if color used is an acceptable color -is.acceptable.color = function(character) { +## Checks if color used is an acceptable color +is.acceptable.color=function(character) { tmp=try(col2rgb(character),TRUE) return(class(tmp)!="try-error") } -##Combines two data frame -listOfDataFrame2DataFrame = function(x,vertical=TRUE) { - if(length(x)==1) { - x[[1]] - } - else { - if(vertical) { - ncol = dim(x[[1]])[2] - vect=unlist(lapply(x,t)) - retour = as.data.frame(t(array(vect,dim=c(ncol,length(vect)/ncol))),stringsAsFactors=FALSE) - names(retour)=names(x[[1]]) - } - else { - nline=dim(x[[1]])[1] - vect=unlist(x) - retour=as.data.frame(array(vect,dim=c(nline,length(vect)/nline))) - names(retour) = unlist(lapply(x,names)) - } - retour - } -} - -# utilise la fonction scanBam d'une façon plus partique pour moi -extractFromBam = function(file,which,what) { +extractFromBam=function(file,which,what) { return(scanBam(file, param=ScanBamParam(which=which,what=what))[[1]][[1]]) } -##Returns a list from cigar expression -interpreteCIGAR = function(cigar) { - cigar_un = strsplit(unique(cigar),split="") - n_cigar_un = length(cigar_un) - taille_cigar = list() - analise_cigar = function(cigar_) { - cigar_sortie = list() - acc = "" +## Returns a list from cigar expression +interpreteCIGAR=function(cigar) { + cigar_un=strsplit(unique(cigar),split="") + n_cigar_un=length(cigar_un) + taille_cigar=list() + analise_cigar=function(cigar_) { + cigar_sortie=list() + acc="" for(j in 1:length(cigar_)) { if(sum(cigar_[j]==as.character(0:9))==1) { acc=paste(acc,cigar_[j],sep="") @@ -218,15 +193,15 @@ } return(cigar_sortie) } - cigar_interprete = lapply(cigar_un,analise_cigar) - names(cigar_interprete) = unique(cigar) + cigar_interprete=lapply(cigar_un,analise_cigar) + names(cigar_interprete)=unique(cigar) return(cigar_interprete) } # prend un CIGAR splités et retourne la taille occupé par le read sur la séquence génomique (introns compris) -calcule_longueur_cigar = function(cigar_) { - lon = 0 +calcule_longueur_cigar=function(cigar_) { + lon=0 N=length(cigar_) for(j in seq(2,N,2)) { if(cigar_[[j]]!="I") { @@ -237,9 +212,9 @@ } # prend un CIGAR splités et retourne les positions -calcule_junction_cigar = function(cigar_) { +calcule_junction_cigar=function(cigar_) { retour=list() - lon = 0 + lon=0 N=length(cigar_) for(j in seq(2,N,2)) { if(cigar_[[j]]!="I") { @@ -252,8 +227,8 @@ return(retour) } -##Returns a list of numbers of single read with their coordinates -compresse_coordonnees = function(debut,fin) { +## Returns a list of numbers of single read with their coordinates +compresse_coordonnees=function(debut,fin) { if(length(debut)==0) { return(list(numeric(),numeric(),numeric())) } @@ -262,57 +237,57 @@ tmp_rle=rle(tmp) poids=tmp_rle$lengths values_split=strsplit(tmp_rle$values,split="_") - doit = function(j) { + doit=function(j) { return(as.integer(values_split[[j]][1])) } debut_uni=sapply(1:length(poids),doit) - doit = function(j) { + doit=function(j) { return(as.integer(values_split[[j]][2])) } fin_uni=sapply(1:length(poids),doit) - ordre_debut = order(debut_uni) + ordre_debut=order(debut_uni) return(list(debut_uni[ordre_debut],fin_uni[ordre_debut],poids[ordre_debut])) } } -RDataFileName = function(file) { +RDataFileName=function(file) { return(paste(file,".RData",sep="")) } -##Function converts and extracts the infos from bamfile -readBam_ <-function(file_,insert_max_=2000,stranded_=TRUE,ncore_=1,libraryType_=c("standard","inverse"),fileNameRData_=NA,normalized_=NULL,chrName_=NULL,from_=1,to_=NULL) { +## Function converts and extracts the infos from bamfile +readBam_=function(file_,insert_max_=2000,stranded_=TRUE,ncore_=1,libraryType_=c("standard","inverse"),fileNameRData_=NA,normalized_=NULL,chrName_=NULL,from_=1,to_=NULL) { suppressPackageStartupMessages(require("Rsamtools")) suppressPackageStartupMessages(require("GenomicRanges")) -##Declaration of variables - flagstat = numeric(11) +## Declaration of variables + flagstat=numeric(11) names(flagstat)=c("total","duplicates","mapped","paired","read1","read2","properly paired","itself and mate mapped","singletons","mate mapped on a different chr","QC-failed") genome_info=scanBamHeader(file_)[[1]]$targets - noms_chromosomes = names(genome_info) - longueur_chromosomes = as.integer(genome_info) - nombre_chromosomes = length(noms_chromosomes) - brin_F = list() - brin_R = list() - brin_F_junction = list() - brin_R_junction = list() - pas<-c(1,2,6,7) - i_zone = 0 + noms_chromosomes=names(genome_info) + longueur_chromosomes=as.integer(genome_info) + nombre_chromosomes=length(noms_chromosomes) + brin_F=list() + brin_R=list() + brin_F_junction=list() + brin_R_junction=list() + pas=c(1,2,6,7) + i_zone=0 if(is.null(chrName_)) { - chrName__ = noms_chromosomes + chrName__=noms_chromosomes } else { - chrName__ = chrName_ + chrName__=chrName_ } -##Fragments identification +## Fragments identification for(i in (1:nombre_chromosomes)) { - i_zone = i_zone +1 - nom_chromo = noms_chromosomes[i] - lon_chromo = longueur_chromosomes[i] + i_zone=i_zone +1 + nom_chromo=noms_chromosomes[i] + lon_chromo=longueur_chromosomes[i] if(!(nom_chromo %in% chrName__)) { - brin_F[[i]] = list(numeric(),numeric(),numeric()) - brin_R[[i]] = list(numeric(),numeric(),numeric()) - brin_F_junction[[i]] = list(numeric(),numeric(),numeric()) - brin_R_junction[[i]] = list(numeric(),numeric(),numeric()) + brin_F[[i]]=list(numeric(),numeric(),numeric()) + brin_R[[i]]=list(numeric(),numeric(),numeric()) + brin_F_junction[[i]]=list(numeric(),numeric(),numeric()) + brin_R_junction[[i]]=list(numeric(),numeric(),numeric()) } else { if(is.null(to_)) { @@ -323,19 +298,19 @@ } from_i=from_[min(i_zone,length(from_))] - commande = paste("RangesList(`",nom_chromo,"` = IRanges(",from_i,",",to_i,"))",sep="") + commande=paste("RangesList(`",nom_chromo,"`=IRanges(",from_i,",",to_i,"))",sep="") expr=try(parse(text=commande),TRUE) -##Function used from GenomicRanges package - which = eval(expr) - what<-c("flag","mpos","cigar","mrnm","isize") - param<- ScanBamParam(what=what, which=which) -##Case of no reads on the chromosome +## Function used from GenomicRanges package + which=eval(expr) + what=c("flag","mpos","cigar","mrnm","isize") + param=ScanBamParam(what=what, which=which) +## Case of no reads on the chromosome start=extractFromBam(file=file_,which=which,what="pos") - if(length(start) == 0 ){ - brin_F[[i]] = list(numeric(),numeric(),numeric()) - brin_R[[i]] = list(numeric(),numeric(),numeric()) - brin_F_junction[[i]] = list(numeric(),numeric(),numeric()) - brin_R_junction[[i]] = list(numeric(),numeric(),numeric()) + if(length(start)==0 ) { + brin_F[[i]]=list(numeric(),numeric(),numeric()) + brin_R[[i]]=list(numeric(),numeric(),numeric()) + brin_F_junction[[i]]=list(numeric(),numeric(),numeric()) + brin_R_junction[[i]]=list(numeric(),numeric(),numeric()) } else { strand=extractFromBam(file=file_,which=which,what="strand") @@ -344,235 +319,156 @@ cigar=extractFromBam(file=file_,which=which,what="cigar") mrnm=extractFromBam(file=file_,which=which,what="mrnm") isize=extractFromBam(file=file_,which=which,what="isize") - + first_read=integer.base.b(flag)[,5]==1 - strand[strand =="+" & !first_read ]="*" - strand[strand =="-" & !first_read ] ="+" - strand[strand =="*" & !first_read ] ="-" + strand[strand=="+" & !first_read ]="*" + strand[strand=="-" & !first_read ]="+" + strand[strand=="*" & !first_read ]="-" -##CIGAR's interpreter - cigar_interprete = interpreteCIGAR(cigar) - longueur_cigar = lapply(cigar_interprete,calcule_longueur_cigar) - junction_cigar = lapply(cigar_interprete,calcule_junction_cigar) - debut_junction = numeric() - fin_junction = numeric() - brin_junction = numeric() - i_junction = 0 +## CIGAR's interpreter + cigar_interprete=interpreteCIGAR(cigar) + longueur_cigar=lapply(cigar_interprete,calcule_longueur_cigar) + junction_cigar=lapply(cigar_interprete,calcule_junction_cigar) end=start+sapply(1:length(cigar),function(j) longueur_cigar[[cigar[j]]]) -##Case of pairend reads - is_on_same_chr = mrnm==nom_chromo - is_on_same_chr[is.na(is_on_same_chr)] = FALSE - is_paired = is_on_same_chr & abs(isize) <= insert_max_ - is_paired[first_read & strand=="+" & (isize<0 | isize>insert_max_)] = FALSE - is_paired[!first_read & strand=="+" & (isize>0 | isize < -insert_max_)] = FALSE - is_paired[first_read & strand=="-" & (isize>0 | isize < -insert_max_)] = FALSE - is_paired[!first_read & strand=="-" & (isize<0 | isize>insert_max_)] = FALSE - is_paired[is.na(is_paired)] = FALSE - - debut_fragment_paired_plus<-mpos[!first_read & strand =="+" & is_paired] - fin_fragment_paired_plus<-end[!first_read & strand=="+" & is_paired] - debut_fragment_paired_moins<-mpos[first_read & strand=="-" & is_paired] - fin_fragment_paired_moins<-end[first_read & strand =="-" & is_paired] +## Case of pairend reads + is_on_same_chr=mrnm==nom_chromo + is_on_same_chr[is.na(is_on_same_chr)]=FALSE + is_paired=is_on_same_chr & abs(isize) <=insert_max_ + is_paired[first_read & strand=="+" & (isize<0 | isize>insert_max_)]=FALSE + is_paired[!first_read & strand=="+" & (isize>0 | isize < -insert_max_)]=FALSE + is_paired[first_read & strand=="-" & (isize>0 | isize < -insert_max_)]=FALSE + is_paired[!first_read & strand=="-" & (isize<0 | isize>insert_max_)]=FALSE + is_paired[is.na(is_paired)]=FALSE + + debut_fragment_paired_plus=mpos[!first_read & strand=="+" & is_paired] + fin_fragment_paired_plus=end[!first_read & strand=="+" & is_paired] + debut_fragment_paired_moins=mpos[first_read & strand=="-" & is_paired] + fin_fragment_paired_moins=end[first_read & strand=="-" & is_paired] + +## Case of single reads + debut_fragment_singleton_plus=start[!is_paired & strand=="+"] + fin_fragment_singleton_plus=end[!is_paired & strand=="+"] + debut_fragment_singleton_moins=start[!is_paired & strand=="-"] + fin_fragment_singleton_moins=end[!is_paired & strand=="-"] -##Case of single reads - debut_fragment_singleton_plus = start[!is_paired & strand =="+"] - fin_fragment_singleton_plus = end[!is_paired & strand =="+"] - debut_fragment_singleton_moins=start[!is_paired & strand =="-"] - fin_fragment_singleton_moins= end[!is_paired & strand =="-"] - -##Fragments - debut_frag_plus = c(debut_fragment_paired_plus,debut_fragment_singleton_plus) - fin_frag_plus = c(fin_fragment_paired_plus,fin_fragment_singleton_plus) +## Fragments + debut_frag_plus=c(debut_fragment_paired_plus,debut_fragment_singleton_plus) + fin_frag_plus=c(fin_fragment_paired_plus,fin_fragment_singleton_plus) debut_frag_moins=c(debut_fragment_paired_moins,debut_fragment_singleton_moins) - fin_frag_moins = c(fin_fragment_paired_moins,fin_fragment_singleton_moins) - brin_F[[i]] = compresse_coordonnees(debut_frag_plus,fin_frag_plus) - brin_R[[i]] = compresse_coordonnees(debut_frag_moins,fin_frag_moins) + fin_frag_moins=c(fin_fragment_paired_moins,fin_fragment_singleton_moins) + brin_F[[i]]=compresse_coordonnees(debut_frag_plus,fin_frag_plus) + brin_R[[i]]=compresse_coordonnees(debut_frag_moins,fin_frag_moins) -##junction read +## Junction read + debut_junction=numeric() + fin_junction=numeric() + brin_junction=numeric() + i_junction=0 for(j in 1:length(cigar)) { - junctions_ = junction_cigar[[cigar[j]]] - if(length(junctions_)){ + junctions_=junction_cigar[[cigar[j]]] + if(length(junctions_)) { for(k in 1:length(junctions_)) { - i_junction = i_junction + 1 - debut_junction[i_junction] = start[j] + junctions_[[k]][1] - 1 - fin_junction[i_junction] = start[j] + junctions_[[k]][2] - 1 - brin_junction[i_junction] = as.character(strand[j]) + i_junction=i_junction + 1 + debut_junction[i_junction]=start[j] + junctions_[[k]][1] - 1 + fin_junction[i_junction]=start[j] + junctions_[[k]][2] - 1 + brin_junction[i_junction]=as.character(strand[j]) } } } if(i_junction==0) { - brin_F_junction[[i]] = list(numeric(),numeric(),numeric()) - brin_R_junction[[i]] = list(numeric(),numeric(),numeric()) + brin_F_junction[[i]]=list(numeric(),numeric(),numeric()) + brin_R_junction[[i]]=list(numeric(),numeric(),numeric()) } else { - brin_F_junction[[i]] = compresse_coordonnees(debut_junction[brin_junction=="+"],fin_junction[brin_junction=="+"]) - brin_R_junction[[i]] = compresse_coordonnees(debut_junction[brin_junction=="-"],fin_junction[brin_junction=="-"]) + brin_F_junction[[i]]=compresse_coordonnees(debut_junction[brin_junction=="+"],fin_junction[brin_junction=="+"]) + brin_R_junction[[i]]=compresse_coordonnees(debut_junction[brin_junction=="-"],fin_junction[brin_junction=="-"]) } -##Flagstat interpreter - flag_bits = integer.base.b(flag)#remplie les données stat pour un flag donné + +## Flagstat interpreter + flag_bits=integer.base.b(flag)#remplie les données stat pour un flag donné - ##flagstat - ##total - flagstat[1] = flagstat[1] + sum(flag_bits[,2]==0) - ##duplicates - flagstat[2] = flagstat[2] + sum((flag_bits[,1]==1)&(flag_bits[,2]==0)) - ##mapped - flagstat[3] = flagstat[3] + sum((flag_bits[,9]==0)&(flag_bits[,2]==0)) - ##paired - flagstat[4] = flagstat[4] + sum((flag_bits[,11]==1)&(flag_bits[,2]==0)) - ##read1 - flagstat[5] = flagstat[5] + sum((flag_bits[,5]==1)&(flag_bits[,2]==0)) - ##read2 - flagstat[6] = flagstat[6] + sum((flag_bits[,4]==1)&(flag_bits[,2]==0)) - ##iself and mate mapped - flagstat[8] = flagstat[8] + sum((flag_bits[,11]==1)&(flag_bits[,9]==0)&(flag_bits[,8]==0)&(flag_bits[,2]==0)) - ##singletons - flagstat[9] = flagstat[9] + sum((flag_bits[,8]==1)&(flag_bits[,2]==0)) - ##QC-failed - flagstat[11] = flagstat[11] + sum(flag_bits[,2]==1) - ##flagstat - ##mate on a different chr - flagstat[10] = flagstat[10] + sum((!is_on_same_chr)&(flag_bits[,11]==1)&(flag_bits[,9]==0)&(flag_bits[,8]==0)&(flag_bits[,2]==0)) - ##flagstat - ##properly paired - flagstat[7] = flagstat[7] + sum(is_paired) - + ## flagstat + ## total + flagstat[1]=flagstat[1] + sum(flag_bits[,2]==0) + ## duplicates + flagstat[2]=flagstat[2] + sum((flag_bits[,1]==1)&(flag_bits[,2]==0)) + ## mapped + flagstat[3]=flagstat[3] + sum((flag_bits[,9]==0)&(flag_bits[,2]==0)) + ## paired + flagstat[4]=flagstat[4] + sum((flag_bits[,11]==1)&(flag_bits[,2]==0)) + ## read1 + flagstat[5]=flagstat[5] + sum((flag_bits[,5]==1)&(flag_bits[,2]==0)) + ## read2 + flagstat[6]=flagstat[6] + sum((flag_bits[,4]==1)&(flag_bits[,2]==0)) + ## iself and mate mapped + flagstat[8]=flagstat[8] + sum((flag_bits[,11]==1)&(flag_bits[,9]==0)&(flag_bits[,8]==0)&(flag_bits[,2]==0)) + ## singletons + flagstat[9]=flagstat[9] + sum((flag_bits[,8]==1)&(flag_bits[,2]==0)) + ## QC-failed + flagstat[11]=flagstat[11] + sum(flag_bits[,2]==1) + ## flagstat + ## mate on a different chr + flagstat[10]=flagstat[10] + sum((!is_on_same_chr)&(flag_bits[,11]==1)&(flag_bits[,9]==0)&(flag_bits[,8]==0)&(flag_bits[,2]==0)) + ## flagstat + ## properly paired + flagstat[7]=flagstat[7] + sum(is_paired) } } } -##Data storing - names(brin_F) = noms_chromosomes - names(brin_R) = noms_chromosomes - names(brin_F_junction) = noms_chromosomes - names(brin_R_junction) = noms_chromosomes + +## Data storing + names(brin_F)=noms_chromosomes + names(brin_R)=noms_chromosomes + names(brin_F_junction)=noms_chromosomes + names(brin_R_junction)=noms_chromosomes - bamHandler = list() + bamHandler=list() if(libraryType_[1]=="inverse") { - bamHandler[[1]] = brin_R - bamHandler[[2]] = brin_F + bamHandler[[1]]=brin_R + bamHandler[[2]]=brin_F } else { - bamHandler[[1]] = brin_F - bamHandler[[2]] = brin_R + bamHandler[[1]]=brin_F + bamHandler[[2]]=brin_R } - bamHandler[[3]] = longueur_chromosomes - bamHandler[[4]] = flagstat - bamHandler[[5]] = stranded_ + bamHandler[[3]]=longueur_chromosomes + bamHandler[[4]]=flagstat + bamHandler[[5]]=stranded_ if(libraryType_[1]=="inverse") { - bamHandler[[6]] = brin_R_junction - bamHandler[[7]] = brin_F_junction + bamHandler[[6]]=brin_R_junction + bamHandler[[7]]=brin_F_junction } else { - bamHandler[[6]] = brin_F_junction - bamHandler[[7]] = brin_R_junction + bamHandler[[6]]=brin_F_junction + bamHandler[[7]]=brin_R_junction } - bamHandler[[8]] = FALSE -if(!is.null(normalized_)) { + bamHandler[[8]]=FALSE + if(!is.null(normalized_)) { for( i in pas) { - for(j in 1:nombre_chromosomes){ - bamHandler[[i]][[j]][[4]]=normalized_*bamHandler[[i]][[j]][[3]] - } + for(j in 1:nombre_chromosomes) { + bamHandler[[i]][[j]][[4]]=normalized_*bamHandler[[i]][[j]][[3]] + } } - bamHandler[[8]] = TRUE -} + bamHandler[[8]]=TRUE + } - names(bamHandler) = c("F","R","chrLength","flagstat","stranded","junctions_F","junctions_R","norm") + names(bamHandler)=c("F","R","chrLength","flagstat","stranded","junctions_F","junctions_R","norm") if((is.null(chrName_))&(from_==1)&(is.null(to_))) { if(is.null(fileNameRData_)|is.na(fileNameRData_)) { - save(bamHandler,file=RDataFileName(file_)) + save(bamHandler,file=RDataFileName(file_)) } else { - save(bamHandler,file=fileNameRData_) + save(bamHandler,file=fileNameRData_) } } return(bamHandler) } -chercheBamHandlerDansFichierRData = function(fichierRData) { - tmp=try(load(fichierRData),TRUE) - if(class(tmp)=="try-error") { - return(tmp) - } - else { - return(try(bamHandler,TRUE)) - } -} - -##Public Function -readBam = function(file,insert_max=2000,stranded=TRUE,reload=FALSE,ncore=1,libraryType=c("standard","inverse"),normalized=NULL,chrName=NULL,from=1,to=NULL) { - if(length(file)==1) { - nom_fichier_RData = RDataFileName(file) - if(!file.exists(nom_fichier_RData)|reload) { - return(try(readBam_(file_=file,insert_max_=insert_max,stranded_=stranded,ncore_=ncore,libraryType_=libraryType,normalized_=normalized,chrName_=chrName,from_=from,to_=to),TRUE)) - } - else { - load(nom_fichier_RData) - return(bamHandler) - } - } - else { - require("multicore") - chargeBamsDansFichiersRData = function(oneFile) { - nom_fichier_RData = RDataFileName(oneFile) - if(!file.exists(nom_fichier_RData)|reload) { - bamHandler=try(readBam_(oneFile,insert_max_=insert_max,stranded_=stranded,libraryType_=libraryType,normalized_=normalized,chrName_=chrName,from_=from,to_=to),TRUE) - } - return(nom_fichier_RData) - } - fichiersRData = mclapply(file,chargeBamsDansFichiersRData,mc.cores=ncore) - gc() - return(lapply(fichiersRData,chercheBamHandlerDansFichierRData)) - } -} - -## Returns an adjusted bamHandler object -ajustBam = function(bamHandler,coeff=1) { - if(class(bamHandler)=="try-error") { - return(bamHandler) - } - else { - bamHandler_ = bamHandler - L = length(bamHandler_$F) - for(i in 1:L) { - bamHandler_$F[[i]][[4]]=bamHandler_$F[[i]][[3]]*coeff - bamHandler_$R[[i]][[4]]=bamHandler_$R[[i]][[3]]*coeff - bamHandler_$junctions_F[[i]][[4]]=bamHandler_$junctions_F[[i]][[3]]*coeff - bamHandler_$junctions_R[[i]][[4]]=bamHandler_$junctions_R[[i]][[3]]*coeff - } - bamHandler_$norm=TRUE - return(bamHandler_) - } - } - -##Returns an ajusted bamHandler -normalizeBams = function(data,poids,which=1:length(poids),relative=TRUE) { - N=length(data) - j=0 - retour = list() - for(i in 1:N) { - - if(i %in% which) { - j = j + 1 - if(relative) { - coeff=min(poids)/poids[j] - } - else { - coeff=poids[j] - } - retour[[i]] = ajustBam(data[[i]],coeff) - } - else { - retour[[i]] = data[[i]] - } - } - return(retour) -} - -##Returns the sum of two bamHandler objects -addBam = function(bamHandler1,bamHandler2) { +## Returns the sum of two bamHandler objects +addBam=function(bamHandler1,bamHandler2) { if(class(bamHandler1)=="try-error"|class(bamHandler2)=="try-error") { if(class(bamHandler1)=="try-error") { return(bamHandler2) @@ -582,121 +478,99 @@ } } else { - brin_F = list() - junctions_brin_F = list() - brin_R = list() - junctions_brin_R = list() - L = length(bamHandler1$F) - bamHandler = list() - if(bamHandler1$norm != bamHandler2$norm ){ + brin_F=list() + junctions_brin_F=list() + brin_R=list() + junctions_brin_R=list() + L=length(bamHandler1$F) + bamHandler=list() + if(bamHandler1$norm !=bamHandler2$norm ) { warning(expr="Two different bam files(normalized and non normalized)!!",immediate.=TRUE) } for(i in 1:L) { - brin_F[[i]] = list(c(bamHandler1$F[[i]][[1]],bamHandler2$F[[i]][[1]]),c(bamHandler1$F[[i]][[2]],bamHandler2$F[[i]][[2]]),c(bamHandler1$F[[i]][[3]],bamHandler2$F[[i]][[3]])) - junctions_brin_F[[i]] = list(c(bamHandler1$junctions_F[[i]][[1]],bamHandler2$junctions_F[[i]][[1]]),c(bamHandler1$junctions_F[[i]][[2]],bamHandler2$junctions_F[[i]][[2]]),c(bamHandler1$junctions_F[[i]][[3]],bamHandler2$junctions_F[[i]][[3]])) - brin_R[[i]] = list(c(bamHandler1$R[[i]][[1]],bamHandler2$R[[i]][[1]]),c(bamHandler1$R[[i]][[2]],bamHandler2$R[[i]][[2]]),c(bamHandler1$R[[i]][[3]],bamHandler2$R[[i]][[3]])) - junctions_brin_R[[i]] = list(c(bamHandler1$junctions_R[[i]][[1]],bamHandler2$junctions_R[[i]][[1]]),c(bamHandler1$junctions_R[[i]][[2]],bamHandler2$junctions_R[[i]][[2]]),c(bamHandler1$junctions_R[[i]][[3]],bamHandler2$junctions_R[[i]][[3]])) - if(bamHandler1$norm & bamHandler2$norm){ - brin_F[[i]][[4]] = c(bamHandler1$F[[i]][[4]],bamHandler2$F[[i]][[4]]) - junctions_brin_F[[i]][[4]] = c(bamHandler1$junctions_F[[i]][[4]],bamHandler2$junctions_F[[i]][[4]]) - brin_R[[i]][[4]] = c(bamHandler1$R[[i]][[4]],bamHandler2$R[[i]][[4]]) - junctions_brin_R[[i]][[4]] = c(bamHandler1$junctions_R[[i]][[4]],bamHandler2$junctions_R[[i]][[4]]) + brin_F[[i]]=list(c(bamHandler1$F[[i]][[1]],bamHandler2$F[[i]][[1]]),c(bamHandler1$F[[i]][[2]],bamHandler2$F[[i]][[2]]),c(bamHandler1$F[[i]][[3]],bamHandler2$F[[i]][[3]])) + junctions_brin_F[[i]]=list(c(bamHandler1$junctions_F[[i]][[1]],bamHandler2$junctions_F[[i]][[1]]),c(bamHandler1$junctions_F[[i]][[2]],bamHandler2$junctions_F[[i]][[2]]),c(bamHandler1$junctions_F[[i]][[3]],bamHandler2$junctions_F[[i]][[3]])) + brin_R[[i]]=list(c(bamHandler1$R[[i]][[1]],bamHandler2$R[[i]][[1]]),c(bamHandler1$R[[i]][[2]],bamHandler2$R[[i]][[2]]),c(bamHandler1$R[[i]][[3]],bamHandler2$R[[i]][[3]])) + junctions_brin_R[[i]]=list(c(bamHandler1$junctions_R[[i]][[1]],bamHandler2$junctions_R[[i]][[1]]),c(bamHandler1$junctions_R[[i]][[2]],bamHandler2$junctions_R[[i]][[2]]),c(bamHandler1$junctions_R[[i]][[3]],bamHandler2$junctions_R[[i]][[3]])) + if(bamHandler1$norm & bamHandler2$norm) { + brin_F[[i]][[4]]=c(bamHandler1$F[[i]][[4]],bamHandler2$F[[i]][[4]]) + junctions_brin_F[[i]][[4]]=c(bamHandler1$junctions_F[[i]][[4]],bamHandler2$junctions_F[[i]][[4]]) + brin_R[[i]][[4]]=c(bamHandler1$R[[i]][[4]],bamHandler2$R[[i]][[4]]) + junctions_brin_R[[i]][[4]]=c(bamHandler1$junctions_R[[i]][[4]],bamHandler2$junctions_R[[i]][[4]]) } } - - names(brin_F) = names(bamHandler1$F) - names(brin_R) = names(bamHandler2$R) - names(junctions_brin_F) = names(bamHandler1$junctions_F) - names(junctions_brin_R) = names(bamHandler2$junctions_R) - bamHandler[[1]] = brin_F - bamHandler[[2]] = brin_R - bamHandler[[3]] = bamHandler1[[3]] - bamHandler[[4]] = bamHandler1[[4]] + bamHandler2[[4]] - bamHandler[[5]] = bamHandler1[[5]] & bamHandler2[[5]] - bamHandler[[6]] = junctions_brin_F - bamHandler[[7]] = junctions_brin_R - bamHandler[[8]] = bamHandler1$norm & bamHandler2$norm - names(bamHandler) = c("F","R","chrLength","flagstat","stranded","junctions_F","junctions_R","norm") + names(brin_F)=names(bamHandler1$F) + names(brin_R)=names(bamHandler2$R) + names(junctions_brin_F)=names(bamHandler1$junctions_F) + names(junctions_brin_R)=names(bamHandler2$junctions_R) + bamHandler[[1]]=brin_F + bamHandler[[2]]=brin_R + bamHandler[[3]]=bamHandler1[[3]] + bamHandler[[4]]=bamHandler1[[4]] + bamHandler2[[4]] + bamHandler[[5]]=bamHandler1[[5]] & bamHandler2[[5]] + bamHandler[[6]]=junctions_brin_F + bamHandler[[7]]=junctions_brin_R + bamHandler[[8]]=bamHandler1$norm & bamHandler2$norm + names(bamHandler)=c("F","R","chrLength","flagstat","stranded","junctions_F","junctions_R","norm") return(bamHandler) } } -##Returns the sum of a list of bamHandler objects -sumBam = function(...) { - args = list(...) - retour = args[[1]] - if(length(args)>1) { - for(i in 2:length(args)) { - retour = addBam(retour,args[[i]]) - } - } - return(retour) -} - -##Returns the mean of bamHandler objects -meanBam = function(...) { - args = list(...) - bamsReduced=normalizeBams(args,rep(1,length(args))/length(args),relative=FALSE) - retour = bamsReduced[[1]] - if(length(bamsReduced)>1) { - for(i in 2:length(bamsReduced)) { - retour = addBam(retour,bamsReduced[[i]]) - } - } - return(retour) -} - -##Extracts the signal from bamHandler objects -extractSignal = function(bamHandlerList,chrName,from=1, to=NULL,normalized_=FALSE) { +## Extracts the signal from bamHandler objects +extractSignal=function(bamHandlerList,chrName,from=1, to=NULL,normalized_=FALSE) { forward=list() reverse=list() chr=which(names(bamHandlerList[[1]]$F)==chrName) - if(is.null(to)){ + if(is.null(to)) { to=bamHandlerList[[1]]$chrLength[chr] } - if(normalized_) { - end=4 - } - else { - end=3 - } - for(i in 1:length(bamHandlerList)){ + for(i in 1:length(bamHandlerList)) { + if(normalized_) { + if(bamHandlerList[[i]]$norm) { + end=4 + } + else { + end=3 + } + } + else { + end=3 + } forward_=numeric(to-from+1) - which_read = which((bamHandlerList[[i]]$F[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$F[[chrName]][[1]]<=to)) + which_read=which((bamHandlerList[[i]]$F[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$F[[chrName]][[1]]<=to)) n_reads=length(which_read) if(n_reads>0) { for(k in which_read) { - debut_read = max(1,bamHandlerList[[i]]$F[[chrName]][[1]][k]-from+1) - fin_read = min(bamHandlerList[[i]]$F[[chrName]][[2]][k]-from+1,to-from+1) + debut_read=max(1,bamHandlerList[[i]]$F[[chrName]][[1]][k]-from+1) + fin_read=min(bamHandlerList[[i]]$F[[chrName]][[2]][k]-from+1,to-from+1) forward_[debut_read:fin_read]=forward_[debut_read:fin_read]+bamHandlerList[[i]]$F[[chrName]][[end]][k] } } - which_junctions = which((bamHandlerList[[i]]$junctions_F[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$junctions_F[[chrName]][[1]]<=to)) + which_junctions=which((bamHandlerList[[i]]$junctions_F[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$junctions_F[[chrName]][[1]]<=to)) n_junctions=length(which_junctions) if(n_junctions>0) { for(k in which_junctions) { - debut_junction = max(1,bamHandlerList[[i]]$junctions_F[[chrName]][[1]][k]-from+1) - fin_junction = min(bamHandlerList[[i]]$junctions_F[[chrName]][[2]][k]-from+1,to-from+1) + debut_junction=max(1,bamHandlerList[[i]]$junctions_F[[chrName]][[1]][k]-from+1) + fin_junction=min(bamHandlerList[[i]]$junctions_F[[chrName]][[2]][k]-from+1,to-from+1) forward_[debut_junction:fin_junction]=forward_[debut_junction:fin_junction]-bamHandlerList[[i]]$junctions_F[[chrName]][[end]][k] } } - reverse_=numeric(to-from+1) - which_read = which((bamHandlerList[[i]]$R[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$R[[chrName]][[1]]<=to)) - n_reads= length(which_read) + which_read=which((bamHandlerList[[i]]$R[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$R[[chrName]][[1]]<=to)) + n_reads=length(which_read) if(n_reads>0) { for(k in which_read) { - debut_read = max(1,bamHandlerList[[i]]$R[[chrName]][[1]][k]-from+1) - fin_read = min(bamHandlerList[[i]]$R[[chrName]][[2]][k]-from+1,to-from+1) + debut_read=max(1,bamHandlerList[[i]]$R[[chrName]][[1]][k]-from+1) + fin_read=min(bamHandlerList[[i]]$R[[chrName]][[2]][k]-from+1,to-from+1) reverse_[debut_read:fin_read]=reverse_[debut_read:fin_read]+bamHandlerList[[i]]$R[[chrName]][[end]][k] } } - which_junctions = which((bamHandlerList[[i]]$junctions_R[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$junctions_R[[chrName]][[1]]<=to)) + which_junctions=which((bamHandlerList[[i]]$junctions_R[[chrName]][[2]]>=from)&(bamHandlerList[[i]]$junctions_R[[chrName]][[1]]<=to)) n_junctions=length(which_junctions) if(n_junctions>0) { for(k in which_junctions) { - debut_junction = max(1,bamHandlerList[[i]]$junctions_R[[chrName]][[1]][k]-from+1) - fin_junction = min(bamHandlerList[[i]]$junctions_R[[chrName]][[2]][k]-from+1,to-from+1) + debut_junction=max(1,bamHandlerList[[i]]$junctions_R[[chrName]][[1]][k]-from+1) + fin_junction=min(bamHandlerList[[i]]$junctions_R[[chrName]][[2]][k]-from+1,to-from+1) reverse_[debut_junction:fin_junction]=reverse_[debut_junction:fin_junction]-bamHandlerList[[i]]$junctions_R[[chrName]][[end]][k] } } @@ -711,27 +585,21 @@ reverse[[i]]=numeric(to-from+1) } } - chr_ = list() - chr_$F = forward - chr_$R = reverse + chr_=list() + chr_$F=forward + chr_$R=reverse return(chr_) } -##Intern function for -totalReads = function(bamHandler) { - return(bamHandler[[4]][1]+bamHandler[[4]][11]) -} - -##Intern function for readGff function +## Intern function for readGff function my.read.lines2=function(fname) { - - s = file.info( fname )$size - buf = readChar( fname, s, useBytes=T) + s=file.info( fname )$size + buf=readChar( fname, s, useBytes=T) strsplit( buf,"\n",fixed=T,useBytes=T)[[1]] } -##Extracts the annotation infos from Gff file -readGff = function(file_in, from=1, to=Inf, chr=NULL, infoName=c("ID","Name","Parent","gene","Alias","orf_classification","Ontology_term","Note","GO")) { +## Extracts the annotation infos from Gff file +readGff=function(file_in, from=1, to=Inf, chr=NULL, infoName=c("ID","Name","Parent","gene","Alias","orf_classification","Ontology_term","Note","GO")) { tmp=try(my.read.lines2(file_in)) if(!is.null(chr)) { tmp1=grep(chr, tmp, value=TRUE,useBytes=T) @@ -740,28 +608,28 @@ tmp1=tmp } N=length(tmp1) - Chr = array() - Start= array() - Stop= array() + Chr=array() + Start=array() + Stop=array() Strand=array() Type=array() info=list() - for(i in 1:length(infoName)) info[[i]] = array() + for(i in 1:length(infoName)) info[[i]]=array() names(info)=infoName - j<-1 + j=1 for (i in 1:N) { if(substr(tmp1[i],1,1)!="#") { line_split=unlist(strsplit(tmp1[i],"\t",fixed=T,useBytes=T)) if((as.integer(line_split[4])<=to) & (as.integer(line_split[5])>=from)) { - Chr[j] = line_split[1] - Start[j] = as.integer(line_split[4]) - Stop[j] = as.integer(line_split[5]) + Chr[j]=line_split[1] + Start[j]=as.integer(line_split[4]) + Stop[j]=as.integer(line_split[5]) Strand[j]=line_split[7] - Type[j] = line_split[3] + Type[j]=line_split[3] ninth=unlist(strsplit(line_split[9],";",fixed=T,useBytes=T)) element_ninth_empty=rep(TRUE,length(infoName)) for(element_ninth in ninth) { - element_ninth_split = unlist(strsplit(element_ninth,"=",fixed=T,useBytes=T)) + element_ninth_split=unlist(strsplit(element_ninth,"=",fixed=T,useBytes=T)) if(length(element_ninth_split)==2) { if(element_ninth_split[1] %in% infoName) { info[[element_ninth_split[1]]][j]=element_ninth_split[2] @@ -772,16 +640,16 @@ for(infoName_ in infoName[element_ninth_empty]) { info[[infoName_]][j]="." } - j<-j+1 + j=j+1 } } } - retour = data.frame(Chr,Type,Start,Stop,Strand,info,stringsAsFactors = FALSE) + retour=data.frame(Chr,Type,Start,Stop,Strand,info,stringsAsFactors=FALSE) return(retour) } -##Returns the classic visualisation -plotRNAseq = function(forward,reverse,debut_vue=1,fin_vue=length(forward),chr=NULL,annot=NULL,style=NULL,top=NULL,bottom=NULL,x="",y="",titre="",repeated=FALSE,name_flags="",decal=0,ataxises=NULL,classic_plus_color="navyblue",classic_minus_color="mediumvioletred",stranded=TRUE) { +## Returns the classic visualisation +plotRNAseq=function(forward,reverse,debut_vue=1,fin_vue=length(forward),chr=NULL,annot=NULL,style=NULL,top=NULL,bottom=NULL,x="",y="",titre="",repeated=FALSE,name_flags="",decal=0,ataxises=NULL,classic_plus_color="navyblue",classic_minus_color="mediumvioletred",stranded=TRUE) { if(repeated) { forward_=numeric(fin_vue) forward_[debut_vue:fin_vue]=forward @@ -805,6 +673,8 @@ plot(c(debut_vue,fin_vue)+decal,c(-bottom,top),ylim=c(-bottom,top),xlab=x,ylab=y,main=titre,col="white",xaxs="i",xaxt="n",cex.main=2,yaxt="n",cex.lab=1.8) ataxisesLabels=as.character(ataxises) ataxisesLabels[((1:length(ataxises))%%2)==0]="" + ataxisesLabels[1]="" + ataxisesLabels[length(ataxises)]="" lim=c(-bottom,top) ataxises_y=pretty(lim,n=4) ataxisesLabels_y=as.character(abs(ataxises_y)) @@ -814,7 +684,7 @@ axis(2, at=ataxises_y,labels=ataxisesLabels_y,cex.axis=2,line=-0.4,lwd=0) } polygon(c(debut_vue,debut_vue:fin_vue,fin_vue)+decal,c(0,forward_[debut_vue:fin_vue],0),col=classic_plus_color,border=NA) - if(stranded){ + if(stranded) { text(fin_vue+(fin_vue-debut_vue)*0.01,top/2,"+",xpd=NA,cex=3) text(fin_vue+(fin_vue-debut_vue)*0.01,-bottom/2,"-",xpd=NA,cex=3) text(fin_vue+(fin_vue-debut_vue)*0.025,(top-bottom)/2,"Strand",xpd=NA,cex=2,srt=-90) @@ -822,41 +692,24 @@ abline(h=0,lwd=2) abline(h=0,col="white") } - #if(!is.null(chr)&!is.null(annot)&!is.null(style)) { - # N=dim(annot)[1] - #if(N>0) { - # for(i in 1:N) { - #if(annot$Strand[i]=="+") { - # arrows(annot$Start[i],top,annot$Stop[i],top,col=style$col[style$Type==annot$Type[i]],length=0.10) - #} - #if(annot$Strand[i]=="-") { - # arrows(annot$Stop[i],-bottom,annot$Start[i],-bottom,col=style$col[style$Type==annot$Type[i]],length=0.10) - #} - #if(annot$Strand[i]==".") { - # segments(annot$Start[i],top,annot$Stop[i],top,col=style$col[style$Type==annot$Type[i]]) - #segments(annot$Stop[i],-bottom,annot$Start[i],-bottom,col=style$col[style$Type==annot$Type[i]]) - #} - #} - #} - #} if(name_flags!="") { - flags=try(get(name_flags),TRUE) - if(class(flags)!="try-error") { - f_=flags[(flags$Chr==chr)&(flags$Stop>=debut_vue)&(flags$Start<=fin_vue),] - N=dim(f_)[1] - points(f_$Start,rep(0,N),col=2,pch=19,cex=2) - } - } - if(decal<=0) { + flags=try(get(name_flags),TRUE) + if(class(flags)!="try-error") { + f_=flags[(flags$Chr==chr)&(flags$Stop>=debut_vue)&(flags$Start<=fin_vue),] + N=dim(f_)[1] + points(f_$Start,rep(0,N),col=2,pch=19,cex=2) + } + } + if(decal<=0) { lines(c(0,0),c(top,-bottom),col=1) - } + } } -##Intern function for heatmap visualisation -paletteFromColors = function(colMin="blue",colMax="red",n=300,method=c("hsv","rgb")) { - colMinRGB = col2rgb(colMin)[,1]/255 - colMaxRGB = col2rgb(colMax)[,1]/255 - seqList = list() +## Intern function for heatmap visualisation +paletteFromColors=function(colMin="blue",colMax="red",n=300,method=c("hsv","rgb")) { + colMinRGB=col2rgb(colMin)[,1]/255 + colMaxRGB=col2rgb(colMax)[,1]/255 + seqList=list() if(method[1]=="rgb") { for(i in 1:3) { seqList[[i]]=seq(colMinRGB[i],colMaxRGB[i],length.out=n) @@ -864,8 +717,8 @@ return(rgb(seqList[[1]],seqList[[2]],seqList[[3]])) } else { - colMinHSV = rgb2hsv(colMinRGB) - colMaxHSV = rgb2hsv(colMaxRGB) + colMinHSV=rgb2hsv(colMinRGB) + colMaxHSV=rgb2hsv(colMaxRGB) for(i in 1:3) { seqList[[i]]=seq(colMinHSV[i],colMaxHSV[i],length.out=n) } @@ -873,14 +726,14 @@ } } -##Retuns the heatmap visualisation -myHeatMap = function(data,debut_vue,fin_vue,ataxises=NULL,lim=NULL,heatmap_max_color="#000055",heatmap_min_color="#FFFFAA",heatmap_palette_method="hsv",textOnLeftSide="") { +## Retuns the heatmap visualisation +myHeatMap=function(data,debut_vue,fin_vue,ataxises=NULL,lim=NULL,heatmap_max_color="#000055",heatmap_min_color="#FFFFAA",heatmap_palette_method="hsv",textOnLeftSide="") { palette=paletteFromColors(heatmap_min_color,heatmap_max_color,method=heatmap_palette_method) if(is.null(lim)) { - image(debut_vue:fin_vue,1:dim(data)[1],t(data), col = palette,xlab="",ylab="",xaxt="n",yaxt="n") + image(debut_vue:fin_vue,1:dim(data)[1],t(data), col=palette,xlab="",ylab="",xaxt="n",yaxt="n") } else { - image(debut_vue:fin_vue,1:dim(data)[1],t(data), col = palette,xlab="",ylab="",xaxt="n",yaxt="n",zlim=lim) + image(debut_vue:fin_vue,1:dim(data)[1],t(data), col=palette,xlab="",ylab="",xaxt="n",yaxt="n",zlim=lim) } box() if(is.null(ataxises)) { @@ -898,49 +751,64 @@ text(fin_vue+(fin_vue-debut_vue)*0.015,(dim(data)[1]+1)/2,textOnLeftSide,xpd=NA,cex=1,srt=-90) } -##Returns the title of visualisation +## Returns the title of visualisation plotTitle=function(debut_vue=1,fin_vue=length(listForward[[1]]),chr=NULL,style=NULL) { - plot(c(0,1),c(0,1),col="white",ylab="",xlab="",fg="white",col.axis="white",xaxs="i",yaxs="i") - text(0,0.5,paste(chr,":",debut_vue,"-",fin_vue,sep=""),cex=2.2,adj=0) + plot(c(0,1),c(0,1),cex=0,ylab="",xlab="",fg="white",axes=FALSE,xaxs="i",yaxs="i") + text(0,0.5,paste(chr,":",debut_vue,"-",fin_vue,sep=""),cex=2.2,adj=0) +} + +openGraphicalDevice=function(file,widthPixels,heightPixels,fileType=c("png","jpeg","tiff","bmp","pdf"),resolutionDPI=72) { + widthInches=widthPixels/72 + heightInches=heightPixels/72 + doit=function(x) { + switch(x, + png(file,widthInches,heightInches,units="in",res=resolutionDPI), + jpeg=jpeg(file,widthInches,heightInches,units="in",res=resolutionDPI), + tiff=tiff(file,widthInches,heightInches,units="in",res=resolutionDPI), + bmp=bmp(file,widthInches,heightInches,units="in",res=resolutionDPI), + pdf=pdf(file,widthInches,heightInches)) + } + doit(fileType[1]) } -##The main function of visualisation -plotVisu = function(file,typeVisu="classic",listForward,listReverse,which=1:length(listForward),stranded=TRUE, +## The main function of visualisation +plotVisu=function(file,typeVisu="classic",listForward,listReverse,which=1:length(listForward),stranded=TRUE, debut_vue=1,fin_vue=length(listForward[[1]]),chr=NULL,annot=NULL,style=NULL,tops=NULL,bottoms=NULL,marks=NULL,strandMarks=NULL, titres="",repeated=FALSE,name_flags="",decal=0,log=TRUE,classic_plus_color="navyblue",classic_minus_color="deeppink3", -heatmap_max_color="#000055",heatmap_min_color="#FFFFAA",heatmap_palette_method="hsv", +heatmap_max_color="#000055",heatmap_min_color="#FFFFAA",heatmap_palette_method="hsv",heatmap_lane_height=round(10+40/(1+10^((length(which)-12)/9))), lines_samples_colors=c(1,3,4,2)[((0:length(listForward))%%4)+1],lines_samples_type_line=((0:length(listForward))%/%4)+1, -smoothLength=trunc((fin_vue-debut_vue)/1200),annotation_color_by_strand=FALSE,annotation_placed_by_strand=FALSE) { +smoothLength=trunc((fin_vue-debut_vue)/1200),annotation_color_by_strand=FALSE,annotation_placed_by_strand=FALSE,display_name=NULL,initialize_label_sizes=NULL, +fileType="png",resolutionDPI=72) { if(fin_vue-debut_vue+1>1000000) { - png(file,1200,400) + openGraphicalDevice(file,1200,400,fileType=fileType,resolutionDPI=resolutionDPI) plot(0:1,0:1,fg="white",col="white",axes=FALSE,frame=FALSE,xlab="",ylab="") text(0.5,0.5,"Window too long !",cex=5) } else { - n_element_vue = length(which) - i_data = which[1] + n_element_vue=length(which) + i_data=which[1] forward_matrice=NULL reverse_matrice=NULL - all_stranded = data[[i_data]]$stranded + all_stranded=data[[i_data]]$stranded if(!repeated) { for(i in 1:n_element_vue) { - i_data = which[i] - forward_matrice = rbind(forward_matrice,listForward[[i_data]][debut_vue:fin_vue]) - reverse_matrice = rbind(reverse_matrice,listReverse[[i_data]][debut_vue:fin_vue]) - all_stranded = all_stranded & data[[i_data]]$stranded + i_data=which[i] + forward_matrice=rbind(forward_matrice,listForward[[i_data]][debut_vue:fin_vue]) + reverse_matrice=rbind(reverse_matrice,listReverse[[i_data]][debut_vue:fin_vue]) + all_stranded=all_stranded & data[[i_data]]$stranded } } - else{ + else { for(i in 1:n_element_vue) { - i_data = which[i] - forward_matrice = rbind(forward_matrice,listForward[[i_data]]) - reverse_matrice = rbind(reverse_matrice,listReverse[[i_data]]) - all_stranded = all_stranded & data[[i_data]]$stranded + i_data=which[i] + forward_matrice=rbind(forward_matrice,listForward[[i_data]]) + reverse_matrice=rbind(reverse_matrice,listReverse[[i_data]]) + all_stranded=all_stranded & data[[i_data]]$stranded } } rownames(forward_matrice)=titres[which] rownames(reverse_matrice)=titres[which] - for( i in 1:n_element_vue){ + for( i in 1:n_element_vue) { if(smoothLength>1) { lo=smooth(forward_matrice[i,],L=smoothLength) forward_matrice[i,]=lo @@ -954,28 +822,29 @@ } else { annot_selec=annot[(annot$Chr==chr)&(annot$Stop>=debut_vue)&(annot$Start<=fin_vue),] - heights_ann=sizePlotAnnotation(annot=annot_selec,chr=chr,debut=debut_vue,fin=fin_vue,style=style,annotation_placed_by_strand=annotation_placed_by_strand)*50 + heights_ann=sizePlotAnnotation(annot=annot_selec,chr=chr,debut=debut_vue,fin=fin_vue,annotation_placed_by_strand=annotation_placed_by_strand,display_name=display_name,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes)*50 } ataxises=pretty((debut_vue:fin_vue)+decal,n=14) - if(log){ + if(log) { label_scal="log2 tag densities" - }else{ + } + else { label_scal="tag densities" } if(debut_vue==1&decal==0) { ataxises[1]=1 - }##Classic visualisation + }## Classic visualisation if(typeVisu=="classic") { - height_panels = c(40,rep(200,n_element_vue),heights_ann) - png(file,1200,sum(height_panels)) - prev=par(no.readonly = TRUE) + height_panels=c(40,rep(200,n_element_vue),heights_ann) + openGraphicalDevice(file,1200,sum(height_panels),fileType=fileType,resolutionDPI=resolutionDPI) + prev=par(no.readonly=TRUE) n_panels=length(height_panels) layout(matrix(1:n_panels,n_panels,1),heights=height_panels) par(mar=c(0, 5, 0, 4)+0.1) plotTitle(debut_vue=debut_vue,fin_vue=fin_vue,chr=chr) par(mar=c(2.5, 5, 2.5, 4)+0.1) for(element in 1:n_element_vue) { - i_data = which[element] + i_data=which[element] plotRNAseq(listForward[[i_data]],listReverse[[i_data]],debut_vue,fin_vue,chr,annot_selec,style,top=tops[min(element,length(tops))],bottom=bottoms[min(element,length(bottoms))],y=label_scal,titre=titres[min(i_data,length(titres))],name_flags=name_flags,repeated=repeated,decal=decal,ataxises=ataxises,classic_plus_color=classic_plus_color,classic_minus_color=classic_minus_color,stranded=stranded) if(!is.null(marks)) { if(is.null(tops)) { @@ -1007,27 +876,27 @@ } } } - } - par(mar=c(0,5,1,4)+0.1) - if(!is.null(annot_selec)) { - plot_annotation(annot_selec,chr,debut=debut_vue,fin=fin_vue,style=style,textSize=1.5,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand) - } + } + if(!is.null(annot_selec)) { + par(mar=c(0,5,1,4)+0.1) + plot_annotation(annot_selec,chr,debut=debut_vue,fin=fin_vue,style=style,textSize=1.5,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand,display_name=display_name,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) + } } - else {##Heatmap visualisation + else {## Heatmap visualisation if(typeVisu=="heatmap") { if(stranded) { - height_panels=c(40,50*n_element_vue+28,heights_ann,50*n_element_vue+10,100) + height_panels=c(40,heatmap_lane_height*n_element_vue+28,heights_ann,heatmap_lane_height*n_element_vue+10,100) } else { - height_panels=c(40,50*n_element_vue+28,heights_ann,100) + height_panels=c(40,heatmap_lane_height*n_element_vue+28,heights_ann,100) } - png(file,1200,sum(height_panels)) - prev=par(no.readonly = TRUE) + openGraphicalDevice(file,1200,sum(height_panels),fileType=fileType,resolutionDPI=resolutionDPI) + prev=par(no.readonly=TRUE) n_panels=length(height_panels) layout(matrix(1:n_panels,n_panels,1),heights=height_panels) par(mar=c(0, 8, 0, 2)+0.1) plotTitle(debut_vue=debut_vue,fin_vue=fin_vue,chr=chr) - limIntensity = c(0,max(forward_matrice,reverse_matrice,na.rm=TRUE)) + limIntensity=c(0,max(forward_matrice,reverse_matrice,na.rm=TRUE)) if(limIntensity[2]==0) { limIntensity[2]=0.01 } @@ -1043,8 +912,8 @@ tmp="Both strands" } myHeatMap(forward_matrice[(dim(forward_matrice)[1]):1,],debut_vue+decal,fin_vue+decal,ataxises,lim=limIntensity,heatmap_max_color=heatmap_max_color,heatmap_min_color=heatmap_min_color,heatmap_palette_method=heatmap_palette_method,textOnLeftSide=tmp) - axis(3,at=ataxises,labels=FALSE,cex.axis=1.4) - axis(3,at=ataxises,labels=ataxisesLabels,cex.axis=1.4,line=-0.4,lwd=0) + axis(3,at=ataxises,labels=FALSE,cex.axis=1.2) + axis(3,at=ataxises,labels=ataxisesLabels,cex.axis=1.2,line=-0.4,lwd=0) if(!is.null(marks)) { if(is.null(strandMarks)|!all_stranded) { segments(marks,-2,y1=dim(forward_matrice)[1]+2,col=2,lwd=2,xpd=TRUE) @@ -1065,7 +934,7 @@ } if(!is.null(annot_selec)) { par(mar=c(0,8,0,2)+0.1) - plot_annotation(annot_selec,chr,debut=debut_vue,fin=fin_vue,style=style,textSize=0.9,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand) + plot_annotation(annot_selec,chr,debut=debut_vue,fin=fin_vue,style=style,textSize=0.9,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand,display_name=display_name,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) } if(stranded) { par(mar=c(1, 8, 0, 2)+0.1) @@ -1080,7 +949,7 @@ } } } - axis(1,at=ataxises,labels=ataxisesLabels,cex.axis=1.4) + axis(1,at=ataxises,labels=ataxisesLabels,cex.axis=1.2) } palette=matrix(seq(limIntensity[1],limIntensity[2],length.out=2000),1,2000) rownames(palette)="0" @@ -1088,17 +957,16 @@ myHeatMap(palette,1,2000,ataxises=NA,heatmap_max_color=heatmap_max_color,heatmap_min_color=heatmap_min_color,heatmap_palette_method=heatmap_palette_method) labelAxisHeatmapLegend=pretty(c(limIntensity[1],limIntensity[2]),n=7) atAxisHeatmapLegend=1+((labelAxisHeatmapLegend-limIntensity[1])/(limIntensity[2]-limIntensity[1]))*1999 - axis(1,at=atAxisHeatmapLegend,labels=FALSE,cex.axis=1.4) - axis(1,at=atAxisHeatmapLegend,labels=labelAxisHeatmapLegend,cex.axis=1.4,line=-0.4,lwd=0) + axis(1,at=atAxisHeatmapLegend,labels=FALSE,cex.axis=1.2) + axis(1,at=atAxisHeatmapLegend,labels=labelAxisHeatmapLegend,cex.axis=1.2,line=-0.4,lwd=0) text(1000,2,label_scal,xpd=NA,font=2,cex=1.4) - - }##Lines visualisation - else if(typeVisu=="lines"){ - legendSize = (floor(n_element_vue/2)+n_element_vue%%2)*40 + }## Lines visualisation + else if(typeVisu=="lines") { + legendSize=(floor(n_element_vue/2)+n_element_vue%%2)*40 height_panels=c(40,legendSize,400,heights_ann) n_panels=length(height_panels) - png(file,1200,sum(height_panels)) - prev=par(no.readonly = TRUE) + openGraphicalDevice(file,1200,sum(height_panels),fileType=fileType,resolutionDPI=resolutionDPI) + prev=par(no.readonly=TRUE) par(mar=c(0, 5, 0,4)+0.1,cex=1.1) layout(matrix(c(1:n_panels),n_panels,4),heights=height_panels) par(mar=c(0, 5, 0,4)+0.1) @@ -1106,9 +974,9 @@ lines_legend(n_element_vue=n_element_vue,which,titres,lines_samples_colors=lines_samples_colors,lines_samples_type_line=lines_samples_type_line) par(mar=c(3, 5, 0,4)+0.1) plotlines(forward_matrice,reverse_matrice,which=which,debut_vue, fin_vue , chr, annot, style, tops, bottoms,marks,strandMarks,titres, repeated,name_flags,decal,ataxises=ataxises,n_element_vue=n_element_vue,y=label_scal,lines_samples_colors=lines_samples_colors,lines_samples_type_line=lines_samples_type_line,stranded=stranded) - par(mar=c(0,5,0,4)+0.1) if(!is.null(annot_selec)) { - plot_annotation(annot_selec,chr,debut=debut_vue,fin=fin_vue,style=style,textSize=1.5,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand) + par(mar=c(0,5,0,4)+0.1) + plot_annotation(annot_selec,chr,debut=debut_vue,fin=fin_vue,style=style,textSize=1.5,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand,display_name=display_name,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) } } @@ -1117,36 +985,36 @@ invisible(dev.off()) } -## -smooth = function(X,L=10) { +## +smooth=function(X,L=10) { x_smooth=filter(X,rep(1,L)/L) x_smooth[is.na(x_smooth)]=0 return(x_smooth) } -##Returns lines visualisation -plotlines = function(forward_matrice,reverse_matrice,which=1:length(forward), debut_vue = 1,fin_vue =length(forward), chr=NULL, annot=NULL, style=NULL, tops=NULL,bottoms=NULL,marks=NULL,strandMarks=NULL, titres="", repeated=FALSE,name_flags="",decal=0,ataxises=NULL,n_element_vue=length(which),y="",lines_samples_colors=c(1,3,4,2)[((0:length(forward))%%4)+1],lines_samples_type_line=((0:length(forward))%%4)+1,stranded=TRUE){ - limIntensity = c(-max(bottoms),max(tops)) -plot(c(debut_vue,fin_vue)+decal,limIntensity,col="white",ylab=y,main="",xaxs="i",xaxt="n",yaxt="n",cex.lab=1.2,xlab="",cex.lab=1.8) - lty=lines_samples_type_line - col=lines_samples_colors - ataxises_y= pretty(limIntensity,n=8) +## Returns lines visualisation +plotlines=function(forward_matrice,reverse_matrice,which=1:length(forward), debut_vue=1,fin_vue=length(forward), chr=NULL, annot=NULL, style=NULL, tops=NULL,bottoms=NULL,marks=NULL,strandMarks=NULL, titres="", repeated=FALSE,name_flags="",decal=0,ataxises=NULL,n_element_vue=length(which),y="",lines_samples_colors=c(1,3,4,2)[((0:length(forward))%%4)+1],lines_samples_type_line=((0:length(forward))%%4)+1,stranded=TRUE) { + limIntensity=c(-max(bottoms),max(tops)) + plot(c(debut_vue,fin_vue)+decal,limIntensity,col="white",ylab=y,main="",xaxs="i",xaxt="n",yaxt="n",cex.lab=1.2,xlab="",cex.lab=1.8) + lty=lines_samples_type_line + col=lines_samples_colors + ataxises_y=pretty(limIntensity,n=8) for(i in ataxises) abline(v=i,lty=2,col="#808080") for(i in ataxises_y) abline(h=i,lty=2,col="#808080") - for( i in 1:n_element_vue){ + for( i in 1:n_element_vue) { lo=forward_matrice[i,] - lines((debut_vue:fin_vue)+decal,lo,type="l",lty =lty[i],col=col[i],lwd=2,xaxt="n") + lines((debut_vue:fin_vue)+decal,lo,type="l",lty=lty[i],col=col[i],lwd=2,xaxt="n") if(stranded) { los=-reverse_matrice[i,] - lines((debut_vue:fin_vue)+decal,los,type="l",lty= lty[i],col=col[i],lwd=2,xaxt="n") + lines((debut_vue:fin_vue)+decal,los,type="l",lty=lty[i],col=col[i],lwd=2,xaxt="n") } } ataxisesLabels=as.character(ataxises) ataxisesLabels[((1:length(ataxises))%%2)==0]="" ataxisesLabels_y=as.character(abs(ataxises_y)) ataxisesLabels_y[((1:length(ataxises_y))%%2)==0]="" - axis(1,at=ataxises,labels=FALSE,cex.axis=2.6) - axis(1,at=ataxises,labels=ataxisesLabels,cex.axis=2.6,line=0.6,lwd=0) + axis(1,at=ataxises,labels=FALSE,cex.axis=2) + axis(1,at=ataxises,labels=ataxisesLabels,cex.axis=2,line=0.6,lwd=0) axis(2, at=ataxises_y,labels=FALSE,cex.axis=2) axis(2, at=ataxises_y,labels=ataxisesLabels_y,cex.axis=2,line=-0.4,lwd=0) if(stranded) { @@ -1188,11 +1056,11 @@ } } -##Returns lines legend -lines_legend=function(n_element_vue,which,titres,lines_samples_colors=c(1,3,4,2)[((0:n_element_vue)%%4)+1],lines_samples_type_line=((0:n_element_vue)%%4)+1){ +## Returns lines legend +lines_legend=function(n_element_vue,which,titres,lines_samples_colors=c(1,3,4,2)[((0:n_element_vue)%%4)+1],lines_samples_type_line=((0:n_element_vue)%%4)+1) { lty=lines_samples_type_line col=lines_samples_colors - n_y = floor(n_element_vue/2)+n_element_vue%%2 + n_y=floor(n_element_vue/2)+n_element_vue%%2 plot(c(0,4),c(0,-(n_y+1)),col="white", ylab="",xlab="",main="",fg="white",col.axis="white",yaxs="i") i_style=0 for(i in 1:n_y) { @@ -1209,43 +1077,62 @@ } } -##Returns the shape of plain arrow for the annotation -plain_arrow = function(left,right,y,thickness=1,pickSize=(right-left)*0.1,pickSide=c("right","left","both","none"),col="blue") { +## Returns the shape of plain arrow for the annotation +plain_arrow=function(left,right,y,thickness=1,pickSize=(right-left)*0.1,pickSide=c("right","left","both","none"),col="blue") { middle=(left+right)/2 if(pickSide[1]=="right") { - pick_point = max(right - pickSize,middle) + pick_point=max(right - pickSize,middle) polygon(c(left,pick_point,right,pick_point,left),c(y-thickness/2,y-thickness/2,y,y+thickness/2,y+thickness/2),col=col) } if(pickSide[1]=="left") { - pick_point = min(left + pickSize,middle) + pick_point=min(left + pickSize,middle) polygon(c(right,pick_point,left,pick_point,right),c(y-thickness/2,y-thickness/2,y,y+thickness/2,y+thickness/2),col=col) } if(pickSide[1]=="none") { polygon(c(right,left,left,right),c(y-thickness/2,y-thickness/2,y+thickness/2,y+thickness/2),col=col) } if(pickSide[1]=="both") { - pick_point_1 = min(left + pickSize,middle) - pick_point_2 = max(right - pickSize,middle) + pick_point_1=min(left + pickSize,middle) + pick_point_2=max(right - pickSize,middle) polygon(c(left,pick_point_1,pick_point_2,right,pick_point_2,pick_point_1),c(y,y-thickness/2,y-thickness/2,y,y+thickness/2,y+thickness/2),col=col) } } -##Returns the size of the panel of the annotation -sizePlotAnnotation=function(annot,chr,debut,fin,style=NULL,annotation_placed_by_strand=FALSE) { - left=c() - right=c() - annot_chr=annot[annot$Chr==chr,] - unique_ID=unique(annot_chr$ID) - for(j in 1:length(unique_ID)){ - left[j]=min(annot_chr$Start[annot_chr$ID==unique_ID[j]]) - right[j]=max(annot_chr$Stop[annot_chr$ID==unique_ID[j]]) - } - y_plot=parking(left,right) - return(max(y_plot)-min(y_plot)+1) +## Returns the size of the panel of the annotation +sizePlotAnnotation=function(annot,chr,debut,fin,annotation_placed_by_strand=FALSE,display_name=NULL,typeVisu="classic",initialize_label_sizes=NULL) { + left=c() + right=c() + labels=c() + annot_chr=annot[annot$Chr==chr,] + N=dim(annot_chr)[1] + all_names=annot_chr[,c(display_name,"ID")] + unique_ID=unique(annot_chr$ID) + for(j in 1:length(unique_ID)) { + left[j]=min(annot_chr$Start[annot_chr$ID==unique_ID[j]]) + right[j]=max(annot_chr$Stop[annot_chr$ID==unique_ID[j]]) + all_names=unlist(annot_chr[annot_chr$ID==unique_ID[j],c(display_name,"ID")]) + all_names=all_names[all_names!="."] + labels[j]=all_names[1] + } + if(annotation_placed_by_strand) { + y_plot=parking(left,right,debut,fin,annot_chr$Strand=="+",FALSE,labels=labels,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) + } + else { + y_plot=parking(left,right,debut,fin,biggestOnesInside=FALSE,labels=labels,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) + } + return(max(y_plot)-min(y_plot)+1) } -##Function to organise the annotation shapes to display -parking = function(left,right,plus=rep(TRUE,length(left)),biggestOnesInside=TRUE) { +## Function to organise the annotation shapes to display +parking=function(left,right,debut,fin,plus=rep(TRUE,length(left)),labels=c(),biggestOnesInside=TRUE,typeVisu="classic",initialize_label_sizes=NULL) { + if(length(labels)!=0) { + if(!is.null(initialize_label_sizes)) { + initialize_=initialize_label_sizes[[typeVisu]] + for(i in 1:length(left)) { + right[i]=max(right[i],left[i]+initialize_$size[initialize_$labels==labels[i]])+(fin-debut)/100 + } + } + } y=rep(0,length(left)) if(sum(plus)>0) { left_plus=left[plus] @@ -1302,30 +1189,33 @@ return(y) } -##Function to -plot_annotation = function(annot,chr,debut,fin,style=NULL,textSize=par("cex"),annotation_color_by_strand=FALSE,annotation_placed_by_strand=FALSE) { +## Function to +plot_annotation=function(annot,chr,debut,fin,style=NULL,textSize=par("cex"),annotation_color_by_strand=FALSE,annotation_placed_by_strand=FALSE,display_name=NULL,typeVisu="classic",initialize_label_sizes=NULL) { left=c() right=c() + labels=c() annot_chr=annot[annot$Chr==chr,] - N = dim(annot_chr)[1] - strand=c("-","+") + N=dim(annot_chr)[1] if(N>0) { + all_names=annot_chr[,c(display_name,"ID")] unique_ID=unique(annot_chr$ID) - par_prev=par(no.readonly=TRUE) - for(j in 1:length(unique_ID)){ + for(j in 1:length(unique_ID)) { left[j]=min(annot_chr$Start[annot_chr$ID==unique_ID[j]]) right[j]=max(annot_chr$Stop[annot_chr$ID==unique_ID[j]]) + all_names=unlist(annot_chr[annot_chr$ID==unique_ID[j],c(display_name,"ID")]) + all_names=all_names[all_names!="."] + labels[j]=all_names[1] } if(annotation_placed_by_strand) { - y_plot=parking(left,right,annot_chr$Strand=="+",FALSE) + y_plot=parking(left,right,debut,fin,annot_chr$Strand=="+",FALSE,labels=labels,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) } else { - y_plot=parking(left,right,biggestOnesInside=FALSE) + y_plot=parking(left,right,debut,fin,biggestOnesInside=FALSE,labels=labels,typeVisu=typeVisu,initialize_label_sizes=initialize_label_sizes) } plot(c(debut,fin),c(min(y_plot)-0.5,max(y_plot)+0.5),col="white",ylab="",xlab="",fg="white",col.axis="white",xaxs="i",yaxs="i") - for(j in 1:length(unique_ID)){ + for(j in 1:length(unique_ID)) { annot_ID=annot_chr[annot_chr$ID==unique_ID[j],] - x_text<-Inf + x_text=Inf for(i_annot_ID in 1:dim(annot_ID)[1]) { if(annot_ID$Strand[i_annot_ID]=="-") { orientation="left" @@ -1333,30 +1223,30 @@ else { orientation="right" } - iDraw = FALSE + iDraw=FALSE if(annot_ID$Strand[i_annot_ID]==".") { tmp="+" } else { tmp=annot_ID$Strand[i_annot_ID] } - style_demande = style[style$Type==annot_ID$Type[i_annot_ID]&style$Strand==tmp,] - x_text<-min(x_text,annot_ID$Start[i_annot_ID]) + style_demande=style[style$Type==annot_ID$Type[i_annot_ID]&style$Strand==tmp,] + x_text=min(x_text,annot_ID$Start[i_annot_ID]) if(style_demande$shape=="box") { plain_arrow(annot_ID$Start[i_annot_ID],annot_ID$Stop[i_annot_ID],y_plot[j],thickness=0.5, pickSide=orientation,col=style_demande$col,pickSize=(fin-debut)*0.02) iDraw=TRUE } - if(style_demande$shape =="line") { + if(style_demande$shape=="line") { segments(annot_ID$Start[i_annot_ID],y_plot[j],annot_ID$Stop[i_annot_ID],y_plot[j],col=style_demande$col) iDraw=TRUE } - if(style_demande$shape == "rectangle") { + if(style_demande$shape=="rectangle") { plain_arrow(annot_ID$Start[i_annot_ID],annot_ID$Stop[i_annot_ID],y_plot[j],thickness=0.5, pickSide="none",col=style_demande$col) iDraw=TRUE } - if(style_demande$shape == "arrow") { - arrowHeads = pretty(debut:fin,n=50) - x<-c(annot_ID$Start[i_annot_ID],arrowHeads[arrowHeads>annot_ID$Start[i_annot_ID]&arrowHeadsannot_ID$Start[i_annot_ID]&arrowHeads [ ...] - options: + ./ving.R [options] [ ...] + Options: -o/--output [default:./ouput.png] + -F/--fileType [default: same as -o] (png,jpeg,bmp,tiff,pdf) + -R/--resolution [default: 72 ] -v/--typeVisu [default: classic ] (classic,lines,heatmap) -t/--description-data [default: ] -n/--normalization-coefficient [default: none ] @@ -1453,8 +1342,9 @@ --annotation-placed-by-strand [default: FALSE ] -L/--smoothLength [default: NA ] ",stdout()) - q("no") -}else { + q("no") +} +else { tmp=suppressPackageStartupMessages(require("Rsamtools")) if(!tmp) { stop("Package Rsamtools required !!") @@ -1463,51 +1353,51 @@ if(!tmp) { stop("Package GenomicRanges required !!") } - optArgs=getopt( - rbind( + optArgs=getopt(rbind( + c('output','o', 1, 'character',"output.png"), + c('fileType','F',1,'character',NA), + c('resolution','R',1,'numeric',72), + c('typeVisu', 'v', 1, 'character', "classic"), + c('description-data','t',1,'character',NA), + c('chromosome-name', 'c', 1, 'character', NA), + c('start', 'S', 1, 'numeric',1), + c('end', 'E', 1, 'numeric',NA), + c('annotation','a',1,'character',NA), + c('typeTranscript','r',1,'character',NA), + c('annotation-colors','C',1,'character',NA), + c('annotation-shapes','s','1','character',NA), + c('normalization-coefficient','n','1','character',NA), + c('classic-plus-color',1,1,'character',"navyblue"), + c('classic-minus-color',2,1,'character',"deeppink3"), + c('heatmap-max-color',3,1,'character',"#000055"), + c('heatmap-min-color',4,1,'character',"#FFFFAA"), + c('heatmap-palette-method',7,1,'character',"hsv"), + c('lines-samples-colors',5,1,'character',NA), + c('lines-samples-type-line',6,1,'character',NA), + c('scale-log', 'l',0,'logical',FALSE), + c('inverseStrand','i',0,'logical', FALSE), + c('unstranded','u',0,'logical', FALSE), + c('symetric-scale','y',0,'logical', FALSE), + c('annotation-color-by-strand',8,0,'logical',FALSE), + c('annotation-placed-by-strand',9,0,'logical',FALSE), + c('smoothLength','L',1,'numeric',NA) + )) +} - c('output','o', 1, 'character',"output.png"), - c('typeVisu', 'v', 1, 'character', "classic"), - c('description-data','t',1,'character',NA), - c('chromosome-name', 'c', 1, 'character', NA), - c('start', 'S', 1, 'numeric',1), - c('end', 'E', 1, 'numeric',NA), - c('annotation','a',1,'character',NA), - c('typeTranscript','r',1,'character',NA), - c('annotation-colors','C',1,'character',NA), - c('annotation-shapes','s','1','character',NA), - c('normalization-coefficient','n','1','character',NA), - c('classic-plus-color',1,1,'character',"navyblue"), - c('classic-minus-color',2,1,'character',"deeppink3"), - c('heatmap-max-color',3,1,'character',"#000055"), - c('heatmap-min-color',4,1,'character',"#FFFFAA"), - c('heatmap-palette-method',7,1,'character',"hsv"), - c('lines-samples-colors',5,1,'character',NA), - c('lines-samples-type-line',6,1,'character',NA), - c('scale-log', 'l',0,'logical',FALSE), - c('inverseStrand','i',0,'logical', FALSE), - c('unstranded','u',0,'logical', FALSE), - c('symetric-scale','y',0,'logical', FALSE), - c('annotation-color-by-strand',8,0,'logical',FALSE), - c('annotation-placed-by-strand',9,0,'logical',FALSE), - c('smoothLength','L',1,'numeric',NA) - ) - ) -} ################### ## ARGUMENTS ################################################################################# files=optArgs$ARGUMENT -##Case file doesn't exist - for( i in 1:length(files)) { - if(!file.exists(files[i])) { - stop(paste(files[i],"do not exist!","\n")) - } +## Case file doesn't exist +for( i in 1:length(files)) { + if(!file.exists(files[i])) { + stop(paste(files[i],"do not exist!","\n")) } +} imagefile=optArgs$output typeVisu=optArgs$typeVisu description_data=optArgs$`description-data` -chrName = optArgs$`chromosome-name`[1] +chrName=optArgs$`chromosome-name`[1] start=optArgs$start[1] end=optArgs$end[1] annotation=optArgs$annotation @@ -1525,269 +1415,274 @@ log=optArgs$`scale-log` inverseStrand=optArgs$inverseStrand unstranded=optArgs$unstranded -symetric= optArgs$`symetric-scale` +symetric=optArgs$`symetric-scale` annotation_color_by_strand=optArgs$`annotation-color-by-strand` annotation_placed_by_strand=optArgs$`annotation-placed-by-strand` smoothLength=optArgs$`smoothLength` +resolution=optArgs$`resolution` +fileType=optArgs$`fileType` + ################### ## MAIN ################################################################################### - - genome_info=scanBamHeader(files[1])[[1]]$targets - noms_chromosomes = names(genome_info) - longueur_chromosomes = as.integer(genome_info) - nombre_chromosomes = length(noms_chromosomes) +genome_info=scanBamHeader(files[1])[[1]]$targets +noms_chromosomes=names(genome_info) +longueur_chromosomes=as.integer(genome_info) +nombre_chromosomes=length(noms_chromosomes) -##Case no chromosome specified - if(sum(is.na(chrName))) { - chrName = noms_chromosomes[1] - write(paste("No chromosome specified, processing chromosome :",chrName,"\n",sep=""),stderr()) - } - +## Case no chromosome specified +if(sum(is.na(chrName))) { + chrName=noms_chromosomes[1] + write(paste("No chromosome specified, processing chromosome :",chrName,"\n",sep=""),stderr()) +} + +## Case false chromosome name +if(!(chrName %in% noms_chromosomes)) { + stop(paste("\"",chrName,"\" is not a proper chromosome name")) +} -##Case false chromosome name - if(!(chrName %in% noms_chromosomes)) { - stop(paste("\"",chrName,"\" is not a proper chromosome name")) - } - - if(is.na(end)) { - end = longueur_chromosomes[chrName==noms_chromosomes] - } - - if(start > end) { - stop("The start is bigger than the end!") - } +if(is.na(end)) { + end=longueur_chromosomes[chrName==noms_chromosomes] +} + +if(start > end) { + stop("The start is bigger than the end!") +} -##Case asked coordinates outside the chromosome - if(start<0|end>longueur_chromosomes[chrName==noms_chromosomes]) { - stop("You are outside the chromosome") - } - - if(sum(is.na(weight))>0) { - normalized_data=NULL - isnormalized=FALSE - } - else { - isnormalized=TRUE - normalized_data=unlist(strsplit(weight,split=",")) - if(length(files)!=length(normalized_data)) { - stop("Different number of files and weights ") - } - else { - normalized_data=as.numeric(normalized_data) - } - } +## Case asked coordinates outside the chromosome +if(start<0|end>longueur_chromosomes[chrName==noms_chromosomes]) { + stop("You are outside the chromosome") +} - if(inverseStrand) { - libraryType="inverse" +if(sum(is.na(weight))>0) { + normalized_data=NULL + isnormalized=FALSE +} +else { + isnormalized=TRUE + normalized_data=unlist(strsplit(weight,split=",")) + if(length(files)!=length(normalized_data)) { + stop("Different number of files and weights ") } else { - libraryType="standard" - } -##Read the bam file and extract the infos - doit=function(i,libraryType) { - - try(readBam_(files[i], libraryType=libraryType, chrName_=chrName, from_=start, to_=end,normalized_=normalized_data[i])) - } - data=lapply(1:length(files),doit,libraryType=libraryType) - ctrl=unlist(lapply(data,class))=="try-error" - - if(sum(ctrl)>0) { - for(i in which(ctrl)) { - write(paste("Problem with file :",files[i],"\n",sep=""),stderr()) - } - stop("At least a file has problem") + normalized_data=as.numeric(normalized_data) } +} -##Read the GFF file and extract the infos - if(sum(is.na(annotation))==0) { - gff=try(readGff(annotation[1],from=start,to=end,chr=chrName),TRUE) - ctrl=class(gff)=="try-error" - if(sum(ctrl)>0) { - stop(paste("Problem with Gff file :",annotation,"\n")) - } - if(length(annotation)>1) { - for(i in 2:length(annotation)) { - gff1=try(readGff(annotation[i],from=start,to=end,chr=chrName),TRUE) - ctrl=class(gff1)=="try-error" - if(sum(ctrl)>0) { - stop(paste("Problem with Gff file :",gff1,"\n")) - } +if(inverseStrand) { + libraryType="inverse" +} +else { + libraryType="standard" +} +## Read the bam file and extract the infos +doit=function(i,libraryType) { + try(readBam_(files[i], libraryType=libraryType, chrName_=chrName, from_=start, to_=end,normalized_=normalized_data[i])) +} +data=lapply(1:length(files),doit,libraryType=libraryType) +ctrl=unlist(lapply(data,class))=="try-error" + +if(sum(ctrl)>0) { + for(i in which(ctrl)) { + write(paste("Problem with file :",files[i],"\n",sep=""),stderr()) + } + stop("At least a file has problem") +} + +## Read the GFF file and extract the infos +if(sum(is.na(annotation))==0) { + gff=try(readGff(annotation[1],from=start,to=end,chr=chrName),TRUE) + ctrl=class(gff)=="try-error" + if(sum(ctrl)>0) { + stop(paste("Problem with Gff file :",annotation,"\n")) + } + if(length(annotation)>1) { + for(i in 2:length(annotation)) { + gff1=try(readGff(annotation[i],from=start,to=end,chr=chrName),TRUE) + ctrl=class(gff1)=="try-error" + if(sum(ctrl)>0) { + stop(paste("Problem with Gff file :",gff1,"\n")) + } + if(sum(is.na(gff1))==0) { gff=rbind(gff,gff1) } } } - else { - gff=NA - } - - if(sum(is.na(description_data))>0) { - description_data = files - } +} +else { + gff=NA +} + +if(sum(is.na(description_data))>0) { + description_data=files +} - ##Case of different number of files and description - if(length(description_data)!=length(files)) { - stop("Different number of files and description") - } - -## pooling bam files (if necessary) - description_data_unique=unique(description_data) - data_pooled=list() - for(i_descri in description_data_unique) { - i_data_voulu=which(description_data==i_descri) - data_pooled_=data[[i_data_voulu[1]]] - if(length(i_data_voulu)>1) { - for(i_i_data_voulu in 2:length(i_data_voulu)) { - data_pooled_=addBam(data_pooled_,data[[i_data_voulu[i_i_data_voulu]]]) - } - } - data_pooled[[i_descri]]=data_pooled_ - } +## Case of different number of files and description +if(length(description_data)!=length(files)) { + stop("Different number of files and description") +} - if(sum(is.na(typeTranscript)>0)) { - if(sum(is.na(annotation))==0) { - typeTranscritSplit=unique(gff$Type) - }else { - typeTranscritSplit=NA - } - }else { - typeTranscritSplit=unlist(strsplit(typeTranscript,split=",")) - } - - if((sum(is.na(gff))>0)|(sum(is.na(typeTranscritSplit))>0)) { - annot=NULL - }else { - annot = gff[gff$Type %in% typeTranscritSplit,] - } - -##check the colors - - if(sum(is.na(colors)>0)) { - nTypeTranscrit = length(typeTranscritSplit) - if(annotation_color_by_strand) { - colorsSplit=c(classic_plus_color,classic_minus_color) - } - else { - colorsSplit=rainbow(nTypeTranscrit+1)[1:nTypeTranscrit] +## Pooling bam files (if necessary) +description_data_unique=unique(description_data) +data_pooled=list() +for(i_descri in description_data_unique) { + i_data_voulu=which(description_data==i_descri) + data_pooled_=data[[i_data_voulu[1]]] + if(length(i_data_voulu)>1) { + for(i_i_data_voulu in 2:length(i_data_voulu)) { + data_pooled_=addBam(data_pooled_,data[[i_data_voulu[i_i_data_voulu]]]) } } + data_pooled[[i_descri]]=data_pooled_ +} + +if(sum(is.na(typeTranscript)>0)) { + if(sum(is.na(annotation))==0) { + typeTranscritSplit=unique(gff$Type) + } else { - colorsSplit=unlist(strsplit(colors,split=",")) + typeTranscritSplit=NA } - for(i in 1:length(colorsSplit)) { - tmp=unlist(strsplit(colorsSplit[i],split="")) - if(length(tmp)==6|length(tmp)==8) { - if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { - colorsSplit[i] = paste("#",colorsSplit[i],sep="") - } - } +} +else { + typeTranscritSplit=unlist(strsplit(typeTranscript,split=",")) +} + +if((sum(is.na(gff))>0)|(sum(is.na(typeTranscritSplit))>0)) { + annot=NULL +} +else { + annot=gff[gff$Type %in% typeTranscritSplit,] +} + +## Check the colors +if(sum(is.na(colors)>0)) { + nTypeTranscrit=length(typeTranscritSplit) + if(annotation_color_by_strand) { + colorsSplit=c(classic_plus_color,classic_minus_color) } - - tmp=unlist(strsplit(classic_plus_color,split="")) + else { + colorsSplit=rainbow(nTypeTranscrit+1)[1:nTypeTranscrit] + } +} +else { + colorsSplit=unlist(strsplit(colors,split=",")) +} +for(i in 1:length(colorsSplit)) { + tmp=unlist(strsplit(colorsSplit[i],split="")) if(length(tmp)==6|length(tmp)==8) { if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { - classic_plus_color = paste("#",classic_plus_color,sep="") + colorsSplit[i]=paste("#",colorsSplit[i],sep="") } } - tmp=unlist(strsplit(classic_minus_color,split="")) - if(length(tmp)==6|length(tmp)==8) { - if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { - classic_minus_color = paste("#",classic_minus_color,sep="") - } +} + +tmp=unlist(strsplit(classic_plus_color,split="")) +if(length(tmp)==6|length(tmp)==8) { + if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { + classic_plus_color=paste("#",classic_plus_color,sep="") + } +} +tmp=unlist(strsplit(classic_minus_color,split="")) +if(length(tmp)==6|length(tmp)==8) { + if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { + classic_minus_color=paste("#",classic_minus_color,sep="") + } +} +tmp=unlist(strsplit(heatmap_max_color,split="")) +if(length(tmp)==6|length(tmp)==8) { + if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { + heatmap_max_color=paste("#",heatmap_max_color,sep="") + } +} +tmp=unlist(strsplit(heatmap_min_color,split="")) +if(length(tmp)==6|length(tmp)==8) { + if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { + heatmap_min_color=paste("#",heatmap_min_color,sep="") } - tmp=unlist(strsplit(heatmap_max_color,split="")) - if(length(tmp)==6|length(tmp)==8) { - if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { - heatmap_max_color = paste("#",heatmap_max_color,sep="") - } - } - tmp=unlist(strsplit(heatmap_min_color,split="")) - if(length(tmp)==6|length(tmp)==8) { - if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { - heatmap_min_color = paste("#",heatmap_min_color,sep="") +} +if(sum(is.na(lines_samples_colors))>0) { + lines_samples_colors_split=c(1,3,4,2)[((0:(length(files)-1))%%4)+1] +} +else { + lines_samples_colors_split=unlist(strsplit(lines_samples_colors,split=",")) + for(i in 1:length(lines_samples_colors_split)) { + tmp=unlist(strsplit(lines_samples_colors_split[i],split="")) + if(length(tmp)==6|length(tmp)==8) { + if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { + lines_samples_colors_split[i]=paste("#",lines_samples_colors_split[i],sep="") + } + } + } +} +colorToCheck=c(colorsSplit,classic_plus_color,classic_minus_color,heatmap_max_color,heatmap_min_color,lines_samples_colors_split) +ctrl=sapply(colorToCheck,is.acceptable.color) +if(sum(!ctrl)>0) { + for(i in which(!ctrl)) { + write(paste("\"",colorToCheck[i],"\" is not a proper color name.\n",sep=""),stderr()) + } + stop("At least one color has a problem") +} +if(annotation_color_by_strand) { + if(length(colorsSplit)>2) { + stop("You have to specify two and only two colors!") + } +} +else { + if(length(typeTranscritSplit)!=length(colorsSplit)) { + stop("Please specify the same number of transcript types and colors") + } +} +## Check the line types +if(sum(is.na(lines_samples_type_line))>0) { + lines_samples_type_line_split=((0:(length(files)-1))%/%4)+1 +} +else { + lines_samples_type_line_split=unlist(strsplit(lines_samples_type_line,split=",")) +} +if(typeVisu=="lines") { + ctrl=sapply(lines_samples_type_line_split,function(x) { + tmp=suppressWarnings(as.numeric(x)) + if(!is.na(tmp)) { + return((tmp==floor(tmp))&tmp>=1&tmp<=5) } - } - if(sum(is.na(lines_samples_colors))>0) { - lines_samples_colors_split=c(1,3,4,2)[((0:(length(files)-1))%%4)+1] - } - else{ - lines_samples_colors_split=unlist(strsplit(lines_samples_colors,split=",")) - for(i in 1:length(lines_samples_colors_split)) { - tmp=unlist(strsplit(lines_samples_colors_split[i],split="")) - if(length(tmp)==6|length(tmp)==8) { - if(sum(tmp %in% c(0:9,"A","B","C","D","E","F"))==length(tmp)) { - lines_samples_colors_split[i] = paste("#",lines_samples_colors_split[i],sep="") - } - } + else { + return(FALSE) } - } - colorToCheck=c(colorsSplit,classic_plus_color,classic_minus_color,heatmap_max_color,heatmap_min_color,lines_samples_colors_split) - ctrl=sapply(colorToCheck,is.acceptable.color) + }) if(sum(!ctrl)>0) { for(i in which(!ctrl)) { - write(paste("\"",colorToCheck[i],"\" is not a proper color name.\n",sep=""),stderr()) - } - stop("At least one color has problem") - } - if(annotation_color_by_strand){ - if(length(colorsSplit)>2) { - stop("You have to specify two and only two colors!") + write(paste("\"",lines_samples_type_line_split[i],"\" is not a proper line style.\n",sep=""),stderr()) } - }else { - if(length(typeTranscritSplit)!=length(colorsSplit)) { - stop("Please specify the same number of transcript types and colors") - } - } -##check the line types - if(sum(is.na(lines_samples_type_line))>0) { - lines_samples_type_line_split=((0:(length(files)-1))%/%4)+1 - } - else { - lines_samples_type_line_split=unlist(strsplit(lines_samples_type_line,split=",")) + stop("At least one line style has problem") } - if(typeVisu=="lines") { - ctrl=sapply(lines_samples_type_line_split,function(x) { - tmp=suppressWarnings(as.numeric(x)) - if(!is.na(tmp)) { - return((tmp==floor(tmp))&tmp>=1&tmp<=5) - } - else { - return(FALSE) - } - } - ) - if(sum(!ctrl)>0) { - for(i in which(!ctrl)) { - write(paste("\"",lines_samples_type_line_split[i],"\" is not a proper line style.\n",sep=""),stderr()) - } - stop("At least one line style has problem") - } - lines_samples_type_line_split=as.integer(lines_samples_type_line_split) - } -##check the shapes + lines_samples_type_line_split=as.integer(lines_samples_type_line_split) +} +## Check the shapes type_shape=rep(1,length(typeTranscritSplit)) if(sum(is.na(shape_data)>0)) { + for( i in 1:length(typeTranscritSplit)) { + type_shape[i]="box" + } +} +else { + shape=unlist(strsplit(shape_data,split=",")) + shape=as.array(shape) + if(length(typeTranscritSplit)!=length(shape)) { + stop("Please specify the same number of transcript types and shapes") + } + else { for( i in 1:length(typeTranscritSplit)) { - type_shape[i]="box" + type_shape[i]=shape[[i]] } -}else{ - shape=unlist(strsplit(shape_data,split=",")) - shape=as.array(shape) - if(length(typeTranscritSplit)!=length(shape)) { - stop("Please specify the same number of transcript types and shapes") - }else { - for( i in 1:length(typeTranscritSplit)){ - type_shape[i]= shape[[i]] - } } } - -##Style for the annotation +## Style for the annotation label=rep(1,length(typeTranscritSplit)) -style=data.frame(Type=c(typeTranscritSplit,typeTranscritSplit),Strand=c(rep("+",length(typeTranscritSplit)),rep("-",length(typeTranscritSplit))),col=NA,shape=NA,label,stringsAsFactors = FALSE) +style=data.frame(Type=c(typeTranscritSplit,typeTranscritSplit),Strand=c(rep("+",length(typeTranscritSplit)),rep("-",length(typeTranscritSplit))),col=NA,shape=NA,label,stringsAsFactors=FALSE) for(i in 1:length(typeTranscritSplit)) { style$shape[style$Type==typeTranscritSplit[i]]=type_shape[i] } @@ -1801,27 +1696,27 @@ } } -##Main for visualisation +## Main for visualisation databychr=extractSignal(data_pooled,chrName,from=start,to=end,normalized_=isnormalized) { reverse=list() forward=list() if(log) { - for(i in 1:length(databychr$F)){ - forward_ = numeric() - tmp=log2(1+databychr$F[[i]]) - forward_[1:length(tmp)]=tmp - forward[[i]]=forward_ - reverse_ = numeric() - tmp=log2(1+databychr$R[[i]]) - reverse_[1:length(tmp)]=tmp - reverse[[i]]=reverse_ - } + for(i in 1:length(databychr$F)) { + forward_=numeric() + tmp=log2(1+databychr$F[[i]]) + forward_[1:length(tmp)]=tmp + forward[[i]]=forward_ + reverse_=numeric() + tmp=log2(1+databychr$R[[i]]) + reverse_[1:length(tmp)]=tmp + reverse[[i]]=reverse_ + } } else { - forward=databychr$F - reverse=databychr$R + forward=databychr$F + reverse=databychr$R } } @@ -1832,25 +1727,40 @@ smoothLength=smoothLength[1] } if(unstranded) { - for(i in 1:length(databychr$F)){ - lo=smooth(forward[[i]]+reverse[[i]],L=smoothLength) + for(i in 1:length(databychr$F)) { + if(smoothLength>0) { + lo=smooth(forward[[i]]+reverse[[i]],L=smoothLength) + } + else { + lo=forward[[i]]+reverse[[i]] + } forward[[i]]=lo los=rep(0,length(lo)) reverse[[i]]=los } } else { - for(i in 1:length(databychr$F)){ - lo=smooth(forward[[i]],L=smoothLength) + for(i in 1:length(databychr$F)) { + if(smoothLength>0) { + lo=smooth(forward[[i]],L=smoothLength) + } + else { + lo=forward[[i]] + } forward[[i]]=lo - los=smooth(reverse[[i]],L=smoothLength) + if(smoothLength>0) { + los=smooth(reverse[[i]],L=smoothLength) + } + else { + los=reverse[[i]] + } reverse[[i]]=los } } -group_maximum = rep(1,length(databychr$F)) -max_forward = numeric(length(databychr$F)) -max_reverse = numeric(length(databychr$F)) +group_maximum=rep(1,length(databychr$F)) +max_forward=numeric(length(databychr$F)) +max_reverse=numeric(length(databychr$F)) for(i_data in 1:length(databychr$F)) { max_forward[i_data]=max(forward[[i_data]],na.rm=TRUE) max_reverse[i_data]=max(reverse[[i_data]],na.rm=TRUE) @@ -1860,13 +1770,23 @@ max_reverse[group_maximum==i_max]=max(max_reverse[group_maximum==i_max],na.rm=TRUE) } -if(symetric){ +if(symetric) { for(i_data in 1:length(databychr$F)) { max_forward[i_data]=max(max_forward[i_data],max_reverse[i_data],na.rm=TRUE) } max_reverse=max_forward } +if(is.na(fileType)) { + imageFileSplit=unlist(strsplit(imagefile,split=".",fixed=TRUE)) + if(length(imageFileSplit)>=2) { + fileType=imageFileSplit[length(imageFileSplit)] + } + else { + fileType="png" + } +} + plotVisu(imagefile,typeVisu=typeVisu,listForward=forward,listReverse=reverse, debut_vue=start,fin_vue=end,chr=chrName,annot=annot,repeated=TRUE, titres=description_data_unique,name_flags="",style=style,log=log,stranded=!unstranded, @@ -1874,6 +1794,7 @@ classic_plus_color=classic_plus_color,classic_minus_color=classic_minus_color, heatmap_max_color=heatmap_max_color,heatmap_min_color=heatmap_min_color,heatmap_palette_method=heatmap_palette_method, lines_samples_colors=lines_samples_colors_split,lines_samples_type_line=lines_samples_type_line_split, -smoothLength=1,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand) +smoothLength=1,annotation_color_by_strand=annotation_color_by_strand,annotation_placed_by_strand=annotation_placed_by_strand, +fileType=fileType,resolutionDPI=resolution) } diff -r 5ed41b948030 -r 0cec66d7f637 ving_wrapper.xml --- a/ving_wrapper.xml Tue Jul 29 09:47:47 2014 -0400 +++ b/ving_wrapper.xml Tue Jul 28 05:44:48 2015 -0400 @@ -1,6 +1,7 @@ - + R + Samtools Bioconductor GenomicRanges Rsamtools @@ -8,14 +9,33 @@ Visualization of NGS data +#silent sys.stderr.write("!!!! Cheetah Template Variables !!!!\n") +#silent sys.stderr.write(" searchList = '%s'\n" % (str($searchList)) ) +#silent sys.stderr.write("!!!! end-of-list !!!!\n") ./ving.R - #for $i in $series - #set $index = os.popen("samtools index " + str($i.bamfile) + " " + str($i.bamfile) + ".bai") - ${i.bamfile} --description-data=${i.description} - #end for - --output=$output - #if $normalizationCoefficient - --normalization-coefficient=$normalizationCoefficient + #set $norm='' + #set $coef=[] + #if $norm_coeff.condition: + #for $i, $input in enumerate ( $norm_coeff.series ): + #set $index = os.popen("samtools index " + str($input.bamfile) + " " + str($input.bamfile) + ".bai") + ${input.bamfile} + #if str($input.description) != '': + --description-data=${input.description} + #end if + #silent $coef.append( str($input.normalizationCoefficient)) + #end for + #set $norm=','.join($coef) + + --normalization-coefficient=$norm + #else: + #for $i, $input in enumerate ( $norm_coeff.series ): + #set $index = os.popen("samtools index " + str($input.bamfile) + " " + str($input.bamfile) + ".bai") + ${input.bamfile} + #if str($input.description) != '': + --description-data=${input.description} + #end if + #end for + #end if #if str($strand) == 'yes': --inverseStrand @@ -53,21 +73,75 @@ #if $smoothLength --smoothLength=$smoothLength #end if - + --fileType=$filetype + --resolution=$resolution + #if str($filetype) == 'png' : + --output=$output_png + #elif str($filetype) == 'bmp' : + --output=$output_bmp + #elif str($filetype) == 'jpeg' : + --output=$output_jpg + #elif str($filetype) == 'tiff' : + --output=$output_tiff + #elif str($filetype) == 'pdf' : + --output=$output_pdf + #end if + - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - - + + + + + + + + + + + + + + @@ -77,19 +151,19 @@ - + - - - - + + + + - - - - - + + + + + @@ -97,13 +171,13 @@ - - + + - - + + @@ -111,16 +185,46 @@ - - + + - + + + + + + + + + + + + + + + - + + + filetype == 'png' + + + filetype == 'jpeg' + + + filetype == 'bmp' + + + filetype == 'tiff' + + + filetype == 'pdf' + +