=head1 NAME

iPE::SequenceReader::Load::FASTA - Simple FASTA sequence reader.

=head1 DESCRIPTION

This is a semi-flexible FASTA sequence reader.  It can optionally split each sequence into an array of scalars.  It keeps only the current FASTA entry in memory and discards it when the next entry is read.
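
As a rough illustration of that one-entry-at-a-time behaviour, the following standalone sketch reads FASTA text from an in-memory filehandle, holding only the current entry plus the header of the next one.  It does not use this module; the input string and the variable names are made up for the example.

    use strict;
    use warnings;

    # Hypothetical in-memory input standing in for a FASTA file.
    my $fasta = ">seq1 first\nACGT\nAC\n>seq2 second\nTTGA\n";
    open my $fh, '<', \$fasta or die "cannot open in-memory file: $!";

    # Skip to the first header and remember it as the pending definition.
    my $next_def;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ s/^>//) { $next_def = $line; last; }
    }

    # Each pass consumes one entry.  The header of the following entry is
    # found while reading and is kept for the next pass.
    my ($cur_def, $cur_seq, $count) = (undef, undef, 0);
    while (defined $next_def) {
        ($cur_def, $cur_seq) = ($next_def, "");
        undef $next_def;
        while (my $line = <$fh>) {
            chomp $line;
            if ($line =~ s/^>//) { $next_def = $line; last; }
            $cur_seq .= $line if $line =~ /\S/;
        }
        $cur_seq =~ s/\s//g;
        $count++;
        print "$cur_def\t$cur_seq\n";
    }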

=head1 FUNCTIONS

=over 8

=cut

package iPE::SequenceReader::Load::FASTA;
use iPE;
use base("iPE::SequenceReader::Load");
use strict;

=item new(memberHash)

The constructor takes a hash reference.  A filehandle for the sequence file that is already open may be passed as a typeglob reference, for example \*STDIN.  The constructor reads the first entry and dies if the file contains none.

The following keys are required to instantiate iPE::SequenceReader::Load::FASTA:

=over 8

=item filename

The name of the file to parse

=back

The following are optional keys:

=over 8

=item fh

Filehandle of the file if it is already opened.

=item split_string

If the split_string key is defined, each sequence is split on that string into an array, which can be accessed via arrRef ().

=back

=item def (), seqRef (), arrRef ()

def () returns the definition line of the current sequence.
seqRef () returns a reference to the string of the current sequence, with whitespace removed.
If split_string was supplied in new (), arrRef () returns a reference to the array of items in the sequence split on that string, and seqRef () is not populated.
The sequence data is returned by reference, since duplicating a large sequence can be costly.
After next () has moved past the last entry, all three return undef.
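
The standalone sketch below illustrates how values of the kind returned by seqRef () and arrRef () are dereferenced.  It does not call this module; the sample string, the "," separator, and the variable names are assumptions made for the example.

    use strict;
    use warnings;

    # Hypothetical entry body, as joined from the lines of one entry.
    my $cur_seq = "12,15,40,7";

    # Without split_string: a reference to the whole string.
    my $seq_ref = \$cur_seq;
    print length($$seq_ref), "\n";

    # With split_string set to ",": a reference to the split items.
    my @cur_arr = split ",", $cur_seq;
    my $arr_ref = \@cur_arr;
    print scalar(@$arr_ref), " items, first is $arr_ref->[0]\n";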

=cut
sub new
{
    my $class = shift;
    my ($m) = @_;
    my $this = $class->SUPER::new(@_);

    my $fh = $this->{fh_};
    while(<$fh>) {
        if (/^>/) {
            #strip the newline so the first header matches later ones
            chomp;
            $this->{next_def_} = $_;
            $this->{next_def_} =~ s/^>//;
            last;
        }
    }

    $this->next;

    die "Empty fasta file\n" 
        if not defined $this->{cur_def_};

    return $this;
}

sub def    { shift->{cur_def_} }
sub seqRef { shift->{cur_seq_} }
sub arrRef { shift->{cur_arr_} }

sub next {
    my ($this) = @_;

    #if the next definition is undefined, then the last call to 'next'
    #hit the end of the file, and now we have nothing.
    if(!defined $this->{next_def_}) {
        $this->{cur_def_} = $this->{cur_seq_} = $this->{cur_arr_} = 
            undef;
        return $this;
    }
    #the header of this entry was discovered while reading through the
    #previous sequence, so we promote it to the current definition
    #here before reading any further.
    $this->{cur_def_} = $this->{next_def_};
    $this->{next_def_} = undef;
    
    my $fh = $this->{fh_};
    my $cur_seq = "";

    while(<$fh>) {
        chomp;
        #s/// returns a count, not the string, so strip '>' and then store $_
        if(/^>/) { s/^>//; $this->{next_def_} = $_; last; }
        else     { $cur_seq .= $_ if(/\S/); }
    }

    if(defined $this->{split_string_}) {
        my @cur_arr = split $this->{split_string_}, $cur_seq;
        $this->{cur_arr_} = \@cur_arr;
    }
    else {
        $cur_seq =~ s/\s//g;
        $this->{cur_seq_} = \$cur_seq;
    }

    return $this;
}

=back

=head1 SEE ALSO

L<iPE::SequenceReader>, L<iPE::SequenceReader::Load>

=head1 AUTHOR

Bob Zimmermann (rpz@cs.wustl.edu).

=cut

1;
