dongsheng@623: #!/usr/bin/perl -w
dongsheng@623: 
dongsheng@623: require Exporter;
dongsheng@623: 
dongsheng@623: package Locale::Po4a::TransTractor;
dongsheng@623: use DynaLoader;
dongsheng@623: 
dongsheng@623: use 5.006;
dongsheng@623: use strict;
dongsheng@623: use warnings;
dongsheng@623: 
dongsheng@623: use subs qw(makespace);
dongsheng@623: use vars qw($VERSION @ISA @EXPORT);
dongsheng@623: $VERSION="0.36";
dongsheng@623: @ISA = qw(DynaLoader);
dongsheng@623: @EXPORT = qw(new process translate 
dongsheng@623:              read write readpo writepo
dongsheng@623:              getpoout setpoout);
dongsheng@623: 
dongsheng@623: # Try to use a C extension if present.
dongsheng@623: eval("bootstrap Locale::Po4a::TransTractor $VERSION");
dongsheng@623: 
dongsheng@623: use Carp qw(croak);
dongsheng@623: use Locale::Po4a::Po;
dongsheng@623: use Locale::Po4a::Common;
dongsheng@623: 
dongsheng@623: use File::Path; # mkdir before write
dongsheng@623: 
dongsheng@623: use Encode;
dongsheng@623: use Encode::Guess;
dongsheng@623: 
dongsheng@623: =head1 NAME
dongsheng@623: 
dongsheng@623: Locale::Po4a::TransTractor - Generic trans(lator ex)tractor.
dongsheng@623: 
dongsheng@623: =head1 DESCRIPTION
dongsheng@623: 
dongsheng@623: The po4a (po for anything) project goal is to ease translations (and more
dongsheng@623: interestingly, the maintenance of translations) using gettext tools on
dongsheng@623: areas where they were not expected like documentation.
dongsheng@623: 
dongsheng@623: This class is the ancestor of every po4a parsers used to parse a document to
dongsheng@623: search translatable strings, extract them to a po file and replace them by
dongsheng@623: their translation in the output document. 
dongsheng@623: 
dongsheng@623: More formally, it takes the following arguments as input:
dongsheng@623: 
dongsheng@623: =over 2
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: a document to translate ;
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: a po file containing the translations to use.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: As output, it produces:
dongsheng@623: 
dongsheng@623: =over 2
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: another po file, resulting of the extraction of translatable strings from
dongsheng@623: the input document ;
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: a translated document, with the same structure than the one in input, but
dongsheng@623: with all translatable strings replaced with the translations found in the
dongsheng@623: po file provided in input.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: Here is a graphical representation of this:
dongsheng@623: 
dongsheng@623:    Input document --\                             /---> Output document
dongsheng@623:                      \                           /       (translated)
dongsheng@623:                       +-> parse() function -----+
dongsheng@623:                      /                           \
dongsheng@623:    Input po --------/                             \---> Output po
dongsheng@623:                                                          (extracted)
dongsheng@623: 
dongsheng@623: =head1 FUNCTIONS YOUR PARSER SHOULD OVERRIDE
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item parse()
dongsheng@623: 
dongsheng@623: This is where all the work takes place: the parsing of input documents, the
dongsheng@623: generation of output, and the extraction of the translatable strings. This
dongsheng@623: is pretty simple using the provided functions presented in the section
dongsheng@623: "INTERNAL FUNCTIONS" below. See also the synopsis, which present an
dongsheng@623: example.
dongsheng@623: 
dongsheng@623: This function is called by the process() function bellow, but if you choose
dongsheng@623: to use the new() function, and to add content manually to your document,
dongsheng@623: you will have to call this function yourself.
dongsheng@623: 
dongsheng@623: =item docheader()
dongsheng@623: 
dongsheng@623: This function returns the header we should add to the produced document,
dongsheng@623: quoted properly to be a comment in the target language.  See the section
dongsheng@623: "Educating developers about translations", from L<po4a(7)|po4a.7>, for what
dongsheng@623: it is good for.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub docheader {}
dongsheng@623: 
dongsheng@623: sub parse {}
dongsheng@623: 
dongsheng@623: =head1 SYNOPSIS
dongsheng@623: 
dongsheng@623: The following example parses a list of paragraphs beginning with "<p>". For the sake
dongsheng@623: of simplicity, we assume that the document is well formatted, i.e. that '<p>'
dongsheng@623: tags are the only tags present, and that this tag is at the very beginning
dongsheng@623: of each paragraph.
dongsheng@623: 
dongsheng@623:  sub parse {
dongsheng@623:    my $self = shift;
dongsheng@623: 
dongsheng@623:    PARAGRAPH: while (1) {
dongsheng@623:        my ($paragraph,$pararef)=("","");
dongsheng@623:        my $first=1;
dongsheng@623:        my ($line,$lref)=$self->shiftline();
dongsheng@623:        while (defined($line)) {
dongsheng@623: 	   if ($line =~ m/<p>/ && !$first--; ) {
dongsheng@623: 	       # Not the first time we see <p>. 
dongsheng@623: 	       # Reput the current line in input,
dongsheng@623: 	       #  and put the built paragraph to output
dongsheng@623: 	       $self->unshiftline($line,$lref);
dongsheng@623: 	      
dongsheng@623: 	       # Now that the document is formed, translate it:
dongsheng@623: 	       #   - Remove the leading tag
dongsheng@623: 	       $paragraph =~ s/^<p>//s;
dongsheng@623: 
dongsheng@623: 	       #   - push to output the leading tag (untranslated) and the
dongsheng@623: 	       #     rest of the paragraph (translated)
dongsheng@623: 	       $self->pushline(  "<p>"
dongsheng@623:                                . $document->translate($paragraph,$pararef)
dongsheng@623:                                );
dongsheng@623: 
dongsheng@623:  	       next PARAGRAPH;
dongsheng@623: 	   } else {
dongsheng@623: 	       # Append to the paragraph
dongsheng@623: 	       $paragraph .= $line;
dongsheng@623: 	       $pararef = $lref unless(length($pararef));
dongsheng@623: 	   }
dongsheng@623: 
dongsheng@623:            # Reinit the loop
dongsheng@623:            ($line,$lref)=$self->shiftline();
dongsheng@623:        }
dongsheng@623:        # Did not get a defined line? End of input file.
dongsheng@623:        return;
dongsheng@623:    }
dongsheng@623:  } 
dongsheng@623: 
dongsheng@623: Once you've implemented the parse function, you can use your document
dongsheng@623: class, using the public interface presented in the next section.
dongsheng@623: 
dongsheng@623: =head1 PUBLIC INTERFACE for scripts using your parser
dongsheng@623: 
dongsheng@623: =head2 Constructor
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item process(%)
dongsheng@623: 
dongsheng@623: This function can do all you need to do with a po4a document in one
dongsheng@623: invocation. Its arguments must be packed as a hash. ACTIONS:
dongsheng@623: 
dongsheng@623: =over 3
dongsheng@623: 
dongsheng@623: =item a.
dongsheng@623: 
dongsheng@623: Reads all the po files specified in po_in_name
dongsheng@623: 
dongsheng@623: =item b.
dongsheng@623: 
dongsheng@623: Reads all original documents specified in file_in_name
dongsheng@623: 
dongsheng@623: =item c.
dongsheng@623: 
dongsheng@623: Parses the document
dongsheng@623: 
dongsheng@623: =item d.
dongsheng@623: 
dongsheng@623: Reads and applies all the addenda specified
dongsheng@623: 
dongsheng@623: =item e.
dongsheng@623: 
dongsheng@623: Writes the translated document to file_out_name (if given)
dongsheng@623: 
dongsheng@623: =item f.
dongsheng@623: 
dongsheng@623: Writes the extracted po file to po_out_name (if given)
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: ARGUMENTS, beside the ones accepted by new() (with expected type):
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item file_in_name (@)
dongsheng@623: 
dongsheng@623: List of filenames where we should read the input document.
dongsheng@623: 
dongsheng@623: =item file_in_charset ($)
dongsheng@623: 
dongsheng@623: Charset used in the input document (if it isn't specified, it will try
dongsheng@623: to detect it from the input document).
dongsheng@623: 
dongsheng@623: =item file_out_name ($)
dongsheng@623: 
dongsheng@623: Filename where we should write the output document.
dongsheng@623: 
dongsheng@623: =item file_out_charset ($)
dongsheng@623: 
dongsheng@623: Charset used in the output document (if it isn't specified, it will use
dongsheng@623: the po file charset).
dongsheng@623: 
dongsheng@623: =item po_in_name (@)
dongsheng@623: 
dongsheng@623: List of filenames where we should read the input po files from, containing
dongsheng@623: the translation which will be used to translate the document.
dongsheng@623: 
dongsheng@623: =item po_out_name ($)
dongsheng@623: 
dongsheng@623: Filename where we should write the output po file, containing the strings
dongsheng@623: extracted from the input document.
dongsheng@623: 
dongsheng@623: =item addendum (@)
dongsheng@623: 
dongsheng@623: List of filenames where we should read the addenda from.
dongsheng@623: 
dongsheng@623: =item addendum_charset ($)
dongsheng@623: 
dongsheng@623: Charset for the addenda.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =item new(%)
dongsheng@623: 
dongsheng@623: Create a new Po4a document. Accepted options (but be in a hash):
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item verbose ($)
dongsheng@623: 
dongsheng@623: Sets the verbosity.
dongsheng@623: 
dongsheng@623: =item debug ($)
dongsheng@623: 
dongsheng@623: Sets the debugging.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub process {
dongsheng@623:     ## Determine if we were called via an object-ref or a classname
dongsheng@623:     my $self = shift;
dongsheng@623: 
dongsheng@623:     ## Any remaining arguments are treated as initial values for the
dongsheng@623:     ## hash that is used to represent this object.
dongsheng@623:     my %params = @_;
dongsheng@623:     
dongsheng@623:     # Build the args for new()
dongsheng@623:     my %newparams = ();
dongsheng@623:     foreach (keys %params) {
dongsheng@623: 	next if ($_ eq 'po_in_name' ||
dongsheng@623: 		 $_ eq 'po_out_name' ||
dongsheng@623: 		 $_ eq 'file_in_name' ||
dongsheng@623: 		 $_ eq 'file_in_charset' ||
dongsheng@623: 		 $_ eq 'file_out_name' ||
dongsheng@623: 		 $_ eq 'file_out_charset' ||
dongsheng@623: 		 $_ eq 'addendum' ||
dongsheng@623: 		 $_ eq 'addendum_charset');
dongsheng@623: 	$newparams{$_}=$params{$_};
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     $self->detected_charset($params{'file_in_charset'});
dongsheng@623:     $self->{TT}{'file_out_charset'}=$params{'file_out_charset'};
dongsheng@623:     if (defined($self->{TT}{'file_out_charset'}) and
dongsheng@623: 	length($self->{TT}{'file_out_charset'})) {
dongsheng@623: 	$self->{TT}{'file_out_encoder'} = find_encoding($self->{TT}{'file_out_charset'});
dongsheng@623:     }
dongsheng@623:     $self->{TT}{'addendum_charset'}=$params{'addendum_charset'};
dongsheng@623: 
dongsheng@623:     foreach my $file (@{$params{'po_in_name'}}) {
dongsheng@623: 	print STDERR "readpo($file)... " if $self->debug();
dongsheng@623: 	$self->readpo($file);
dongsheng@623: 	print STDERR "done.\n" if $self->debug()
dongsheng@623:     }
dongsheng@623:     foreach my $file (@{$params{'file_in_name'}}) {
dongsheng@623: 	print STDERR "read($file)..." if $self->debug();
dongsheng@623: 	$self->read($file);
dongsheng@623: 	print STDERR "done.\n"  if $self->debug();
dongsheng@623:     }
dongsheng@623:     print STDERR "parse..." if $self->debug();
dongsheng@623:     $self->parse();
dongsheng@623:     print STDERR "done.\n" if $self->debug();
dongsheng@623:     foreach my $file (@{$params{'addendum'}}) {
dongsheng@623: 	print STDERR "addendum($file)..." if $self->debug();
dongsheng@623: 	$self->addendum($file) || die "An addendum failed\n";
dongsheng@623: 	print STDERR "done.\n" if $self->debug();
dongsheng@623:     }
dongsheng@623:     if (defined $params{'file_out_name'}) {
dongsheng@623: 	print STDERR "write(".$params{'file_out_name'}.")... " 
dongsheng@623: 	    if $self->debug();
dongsheng@623: 	$self->write($params{'file_out_name'});
dongsheng@623: 	print STDERR "done.\n" if $self->debug();
dongsheng@623:     }
dongsheng@623:     if (defined $params{'po_out_name'}) {
dongsheng@623: 	print STDERR "writepo(".$params{'po_out_name'}.")... "
dongsheng@623: 	     if $self->debug();
dongsheng@623: 	$self->writepo($params{'po_out_name'});
dongsheng@623: 	print STDERR "done.\n" if $self->debug();
dongsheng@623:     }
dongsheng@623:     return $self;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: sub new {
dongsheng@623:     ## Determine if we were called via an object-ref or a classname
dongsheng@623:     my $this = shift;
dongsheng@623:     my $class = ref($this) || $this;
dongsheng@623:     my $self = { };
dongsheng@623:     my %options=@_;
dongsheng@623:     ## Bless ourselves into the desired class and perform any initialization
dongsheng@623:     bless $self, $class;
dongsheng@623:     
dongsheng@623:     ## initialize the plugin
dongsheng@623:     # prevent the plugin from croaking on the options intended for Po.pm
dongsheng@623:     $self->{options}{'porefs'} = '';
dongsheng@623:     # let the plugin parse the options and such
dongsheng@623:     $self->initialize(%options);
dongsheng@623: 
dongsheng@623:     ## Create our private data
dongsheng@623:     my %po_options;
dongsheng@623:     $po_options{'porefs'} = $self->{options}{'porefs'};
dongsheng@623:     
dongsheng@623:     # private data
dongsheng@623:     $self->{TT}=(); 
dongsheng@623:     $self->{TT}{po_in}=Locale::Po4a::Po->new();
dongsheng@623:     $self->{TT}{po_out}=Locale::Po4a::Po->new(\%po_options);
dongsheng@623:     # Warning, this is an array of array:
dongsheng@623:     #  The document is splited on lines, and for each
dongsheng@623:     #  [0] is the line content, [1] is the reference [2] the type
dongsheng@623:     $self->{TT}{doc_in}=();
dongsheng@623:     $self->{TT}{doc_out}=();
dongsheng@623:     if (defined $options{'verbose'}) {
dongsheng@623: 	$self->{TT}{verbose}  =  $options{'verbose'};
dongsheng@623:     }
dongsheng@623:     if (defined $options{'debug'}) {
dongsheng@623: 	$self->{TT}{debug}  =  $options{'debug'};
dongsheng@623:     }
dongsheng@623:     # Input document is in ascii until we prove the opposite (in read())
dongsheng@623:     $self->{TT}{ascii_input}=1;
dongsheng@623:     # We try not to use utf unless it's forced from the outside (in case the
dongsheng@623:     # document isn't in ascii)
dongsheng@623:     $self->{TT}{utf_mode}=0;
dongsheng@623: 
dongsheng@623:     
dongsheng@623:     return $self;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =head2 Manipulating document files
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item read($)
dongsheng@623: 
dongsheng@623: Add another input document at the end of the existing one. The argument is
dongsheng@623: the filename to read. 
dongsheng@623: 
dongsheng@623: Please note that it does not parse anything. You should use the parse()
dongsheng@623: function when you're done with packing input files into the document. 
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: #'
dongsheng@623: sub read() {
dongsheng@623:     my $self=shift;
dongsheng@623:     my $filename=shift
dongsheng@623: 	or croak wrap_msg(dgettext("po4a", "Can't read from file without having a filename"));
dongsheng@623:     my $linenum=0;
dongsheng@623: 
dongsheng@623:     open INPUT,"<$filename" 
dongsheng@623: 	or croak wrap_msg(dgettext("po4a", "Can't read from %s: %s"), $filename, $!);
dongsheng@623:     while (defined (my $textline = <INPUT>)) {
dongsheng@623: 	$linenum++;
dongsheng@623: 	my $ref="$filename:$linenum";
dongsheng@623: 	my @entry=($textline,$ref);
dongsheng@623: 	push @{$self->{TT}{doc_in}}, @entry;
dongsheng@623: 
dongsheng@623: 	if (!defined($self->{TT}{'file_in_charset'})) {
dongsheng@623: 	    # Detect if this file has non-ascii characters
dongsheng@623: 	    if($self->{TT}{ascii_input}) {
dongsheng@623: 		my $decoder = guess_encoding($textline);
dongsheng@623: 		if (!ref($decoder) or $decoder !~ /Encode::XS=/) {
dongsheng@623: 		    # We have detected a non-ascii line
dongsheng@623: 		    $self->{TT}{ascii_input} = 0;
dongsheng@623: 		    # Save the reference for future error message
dongsheng@623: 		    $self->{TT}{non_ascii_ref} ||= $ref;
dongsheng@623: 		}
dongsheng@623: 	    }
dongsheng@623: 	}
dongsheng@623:     }
dongsheng@623:     close INPUT 
dongsheng@623: 	or croak wrap_msg(dgettext("po4a", "Can't close %s after reading: %s"), $filename, $!);
dongsheng@623: 
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =item write($)
dongsheng@623: 
dongsheng@623: Write the translated document to the given filename.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub write {
dongsheng@623:     my $self=shift;
dongsheng@623:     my $filename=shift
dongsheng@623: 	or croak wrap_msg(dgettext("po4a", "Can't write to a file without filename"));
dongsheng@623: 
dongsheng@623:     my $fh;
dongsheng@623:     if ($filename eq '-') {
dongsheng@623: 	$fh=\*STDOUT;
dongsheng@623:     } else {
dongsheng@623: 	# make sure the directory in which we should write the localized file exists
dongsheng@623: 	my $dir = $filename;
dongsheng@623: 	if ($dir =~ m|/|) {
dongsheng@623: 	    $dir =~ s|/[^/]*$||;
dongsheng@623: 	
dongsheng@623: 	    File::Path::mkpath($dir, 0, 0755) # Croaks on error
dongsheng@623: 	      if (length ($dir) && ! -e $dir);
dongsheng@623: 	}
dongsheng@623: 	open $fh,">$filename"
dongsheng@623: 	    or croak wrap_msg(dgettext("po4a", "Can't write to %s: %s"), $filename, $!);
dongsheng@623:     }
dongsheng@623:     
dongsheng@623:     map { print $fh $_ } $self->docheader();
dongsheng@623:     map { print $fh $_ } @{$self->{TT}{doc_out}};
dongsheng@623: 
dongsheng@623:     if ($filename ne '-') {
dongsheng@623: 	close $fh or croak wrap_msg(dgettext("po4a", "Can't close %s after writing: %s"), $filename, $!);
dongsheng@623:     }
dongsheng@623: 
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =head2 Manipulating po files
dongsheng@623: 
dongsheng@623: =over 4 
dongsheng@623: 
dongsheng@623: =item readpo($)
dongsheng@623: 
dongsheng@623: Add the content of a file (which name is passed in argument) to the
dongsheng@623: existing input po. The old content is not discarded.
dongsheng@623: 
dongsheng@623: =item writepo($)
dongsheng@623: 
dongsheng@623: Write the extracted po file to the given filename.
dongsheng@623: 
dongsheng@623: =item stats()
dongsheng@623: 
dongsheng@623: Returns some statistics about the translation done so far. Please note that
dongsheng@623: it's not the same statistics than the one printed by msgfmt
dongsheng@623: --statistic. Here, it's stats about recent usage of the po file, while
dongsheng@623: msgfmt reports the status of the file. It is a wrapper to the
dongsheng@623: Locale::Po4a::Po::stats_get function applied to the input po file. Example
dongsheng@623: of use:
dongsheng@623: 
dongsheng@623:     [normal use of the po4a document...]
dongsheng@623: 
dongsheng@623:     ($percent,$hit,$queries) = $document->stats();
dongsheng@623:     print "We found translations for $percent\%  ($hit from $queries) of strings.\n";
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub getpoout {
dongsheng@623:     return $_[0]->{TT}{po_out};
dongsheng@623: }
dongsheng@623: sub setpoout {
dongsheng@623:     $_[0]->{TT}{po_out} = $_[1];
dongsheng@623: }
dongsheng@623: sub readpo  { 
dongsheng@623:     $_[0]->{TT}{po_in}->read($_[1]);        
dongsheng@623: }
dongsheng@623: sub writepo { 
dongsheng@623:     $_[0]->{TT}{po_out}->write( $_[1] );    
dongsheng@623: }
dongsheng@623: sub stats   { 
dongsheng@623:     return $_[0]->{TT}{po_in}->stats_get(); 
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =head2 Manipulating addenda
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item addendum($)
dongsheng@623: 
dongsheng@623: Please refer to L<po4a(7)|po4a.7> for more information on what addenda are,
dongsheng@623: and how translators should write them. To apply an addendum to the translated
dongsheng@623: document, simply pass its filename to this function and you are done ;)
dongsheng@623: 
dongsheng@623: This function returns a non-null integer on error.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: # Internal function to read the header.
dongsheng@623: sub addendum_parse {
dongsheng@623:     my ($filename,$header)=shift;
dongsheng@623: 
dongsheng@623:     my ($errcode,$mode,$position,$boundary,$bmode,$content)=
dongsheng@623: 	(1,"","","","","");
dongsheng@623: 
dongsheng@623:     unless (open (INS, "<$filename")) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a", "Can't read from %s: %s"), $filename, $!);
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     } 
dongsheng@623: 
dongsheng@623:     unless (defined ($header=<INS>) && $header)  {
dongsheng@623: 	warn wrap_msg(dgettext("po4a", "Can't read Po4a header from %s."), $filename);
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     unless ($header =~ s/PO4A-HEADER://i) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a", "First line of %s does not look like a Po4a header."), $filename);
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     }
dongsheng@623:     foreach my $part (split(/;/,$header)) {
dongsheng@623: 	unless ($part =~ m/^\s*([^=]*)=(.*)$/) {
dongsheng@623: 	    warn wrap_msg(dgettext("po4a", "Syntax error in Po4a header of %s, near \"%s\""), $filename, $part);
dongsheng@623: 	    goto END_PARSE_ADDFILE;
dongsheng@623: 	}
dongsheng@623: 	my ($key,$value)=($1,$2);
dongsheng@623: 	$key=lc($key);
dongsheng@623:   	     if ($key eq 'mode')     {  $mode=lc($value);
dongsheng@623: 	} elsif ($key eq 'position') {  $position=$value;
dongsheng@623: 	} elsif ($key eq 'endboundary') {  
dongsheng@623: 	    $boundary=$value;
dongsheng@623: 	    $bmode='after';
dongsheng@623: 	} elsif ($key eq 'beginboundary') {  
dongsheng@623: 	    $boundary=$value;
dongsheng@623: 	    $bmode='before';
dongsheng@623: 	} else { 
dongsheng@623: 	    warn wrap_msg(dgettext("po4a", "Invalid argument in the Po4a header of %s: %s"), $filename, $key);
dongsheng@623: 	    goto END_PARSE_ADDFILE;
dongsheng@623: 	}
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     unless (length($mode)) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a", "The Po4a header of %s does not define the mode."), $filename);
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     }
dongsheng@623:     unless ($mode eq "before" || $mode eq "after") {
dongsheng@623: 	warn wrap_msg(dgettext("po4a", "Mode invalid in the Po4a header of %s: should be 'before' or 'after' not %s."), $filename, $mode);
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     unless (length($position)) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a", "The Po4a header of %s does not define the position."), $filename);
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     }
dongsheng@623:     unless ($mode eq "before" || length($boundary)) {
dongsheng@623:     	warn wrap_msg(dgettext("po4a", "No ending boundary given in the Po4a header, but mode=after."));
dongsheng@623: 	goto END_PARSE_ADDFILE;
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     while (defined(my $line = <INS>)) {
dongsheng@623: 	$content .= $line;
dongsheng@623:     }
dongsheng@623:     close INS;
dongsheng@623: 
dongsheng@623:     $errcode=0;
dongsheng@623:   END_PARSE_ADDFILE: 
dongsheng@623:       return ($errcode,$mode,$position,$boundary,$bmode,$content);
dongsheng@623: }
dongsheng@623: 
dongsheng@623: sub mychomp {
dongsheng@623:     my ($str) = shift;
dongsheng@623:     chomp($str);
dongsheng@623:     return $str;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: sub addendum {
dongsheng@623:     my ($self,$filename) = @_;
dongsheng@623: 
dongsheng@623:     print STDERR "Apply addendum $filename..." if $self->debug();
dongsheng@623:     unless ($filename) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a",
dongsheng@623: 	    "Can't apply addendum when not given the filename"));
dongsheng@623: 	return 0;
dongsheng@623:     }
dongsheng@623:     die wrap_msg(dgettext("po4a", "Addendum %s does not exist."), $filename)
dongsheng@623:       unless -e $filename;
dongsheng@623:   
dongsheng@623:     my ($errcode,$mode,$position,$boundary,$bmode,$content)=
dongsheng@623: 	addendum_parse($filename);
dongsheng@623:     return 0 if ($errcode);
dongsheng@623: 
dongsheng@623:     print STDERR "mode=$mode;pos=$position;bound=$boundary;bmode=$bmode;ctn=$content\n"
dongsheng@623:       if $self->debug();
dongsheng@623:     
dongsheng@623:     # We only recode the addendum if an origin charset is specified, else we
dongsheng@623:     # suppose it's already in the output document's charset
dongsheng@623:     if (defined($self->{TT}{'addendum_charset'}) &&
dongsheng@623:         length($self->{TT}{'addendum_charset'})) {
dongsheng@623: 	Encode::from_to($content,$self->{TT}{'addendum_charset'},
dongsheng@623: 	    $self->get_out_charset);
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     my $found = scalar grep { /$position/ } @{$self->{TT}{doc_out}};
dongsheng@623:     if ($found == 0) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a",
dongsheng@623: 	    "No candidate position for the addendum %s."), $filename);
dongsheng@623: 	return 0;
dongsheng@623:     }
dongsheng@623:     if ($found > 1) {
dongsheng@623: 	warn wrap_msg(dgettext("po4a",
dongsheng@623: 	    "More than one candidate position found for the addendum %s."), $filename);
dongsheng@623: 	return 0;
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     if ($mode eq "before") {
dongsheng@623: 	if ($self->verbose() > 1 || $self->debug() ) {
dongsheng@623: 	    map { print STDERR wrap_msg(dgettext("po4a", "Addendum '%s' applied before this line: %s"), $filename, $_) if (/$position/);
dongsheng@623:  	        } @{$self->{TT}{doc_out}};
dongsheng@623: 	}
dongsheng@623: 	@{$self->{TT}{doc_out}} = map { /$position/ ? ($content,$_) : $_ 
dongsheng@623:                                         }  @{$self->{TT}{doc_out}};
dongsheng@623:     } else {
dongsheng@623: 	my @newres=();
dongsheng@623: 
dongsheng@623: 	do {
dongsheng@623: 	    # make sure it doesnt whine on empty document
dongsheng@623: 	    my $line = scalar @{$self->{TT}{doc_out}} ? shift @{$self->{TT}{doc_out}} : "";
dongsheng@623: 	    push @newres,$line;
dongsheng@623: 	    my $outline=mychomp($line);
dongsheng@623: 	    $outline =~ s/^[ \t]*//;
dongsheng@623: 	      
dongsheng@623: 	    if ($line =~ m/$position/) {
dongsheng@623: 		while ($line=shift @{$self->{TT}{doc_out}}) {
dongsheng@623: 		    last if ($line=~/$boundary/);
dongsheng@623: 		    push @newres,$line;
dongsheng@623: 		}
dongsheng@623: 		if (defined $line) {
dongsheng@623: 		    if ($bmode eq 'before') {
dongsheng@623: 			print wrap_msg(dgettext("po4a",
dongsheng@623: 			    "Addendum '%s' applied before this line: %s"),
dongsheng@623: 			    $filename, $outline)
dongsheng@623: 			  if ($self->verbose() > 1 || $self->debug());
dongsheng@623: 			push @newres,$content;
dongsheng@623: 			push @newres,$line;
dongsheng@623: 		    } else {
dongsheng@623: 			print wrap_msg(dgettext("po4a",
dongsheng@623: 			    "Addendum '%s' applied after the line: %s."),
dongsheng@623: 			    $filename, $outline)
dongsheng@623: 			  if ($self->verbose() > 1 || $self->debug());
dongsheng@623: 			push @newres,$line;
dongsheng@623: 			push @newres,$content;
dongsheng@623: 		    }
dongsheng@623: 		} else {
dongsheng@623: 		    print wrap_msg(dgettext("po4a", "Addendum '%s' applied at the end of the file."), $filename)
dongsheng@623: 		      if ($self->verbose() > 1 || $self->debug());
dongsheng@623: 		    push @newres,$content;
dongsheng@623: 		}
dongsheng@623: 	    }
dongsheng@623: 	} while (scalar @{$self->{TT}{doc_out}});
dongsheng@623: 	@{$self->{TT}{doc_out}} = @newres;
dongsheng@623:     }
dongsheng@623:     print STDERR "done.\n" if $self->debug();
dongsheng@623:     return 1;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =head1 INTERNAL FUNCTIONS used to write derivated parsers
dongsheng@623: 
dongsheng@623: =head2 Getting input, providing output
dongsheng@623: 
dongsheng@623: Four functions are provided to get input and return output. They are very
dongsheng@623: similar to shift/unshift and push/pop. The first pair is about input, while
dongsheng@623: the second is about output. Mnemonic: in input, you are interested in the
dongsheng@623: first line, what shift gives, and in output you want to add your result at
dongsheng@623: the end, like push does.
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item shiftline()
dongsheng@623: 
dongsheng@623: This function returns the next line of the doc_in to be parsed and its
dongsheng@623: reference (packed as an array).
dongsheng@623: 
dongsheng@623: =item unshiftline($$)
dongsheng@623: 
dongsheng@623: Unshifts a line of the input document and its reference. 
dongsheng@623: 
dongsheng@623: =item pushline($)
dongsheng@623: 
dongsheng@623: Push a new line to the doc_out.
dongsheng@623: 
dongsheng@623: =item popline()
dongsheng@623: 
dongsheng@623: Pop the last pushed line from the doc_out.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub shiftline   {  
dongsheng@623:     my ($line,$ref)=(shift @{$_[0]->{TT}{doc_in}},
dongsheng@623: 		     shift @{$_[0]->{TT}{doc_in}}); 
dongsheng@623:     return ($line,$ref);
dongsheng@623: }
dongsheng@623: sub unshiftline {
dongsheng@623: 	my $self = shift;
dongsheng@623: 	unshift @{$self->{TT}{doc_in}},@_;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: sub pushline    {  push @{$_[0]->{TT}{doc_out}}, $_[1] if defined $_[1]; }
dongsheng@623: sub popline     {  return pop @{$_[0]->{TT}{doc_out}};            }
dongsheng@623: 
dongsheng@623: =head2 Marking strings as translatable
dongsheng@623: 
dongsheng@623: One function is provided to handle the text which should be translated. 
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item translate($$$)
dongsheng@623: 
dongsheng@623: Mandatory arguments:
dongsheng@623: 
dongsheng@623: =over 2
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: A string to translate
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: The reference of this string (ie, position in inputfile)
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: The type of this string (ie, the textual description of its structural role
dongsheng@623: ; used in Locale::Po4a::Po::gettextization() ; see also L<po4a(7)|po4a.7>,
dongsheng@623: section I<Gettextization: how does it work?>)
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: This function can also take some extra arguments. They must be organized as
dongsheng@623: a hash. For example:
dongsheng@623: 
dongsheng@623:   $self->translate("string","ref","type",
dongsheng@623: 		   'wrap' => 1);
dongsheng@623: 
dongsheng@623: =over
dongsheng@623: 
dongsheng@623: =item wrap
dongsheng@623: 
dongsheng@623: boolean indicating whether we can consider that whitespaces in string are
dongsheng@623: not important. If yes, the function canonizes the string before looking for
dongsheng@623: a translation or extracting it, and wraps the translation.
dongsheng@623: 
dongsheng@623: =item wrapcol
dongsheng@623: 
dongsheng@623: The column at which we should wrap (default: 76).
dongsheng@623: 
dongsheng@623: =item comment
dongsheng@623: 
dongsheng@623: An extra comment to add to the entry.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: Actions:
dongsheng@623: 
dongsheng@623: =over 2
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: Pushes the string, reference and type to po_out.
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: Returns the translation of the string (as found in po_in) so that the
dongsheng@623: parser can build the doc_out.
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: Handles the charsets to recode the strings before sending them to
dongsheng@623: po_out and before returning the translations.
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub translate {
dongsheng@623:     my $self=shift;
dongsheng@623:     my ($string,$ref,$type)=(shift,shift,shift);
dongsheng@623:     my (%options)=@_;
dongsheng@623: 
dongsheng@623:     # my $validoption="wrap wrapcol";
dongsheng@623:     # my %validoption;
dongsheng@623: 
dongsheng@623:     return "" unless defined($string) && length($string);
dongsheng@623: 
dongsheng@623:     # map { $validoption{$_}=1 } (split(/ /,$validoption));
dongsheng@623:     # foreach (keys %options) {
dongsheng@623:     #	Carp::confess "internal error: translate() called with unknown arg $_. Valid options: $validoption"
dongsheng@623:     #	    unless $validoption{$_};
dongsheng@623:     # }
dongsheng@623: 
dongsheng@623:     my $in_charset;
dongsheng@623:     if ($self->{TT}{ascii_input}) {
dongsheng@623: 	$in_charset = "ascii";
dongsheng@623:     } else {
dongsheng@623: 	if (defined($self->{TT}{'file_in_charset'}) and
dongsheng@623: 	    length($self->{TT}{'file_in_charset'}) and
dongsheng@623: 	    $self->{TT}{'file_in_charset'} !~ m/ascii/i) {
dongsheng@623: 	    $in_charset=$self->{TT}{'file_in_charset'};
dongsheng@623: 	} else {
dongsheng@623: 	    # FYI, the document charset have to be determined *before* we see the first
dongsheng@623: 	    # string to recode.
dongsheng@623: 	    die wrap_mod("po4a", dgettext("po4a", "Couldn't determine the input document's charset. Please specify it on the command line. (non-ascii char at %s)"), $self->{TT}{non_ascii_ref})
dongsheng@623: 	}
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     if ($self->{TT}{po_in}->get_charset ne "CHARSET") {
dongsheng@623: 	$string = encode_from_to($string,
dongsheng@623: 	                         $self->{TT}{'file_in_encoder'},
dongsheng@623: 	                         $self->{TT}{po_in}{encoder});
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     if (defined $options{'wrapcol'} && $options{'wrapcol'} < 0) {
dongsheng@623: # FIXME: should be the parameter given with --width
dongsheng@623:         $options{'wrapcol'} = 76 + $options{'wrapcol'};
dongsheng@623:     }
dongsheng@623:     my $transstring = $self->{TT}{po_in}->gettext($string,
dongsheng@623: 					'wrap'      => $options{'wrap'}||0,
dongsheng@623: 					'wrapcol'   => $options{'wrapcol'});
dongsheng@623: 
dongsheng@623:     if ($self->{TT}{po_in}->get_charset ne "CHARSET") {
dongsheng@623: 	my $out_encoder = $self->{TT}{'file_out_encoder'};
dongsheng@623: 	unless (defined $out_encoder) {
dongsheng@623: 	    $out_encoder = find_encoding($self->get_out_charset)
dongsheng@623: 	}
dongsheng@623: 	$transstring = encode_from_to($transstring,
dongsheng@623: 	                              $self->{TT}{po_in}{encoder},
dongsheng@623: 	                              $out_encoder);
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     # If the input document isn't completely in ascii, we should see what to
dongsheng@623:     # do with the current string
dongsheng@623:     unless ($self->{TT}{ascii_input}) {
dongsheng@623:         my $out_charset = $self->{TT}{po_out}->get_charset;
dongsheng@623: 	# We set the output po charset 
dongsheng@623:         if ($out_charset eq "CHARSET") {
dongsheng@623: 	    if ($self->{TT}{utf_mode}) {
dongsheng@623: 		$out_charset="utf-8";
dongsheng@623: 	    } else {
dongsheng@623: 		$out_charset=$in_charset;
dongsheng@623: 	    }
dongsheng@623: 	    $self->{TT}{po_out}->set_charset($out_charset);
dongsheng@623: 	}
dongsheng@623: 	if ( $in_charset !~ /^$out_charset$/i ) {
dongsheng@623: 	    Encode::from_to($string,$in_charset,$out_charset);
dongsheng@623: 	    if (defined($options{'comment'}) and length($options{'comment'})) {
dongsheng@623: 		Encode::from_to($options{'comment'},$in_charset,$out_charset);
dongsheng@623: 	    }
dongsheng@623: 	}
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     # the comments provided by the modules are automatic comments from the PO point of view
dongsheng@623:     $self->{TT}{po_out}->push('msgid'     => $string,
dongsheng@623: 			      'reference' => $ref,
dongsheng@623: 			      'type'      => $type,
dongsheng@623: 	                      'automatic' => $options{'comment'},
dongsheng@623: 			      'wrap'      => $options{'wrap'}||0,
dongsheng@623: 			      'wrapcol'   => $options{'wrapcol'});
dongsheng@623: 
dongsheng@623: #    if ($self->{TT}{po_in}->get_charset ne "CHARSET") {
dongsheng@623: #	Encode::from_to($transstring,$self->{TT}{po_in}->get_charset,
dongsheng@623: #	    $self->get_out_charset);
dongsheng@623: #    }
dongsheng@623: 
dongsheng@623:     if ($options{'wrap'}||0) {
dongsheng@623:         $transstring =~ s/( *)$//s;
dongsheng@623:         my $trailing_spaces = $1||"";
dongsheng@623:         $transstring =~ s/ *$//gm;
dongsheng@623:         $transstring .= $trailing_spaces;
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     return $transstring;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =head2 Misc functions
dongsheng@623: 
dongsheng@623: =over 4
dongsheng@623: 
dongsheng@623: =item verbose()
dongsheng@623: 
dongsheng@623: Returns if the verbose option was passed during the creation of the
dongsheng@623: TransTractor.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub verbose {
dongsheng@623:     if (defined $_[1]) {
dongsheng@623: 	$_[0]->{TT}{verbose} = $_[1];
dongsheng@623:     } else {
dongsheng@623: 	return $_[0]->{TT}{verbose} || 0; # undef and 0 have the same meaning, but one generates warnings
dongsheng@623:     }
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =item debug()
dongsheng@623: 
dongsheng@623: Returns if the debug option was passed during the creation of the
dongsheng@623: TransTractor.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub debug {
dongsheng@623:     return $_[0]->{TT}{debug};
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =item detected_charset($)
dongsheng@623: 
dongsheng@623: This tells TransTractor that a new charset (the first argument) has been
dongsheng@623: detected from the input document. It can usually be read from the document
dongsheng@623: header. Only the first charset will remain, coming either from the
dongsheng@623: process() arguments or detected from the document.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub detected_charset {
dongsheng@623:     my ($self,$charset)=(shift,shift);
dongsheng@623:     unless (defined($self->{TT}{'file_in_charset'}) and
dongsheng@623:             length($self->{TT}{'file_in_charset'}) ) {
dongsheng@623:         $self->{TT}{'file_in_charset'}=$charset;
dongsheng@623:         if (defined $charset) {
dongsheng@623:             $self->{TT}{'file_in_encoder'}=find_encoding($charset);
dongsheng@623:         }
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     if (defined $self->{TT}{'file_in_charset'} and
dongsheng@623:         length $self->{TT}{'file_in_charset'} and
dongsheng@623:         $self->{TT}{'file_in_charset'} !~ m/ascii/i) {
dongsheng@623: 	$self->{TT}{ascii_input}=0;
dongsheng@623:     }
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =item get_out_charset()
dongsheng@623: 
dongsheng@623: This function will return the charset that should be used in the output
dongsheng@623: document (usually useful to substitute the input document's detected charset
dongsheng@623: where it has been found).
dongsheng@623: 
dongsheng@623: It will use the output charset specified in the command line. If it wasn't
dongsheng@623: specified, it will use the input po's charset, and if the input po has the
dongsheng@623: default "CHARSET", it will return the input document's charset, so that no
dongsheng@623: encoding is performed.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub get_out_charset {
dongsheng@623:     my $self=shift;
dongsheng@623:     my $charset;
dongsheng@623: 
dongsheng@623:     # Use the value specified at the command line
dongsheng@623:     if (defined($self->{TT}{'file_out_charset'}) and
dongsheng@623: 	length($self->{TT}{'file_out_charset'})) {
dongsheng@623: 	$charset=$self->{TT}{'file_out_charset'};
dongsheng@623:     } else {
dongsheng@623: 	if ($self->{TT}{utf_mode} && $self->{TT}{ascii_input}) {
dongsheng@623: 	    $charset="utf-8";
dongsheng@623: 	} else {
dongsheng@623: 	    $charset=$self->{TT}{po_in}->get_charset;
dongsheng@623: 	    $charset=$self->{TT}{'file_in_charset'}
dongsheng@623: 		if $charset eq "CHARSET" and
dongsheng@623: 		    defined($self->{TT}{'file_in_charset'}) and
dongsheng@623: 		    length($self->{TT}{'file_in_charset'});
dongsheng@623: 	    $charset="ascii"
dongsheng@623: 		if $charset eq "CHARSET";
dongsheng@623: 	}
dongsheng@623:     }
dongsheng@623:     return $charset;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =item recode_skipped_text($)
dongsheng@623: 
dongsheng@623: This function returns the recoded text passed as argument, from the input
dongsheng@623: document's charset to the output document's one. This isn't needed when
dongsheng@623: translating a string (translate() recodes everything itself), but it is when
dongsheng@623: you skip a string from the input document and you want the output document to
dongsheng@623: be consistent with the global encoding.
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: sub recode_skipped_text {
dongsheng@623:     my ($self,$text)=(shift,shift);
dongsheng@623:     unless ($self->{TT}{'ascii_input'}) {
dongsheng@623: 	if(defined($self->{TT}{'file_in_charset'}) and
dongsheng@623: 	    length($self->{TT}{'file_in_charset'}) ) {
dongsheng@623: 	    $text = encode_from_to($text,
dongsheng@623: 	                           $self->{TT}{'file_in_encoder'},
dongsheng@623: 	                           find_encoding($self->get_out_charset));
dongsheng@623: 	} else {
dongsheng@623: 	    die wrap_mod("po4a", dgettext("po4a", "Couldn't determine the input document's charset. Please specify it on the command line. (non-ascii char at %s)"), $self->{TT}{non_ascii_ref})
dongsheng@623: 	}
dongsheng@623:     }
dongsheng@623:     return $text;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: 
dongsheng@623: # encode_from_to($,$,$)
dongsheng@623: #
dongsheng@623: # Encode the given text from one encoding to another one.
dongsheng@623: # It differs from Encode::from_to because it does not take the name of the
dongsheng@623: # encoding in argument, but the encoders (as returned by the
dongsheng@623: # Encode::find_encoding(<name>) method). Thus it permits to save a bunch
dongsheng@623: # of call to find_encoding.
dongsheng@623: #
dongsheng@623: # If the "from" encoding is undefined, it is considered as UTF-8 (or
dongsheng@623: # ascii).
dongsheng@623: # If the "to" encoding is undefined, it is considered as UTF-8.
dongsheng@623: #
dongsheng@623: sub encode_from_to {
dongsheng@623:     my ($text,$from,$to) = (shift,shift,shift);
dongsheng@623: 
dongsheng@623:     if (not defined $from) {
dongsheng@623:         # for ascii and UTF-8, no conversion needed to get an utf-8
dongsheng@623:         # string.
dongsheng@623:     } else {
dongsheng@623:         $text = $from->decode($text, 0);
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     if (not defined $to) {
dongsheng@623:         # Already in UTF-8, no conversion needed
dongsheng@623:     } else {
dongsheng@623:         $text = $to->encode($text, 0);
dongsheng@623:     }
dongsheng@623: 
dongsheng@623:     return $text;
dongsheng@623: }
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: =head1 FUTURE DIRECTIONS
dongsheng@623: 
dongsheng@623: One shortcoming of the current TransTractor is that it can't handle
dongsheng@623: translated document containing all languages, like debconf templates, or
dongsheng@623: .desktop files.
dongsheng@623: 
dongsheng@623: To address this problem, the only interface changes needed are:
dongsheng@623: 
dongsheng@623: =over 2
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: take a hash as po_in_name (a list per language)
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: add an argument to translate to indicate the target language
dongsheng@623: 
dongsheng@623: =item -
dongsheng@623: 
dongsheng@623: make a pushline_all function, which would make pushline of its content for
dongsheng@623: all language, using a map-like syntax:
dongsheng@623: 
dongsheng@623:     $self->pushline_all({ "Description[".$langcode."]=".
dongsheng@623: 			  $self->translate($line,$ref,$langcode) 
dongsheng@623: 		        });
dongsheng@623: 
dongsheng@623: =back
dongsheng@623: 
dongsheng@623: Will see if it's enough ;)
dongsheng@623: 
dongsheng@623: =head1 AUTHORS
dongsheng@623: 
dongsheng@623:  Denis Barbier <barbier@linuxfr.org>
dongsheng@623:  Martin Quinson (mquinson#debian.org)
dongsheng@623:  Jordi Vilalta <jvprat@gmail.com>
dongsheng@623: 
dongsheng@623: =cut
dongsheng@623: 
dongsheng@623: 1;