306 lines
8.5 KiB
Plaintext
306 lines
8.5 KiB
Plaintext
=head1 NAME
|
|
|
|
perlstyle - Perl style guide
|
|
|
|
=head1 DESCRIPTION
|
|
|
|
Each programmer will, of course, have his or her own preferences in
|
|
regards to formatting, but there are some general guidelines that will
|
|
make your programs easier to read, understand, and maintain.
|
|
|
|
The most important thing is to use L<strict> and L<warnings> in all your
|
|
code or know the reason why not to. You may turn them off explicitly for
|
|
particular portions of code via C<no warnings> or C<no strict>, and this
|
|
can be limited to the specific warnings or strict features you wish to
|
|
disable. The B<-w> flag and C<$^W> variable should not be used for this
|
|
purpose since they can affect code you use but did not write, such as
|
|
modules from core or CPAN.
|
|
|
|
Regarding aesthetics of code lay out, about the only thing Larry
|
|
cares strongly about is that the closing curly bracket of
|
|
a multi-line BLOCK should line up with the keyword that started the construct.
|
|
Beyond that, he has other preferences that aren't so strong:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
4-column indent.
|
|
|
|
=item *
|
|
|
|
Opening curly on same line as keyword, if possible, otherwise line up.
|
|
|
|
=item *
|
|
|
|
Space before the opening curly of a multi-line BLOCK.
|
|
|
|
=item *
|
|
|
|
One-line BLOCK may be put on one line, including curlies.
|
|
|
|
=item *
|
|
|
|
No space before the semicolon.
|
|
|
|
=item *
|
|
|
|
Semicolon omitted in "short" one-line BLOCK.
|
|
|
|
=item *
|
|
|
|
Space around most operators.
|
|
|
|
=item *
|
|
|
|
Space around a "complex" subscript (inside brackets).
|
|
|
|
=item *
|
|
|
|
Blank lines between chunks that do different things.
|
|
|
|
=item *
|
|
|
|
Uncuddled elses.
|
|
|
|
=item *
|
|
|
|
No space between function name and its opening parenthesis.
|
|
|
|
=item *
|
|
|
|
Space after each comma.
|
|
|
|
=item *
|
|
|
|
Long lines broken after an operator (except C<and> and C<or>).
|
|
|
|
=item *
|
|
|
|
Space after last parenthesis matching on current line.
|
|
|
|
=item *
|
|
|
|
Line up corresponding items vertically.
|
|
|
|
=item *
|
|
|
|
Omit redundant punctuation as long as clarity doesn't suffer.
|
|
|
|
=back
|
|
|
|
Larry has his reasons for each of these things, but he doesn't claim that
|
|
everyone else's mind works the same as his does.
|
|
|
|
Here are some other more substantive style issues to think about:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
Just because you I<CAN> do something a particular way doesn't mean that
|
|
you I<SHOULD> do it that way. Perl is designed to give you several
|
|
ways to do anything, so consider picking the most readable one. For
|
|
instance
|
|
|
|
open(my $fh, '<', $foo) || die "Can't open $foo: $!";
|
|
|
|
is better than
|
|
|
|
die "Can't open $foo: $!" unless open(my $fh, '<', $foo);
|
|
|
|
because the second way hides the main point of the statement in a
|
|
modifier. On the other hand
|
|
|
|
print "Starting analysis\n" if $verbose;
|
|
|
|
is better than
|
|
|
|
$verbose && print "Starting analysis\n";
|
|
|
|
because the main point isn't whether the user typed B<-v> or not.
|
|
|
|
Similarly, just because an operator lets you assume default arguments
|
|
doesn't mean that you have to make use of the defaults. The defaults
|
|
are there for lazy systems programmers writing one-shot programs. If
|
|
you want your program to be readable, consider supplying the argument.
|
|
|
|
Along the same lines, just because you I<CAN> omit parentheses in many
|
|
places doesn't mean that you ought to:
|
|
|
|
return print reverse sort num values %array;
|
|
return print(reverse(sort num (values(%array))));
|
|
|
|
When in doubt, parenthesize. At the very least it will let some poor
|
|
schmuck bounce on the % key in B<vi>.
|
|
|
|
Even if you aren't in doubt, consider the mental welfare of the person
|
|
who has to maintain the code after you, and who will probably put
|
|
parentheses in the wrong place.
|
|
|
|
=item *
|
|
|
|
Don't go through silly contortions to exit a loop at the top or the
|
|
bottom, when Perl provides the C<last> operator so you can exit in
|
|
the middle. Just "outdent" it a little to make it more visible:
|
|
|
|
LINE:
|
|
for (;;) {
|
|
statements;
|
|
last LINE if $foo;
|
|
next LINE if /^#/;
|
|
statements;
|
|
}
|
|
|
|
=item *
|
|
|
|
Don't be afraid to use loop labels--they're there to enhance
|
|
readability as well as to allow multilevel loop breaks. See the
|
|
previous example.
|
|
|
|
=item *
|
|
|
|
Avoid using C<grep()> (or C<map()>) or `backticks` in a void context, that is,
|
|
when you just throw away their return values. Those functions all
|
|
have return values, so use them. Otherwise use a C<foreach()> loop or
|
|
the C<system()> function instead.
|
|
|
|
=item *
|
|
|
|
For portability, when using features that may not be implemented on
|
|
every machine, test the construct in an eval to see if it fails. If
|
|
you know what version or patchlevel a particular feature was
|
|
implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it
|
|
will be there. The C<Config> module will also let you interrogate values
|
|
determined by the B<Configure> program when Perl was installed.
|
|
|
|
=item *
|
|
|
|
Choose mnemonic identifiers. If you can't remember what mnemonic means,
|
|
you've got a problem.
|
|
|
|
=item *
|
|
|
|
While short identifiers like C<$gotit> are probably ok, use underscores to
|
|
separate words in longer identifiers. It is generally easier to read
|
|
C<$var_names_like_this> than C<$VarNamesLikeThis>, especially for
|
|
non-native speakers of English. It's also a simple rule that works
|
|
consistently with C<VAR_NAMES_LIKE_THIS>.
|
|
|
|
Package names are sometimes an exception to this rule. Perl informally
|
|
reserves lowercase module names for "pragma" modules like C<integer> and
|
|
C<strict>. Other modules should begin with a capital letter and use mixed
|
|
case, but probably without underscores due to limitations in primitive
|
|
file systems' representations of module names as files that must fit into a
|
|
few sparse bytes.
|
|
|
|
=item *
|
|
|
|
You may find it helpful to use letter case to indicate the scope
|
|
or nature of a variable. For example:
|
|
|
|
$ALL_CAPS_HERE constants only (beware clashes with perl vars!)
|
|
$Some_Caps_Here package-wide global/static
|
|
$no_caps_here function scope my() or local() variables
|
|
|
|
Function and method names seem to work best as all lowercase.
|
|
E.g., C<$obj-E<gt>as_string()>.
|
|
|
|
You can use a leading underscore to indicate that a variable or
|
|
function should not be used outside the package that defined it.
|
|
|
|
=item *
|
|
|
|
If you have a really hairy regular expression, use the C</x> or C</xx>
|
|
modifiers and put in some whitespace to make it look a little less like
|
|
line noise.
|
|
Don't use slash as a delimiter when your regexp has slashes or backslashes.
|
|
|
|
=item *
|
|
|
|
Use the new C<and> and C<or> operators to avoid having to parenthesize
|
|
list operators so much, and to reduce the incidence of punctuation
|
|
operators like C<&&> and C<||>. Call your subroutines as if they were
|
|
functions or list operators to avoid excessive ampersands and parentheses.
|
|
|
|
=item *
|
|
|
|
Use here documents instead of repeated C<print()> statements.
|
|
|
|
=item *
|
|
|
|
Line up corresponding things vertically, especially if it'd be too long
|
|
to fit on one line anyway.
|
|
|
|
$IDX = $ST_MTIME;
|
|
$IDX = $ST_ATIME if $opt_u;
|
|
$IDX = $ST_CTIME if $opt_c;
|
|
$IDX = $ST_SIZE if $opt_s;
|
|
|
|
mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!";
|
|
chdir($tmpdir) or die "can't chdir $tmpdir: $!";
|
|
mkdir 'tmp', 0777 or die "can't mkdir $tmpdir/tmp: $!";
|
|
|
|
=item *
|
|
|
|
Always check the return codes of system calls. Good error messages should
|
|
go to C<STDERR>, include which program caused the problem, what the failed
|
|
system call and arguments were, and (VERY IMPORTANT) should contain the
|
|
standard system error message for what went wrong. Here's a simple but
|
|
sufficient example:
|
|
|
|
opendir(my $dh, $dir) or die "can't opendir $dir: $!";
|
|
|
|
=item *
|
|
|
|
Line up your transliterations when it makes sense:
|
|
|
|
tr [abc]
|
|
[xyz];
|
|
|
|
=item *
|
|
|
|
Think about reusability. Why waste brainpower on a one-shot when you
|
|
might want to do something like it again? Consider generalizing your
|
|
code. Consider writing a module or object class. Consider making your
|
|
code run cleanly with C<use strict> and C<use warnings> in
|
|
effect. Consider giving away your code. Consider changing your whole
|
|
world view. Consider... oh, never mind.
|
|
|
|
=item *
|
|
|
|
Try to document your code and use Pod formatting in a consistent way. Here
|
|
are commonly expected conventions:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
use C<CE<lt>E<gt>> for function, variable and module names (and more
|
|
generally anything that can be considered part of code, like filehandles
|
|
or specific values). Note that function names are considered more readable
|
|
with parentheses after their name, that is C<function()>.
|
|
|
|
=item *
|
|
|
|
use C<BE<lt>E<gt>> for commands names like B<cat> or B<grep>.
|
|
|
|
=item *
|
|
|
|
use C<FE<lt>E<gt>> or C<CE<lt>E<gt>> for file names. C<FE<lt>E<gt>> should
|
|
be the only Pod code for file names, but as most Pod formatters render it
|
|
as italic, Unix and Windows paths with their slashes and backslashes may
|
|
be less readable, and better rendered with C<CE<lt>E<gt>>.
|
|
|
|
=back
|
|
|
|
=item *
|
|
|
|
Be consistent.
|
|
|
|
=item *
|
|
|
|
Be nice.
|
|
|
|
=back
|