PGN Processing Specifications
A PGN processing program has three parts: the parsing of a PGN input
file, the resolution of moves, and the maintenance of the game state.
Given suitable interfaces, each of these modules can be built and
tested in isolation.
First, some preliminaries. In order to resolve moves, the game
state must be kept. This is a dictionary of locations and pieces, plus
the five other items of information that characterize the game state:
active color (w or b), castling availability, en passant target,
half-move draw count and turn number. The board has an interface that
accepts a move and executes that move, updating the various elements of
board state.
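The game state described above can be sketched as a small Python class.
This is only an illustration under stated assumptions: the names
GameState and apply, the FEN-style defaults, and the use of upper-case
letters for white pieces are hypothetical choices, not part of the
specification. Castling-availability updates are left out to keep the
sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GameState:
    """Hypothetical container: piece dictionary plus five other state items."""

    # Maps a square such as "e2" to a piece letter (assumption: upper case = white).
    pieces: Dict[str, str] = field(default_factory=dict)
    active_color: str = "w"           # "w" or "b"
    castling: str = "KQkq"            # castling availability (not updated here)
    en_passant: Optional[str] = None  # en passant target square, if any
    halfmove_clock: int = 0           # half-move draw count
    turn_number: int = 1

    def apply(self, source: str, dest: str) -> None:
        """Execute a fully-specified simple move or capture."""
        piece = self.pieces.pop(source)
        is_capture = dest in self.pieces
        self.pieces[dest] = piece

        # The draw count resets on any pawn move or capture.
        if piece in "Pp" or is_capture:
            self.halfmove_clock = 0
        else:
            self.halfmove_clock += 1

        # A two-square pawn advance creates an en passant target
        # on the square the pawn passed over.
        start_rank, end_rank = int(source[1]), int(dest[1])
        if piece in "Pp" and abs(end_rank - start_rank) == 2:
            self.en_passant = source[0] + str((start_rank + end_rank) // 2)
        else:
            self.en_passant = None

        # The turn number advances after black has moved.
        if self.active_color == "b":
            self.turn_number += 1
        self.active_color = "b" if self.active_color == "w" else "w"
```

A caller would build the state once per game and feed it each resolved
move, for example state.apply("e2", "e4").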
Moves can use the Command design pattern to separate king-side
castle, queen-side castle, moves, captures and promotions. The Board
object will require a fully-specified move with source location and
destination location. The source location is produced by the source
resolution algorithm.
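One possible shape for these commands is sketched below. The class
names (MoveCommand, SimpleMove, Promotion, KingSideCastle) and the plain
dictionary used for the board are assumptions made for illustration.
Each command receives fully-specified squares, as the Board requires. A
capture is treated here as a simple move onto an occupied square, and
queen-side castling would follow the same pattern as the king-side case.

```python
from dataclasses import dataclass
from typing import Dict


class MoveCommand:
    """Hypothetical common interface: every move knows how to execute itself."""

    def execute(self, pieces: Dict[str, str]) -> None:
        raise NotImplementedError


@dataclass
class SimpleMove(MoveCommand):
    source: str
    dest: str

    def execute(self, pieces: Dict[str, str]) -> None:
        # Overwriting the destination also covers an ordinary capture.
        pieces[self.dest] = pieces.pop(self.source)


@dataclass
class Promotion(MoveCommand):
    source: str
    dest: str
    new_piece: str

    def execute(self, pieces: Dict[str, str]) -> None:
        # The pawn leaves the board and the new piece appears on the destination.
        del pieces[self.source]
        pieces[self.dest] = self.new_piece


@dataclass
class KingSideCastle(MoveCommand):
    color: str  # "w" or "b"

    def execute(self, pieces: Dict[str, str]) -> None:
        rank = "1" if self.color == "w" else "8"
        pieces["g" + rank] = pieces.pop("e" + rank)  # king moves e to g
        pieces["f" + rank] = pieces.pop("h" + rank)  # rook moves h to f
```

The benefit of this arrangement is that the Board only calls execute and
never needs to test which kind of move it was given.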
A well-defined Board object could be used either for a
single-player game (against the computer) or as part of a chess game
server for two-player games.
Second, the hard part: resolution of short notation moves. Based
on input in algebraic notation, a move can be transformed from a string
into a 7-tuple of color, piece, fromHint, moveType, toPosition,
checkIndicator and promotionIndicator.
- The color is either w or b.
- The piece is omitted for pawns, or one of RNBQK for the other pieces.
- The fromHint is the from position, either a file and rank or a file
  alone or a rank alone. The various search algorithms are required to
  resolve the starting piece and location from an incomplete hint.
- The moveType is either omitted for a simple move or x for a capturing
  move.
- The toPosition is the file and rank of the square at which the piece
  arrives.
- The checkIndicator is either nothing, + or #.
- The promotionIndicator is either nothing or a new piece name from
  QBRN (a pawn cannot promote to a king).
This information is used by Algorithm G to resolve the full
starting position information for the move, and then execute the move,
updating the board position.
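The decomposition of a short-notation move into the seven fields listed
above can be sketched with a regular expression. The names ParsedMove,
SAN_PATTERN and parse_san are hypothetical, and the pattern is a
simplified assumption: it accepts an optional = before the promotion
piece, and it does not recognize castling (O-O, O-O-O), which would need
separate handling. The color is not part of the move text, so the caller
supplies it.

```python
import re
from typing import NamedTuple, Optional


class ParsedMove(NamedTuple):
    """The 7-tuple; empty strings stand for omitted fields."""
    color: str
    piece: str
    from_hint: str
    move_type: str
    to_position: str
    check_indicator: str
    promotion_indicator: str


# Simplified pattern (an assumption, not a complete SAN grammar).
SAN_PATTERN = re.compile(
    r"^(?P<piece>[RNBQK]?)"          # omitted for pawns
    r"(?P<from_hint>[a-h]?[1-8]?)"   # file, rank, both, or nothing
    r"(?P<move_type>x?)"             # x marks a capture
    r"(?P<to_position>[a-h][1-8])"   # destination square
    r"(?:=?(?P<promotion>[QBRN]))?"  # optional promotion piece
    r"(?P<check>[+#]?)$"             # optional check or mate mark
)


def parse_san(text: str, color: str) -> Optional[ParsedMove]:
    """Return the 7-tuple for one move, or None if the text does not match."""
    match = SAN_PATTERN.match(text)
    if match is None:
        return None
    return ParsedMove(
        color,
        match["piece"],
        match["from_hint"],
        match["move_type"],
        match["to_position"],
        match["check"],
        match["promotion"] or "",
    )
```

The regular expression relies on backtracking: for a move such as e4 the
from-hint group first tries to consume e4, fails to find a destination,
and then gives the characters back so that e4 becomes the toPosition.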
Finally, input parsing and reporting. A PGN file contains a series
of games. Each game begins with identification tags of the form
[Label "value"]. The labels include names like Event, Site, Date,
Round, White, Black and Result. Other labels may be present. After the
identification tags is a blank line followed by the text of the moves,
called the “movetext”. The movetext is supposed to be SAN (Standard
Algebraic Notation, the short notation), but some files are LAN (long
notation). The moves should end with the result (1-0, 0-1, * or
1/2-1/2), followed by one or more blank lines.
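The layout of a single game can be illustrated with a small reader. This
is a sketch under assumptions: the names TAG_PATTERN, RESULTS and
parse_game are hypothetical, the input holds exactly one game, and
comments or variations inside the movetext are not handled.

```python
import re
from typing import Dict, List, Optional, Tuple

# One identification tag per line, in the form [Label "value"].
TAG_PATTERN = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def parse_game(text: str) -> Tuple[Dict[str, str], List[str], Optional[str]]:
    """Split one game into its tags, its move tokens and its result."""
    tags: Dict[str, str] = {}
    tokens: List[str] = []
    for line in text.splitlines():
        match = TAG_PATTERN.match(line)
        if match:
            tags[match.group(1)] = match.group(2)
        else:
            # Anything that is not a tag is treated as movetext.
            tokens.extend(line.split())

    # The movetext should end with the result token.
    result = tokens.pop() if tokens and tokens[-1] in RESULTS else None

    # Strip move numbers such as "1." or "12...", whether they stand
    # alone or are attached to a move as in "1.e4".
    moves = [re.sub(r"^\d+\.+", "", token) for token in tokens]
    moves = [move for move in moves if move]
    return tags, moves, result
```

The move tokens returned here are still plain strings; they would be
handed to a move parser for resolution.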
In order to handle the various forms of movetext, there have to be
two move parsing classes with identical interfaces. These polymorphic
classes implement long-notation and short-notation parsing. If the
short-notation parser object fails, the long-notation parser object can
be used instead. If both fail, the file is invalid.
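A minimal sketch of this fallback is shown below. The class names, the
simplified patterns (check and promotion marks are omitted for brevity)
and the choice to retry the whole movetext rather than a single move are
all assumptions made for illustration.

```python
import re
from typing import List, Optional, Sequence, Tuple


class MoveParser:
    """Hypothetical shared interface: parse() returns a tuple or None."""

    # Subclasses replace this with their own compiled pattern.
    pattern = re.compile(r"$^")

    def parse(self, text: str) -> Optional[Tuple[str, ...]]:
        match = self.pattern.match(text)
        return match.groups() if match else None


class ShortNotationParser(MoveParser):
    # piece, optional from-hint, optional x, destination
    pattern = re.compile(r"^([RNBQK]?)([a-h]?[1-8]?)(x?)([a-h][1-8])$")


class LongNotationParser(MoveParser):
    # piece, full source square, - or x, destination
    pattern = re.compile(r"^([RNBQK]?)([a-h][1-8])([-x])([a-h][1-8])$")


def parse_movetext(
    moves: Sequence[str], parsers: Sequence[MoveParser]
) -> List[Tuple[str, ...]]:
    """Try each parser in turn; the first one that accepts every move wins."""
    for parser in parsers:
        results = [parser.parse(move) for move in moves]
        if all(result is not None for result in results):
            return results
    raise ValueError("movetext is neither valid short nor long notation")
```

Because both classes expose the same parse method, the calling code only
needs an ordered list of parsers, for example
parse_movetext(moves, [ShortNotationParser(), LongNotationParser()]).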
A PGN processing program should be able to read in a file of
games, execute the moves, print logs in various forms (SAN, LAN and
Descriptive), and print board positions in various forms. The program
should also be able to convert files from LAN or Descriptive to SAN,
validate logs, and produce error messages when the chess notation is
invalid.
Once the basic PGN capabilities are in place, a program can be
adapted to do analysis of games. For instance, it should be able to
report only games that have specific openings, piece counts at the end,
promotions to queen, castling, checks, etc.
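This kind of reporting can be sketched as a set of predicates applied to
each game's list of SAN move strings. All of the names below are
hypothetical, and the tests work on the move text alone (for example,
=Q for a promotion to queen); filters such as piece counts at the end
would need the board state after executing the moves.

```python
from typing import Callable, Iterable, Iterator, List

Moves = List[str]                      # SAN move strings for one game
Predicate = Callable[[Moves], bool]


def has_opening(prefix: Moves) -> Predicate:
    """Build a predicate that is true when a game starts with the given moves."""
    return lambda moves: moves[: len(prefix)] == prefix


def has_castling(moves: Moves) -> bool:
    # Strip any check or mate mark before comparing.
    return any(move.rstrip("+#") in ("O-O", "O-O-O") for move in moves)


def has_queen_promotion(moves: Moves) -> bool:
    return any("=Q" in move for move in moves)


def has_check(moves: Moves) -> bool:
    return any(move.endswith(("+", "#")) for move in moves)


def select(games: Iterable[Moves], *predicates: Predicate) -> Iterator[Moves]:
    """Yield only the games that satisfy every predicate."""
    for moves in games:
        if all(predicate(moves) for predicate in predicates):
            yield moves
```

For example, list(select(games, has_opening(["e4", "e5"]), has_castling))
would keep only the games that open with e4 e5 and include castling.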