# Chess Game Notation¶

See Chessboard Locations for some additional background.

Chess is played on an 8x8 board. One player has white pieces, one has black pieces. Each player’s pieces include eight pawns, two rooks, two knights, two bishops, a king and a queen. The various pieces have different rules for movement. Players move alternately until one player’s king is in a position from which it cannot escape but must be taken, or there is a draw. There are a number of rules that lead to a draw, all beyond the scope of this problem. White moves first.

A game is recorded as a log of the numbered moves of pieces, first white then black. The Portable Game Notation (PGN) standard includes additional descriptive information about the players and venue.

There are two notations for logging a chess game. The newer, algebraic notation and the older descriptive notation. We will write a program that will process a log in either notation and play out the game, showing the chess board after each of black’s moves. It can also be extended to convert logs to completely standard PGN notation.

## Algebraic Notation¶

We’ll present the formal definition of algebraic notation including Algebraic Notation (LAN) and Short Algebraic Notation (SAN). We’ll follow this with a summary and some examples. This section will end with some Algorithm R, used to resolve which of the available pieces could perform a legal move.

Definition. Algebraic notations uses letters a-h for the files (columns across the board) from white’s left to right, and numbers for the ranks (rows of the board) from white (1) to black (8).

Piece symbols in the log are as follows:

 Piece Symbol Move Summary Pawn (omitted) 1 or 2 spaces forward Rook R anywhere in the same rank or same file Knight N 2 in one direction and 1 in the other (“L-shaped”) Bishop B diagonally, any distance Queen Q horizontal, vertical or diagonal, any distance King K 1 space in any direction

The game begins in the following starting position.

 White Black Piece a1 a8 rook b1 b8 knight c1 c8 bishop d1 d8 queen e1 e8 king f1 f8 bishop g1 g8 knight h1 h8 rook a2-h2 a7-h7 pawns

There are two forms of algebraic notation: short (or standard) algebraic notation (SAN), where only the destination is shown, and long algebraic notation (LAN) that shows the piece, the starting location and the destination.

Long Notation. The basic syntax for LAN is as follows:

```[ P ] f r m f r [ n ]
```
P: The piece name (omitted for pawns). The file (a-h) moving from and to. The rank (1-8) moving from and to. The move (- or x). any notes about the move (+, #, !, !!, ?, ??). The notes may include = and a piece letter when a pawn is promoted.

Short Notation. Short notation omits the starting file and rank unless they are essential for disambiguating a move. The basic syntax is as follows:

```[ P ] [ m ] [ d ] f r [ n ]
```
P: The piece name (omitted for pawns). The move is only specified for a capture (x). The discriminator: either a file (preferred) or a rank or both used to choose which piece moved when there are multiple possibilities. The file (a-h) moving to. The rank (1-8) moving to. any notes about the move (+, #, !, !!, ?, ??). The notes may include = and a piece letter when a pawn is promoted.

Additional Syntax. In both notations, the castle moves are written O-O or O-O-O (capital letters, not numbers). Similarly, the end of a game is often followed with a 1-0 (white wins), 0-1 (black wins), 1/2-1/2 (a draw), or * for a game that is unknown or abandoned.

Each turn is preceeded by a turn number. Typically the number and a . preceeds the white move. Sometimes (because of commentary), the number and ... preceed the black move.

The PGN standard for notes is \$ and a number, common numbers are as follows:

• \$1 good move (traditional “!” )
• \$2 poor move (traditional “?” )
• \$3 very good move (traditional “!!” )
• \$4 very poor move (traditional “??” )

Legal Moves. Each piece has a legal move. This is a critical part of processing abbreviated notation where the log gives the name of the piece and where it wound up. The legal moves to determine which of two (or eight) pieces could have made the requested move. This requires a simple search of pieces to see which could have made the move.

• Pawn. A pawn moves forward only, in its same file. For white, the rank number must increase by 1 or 2. It can increase by 2 when it is in its starting position; rank 2. For black, the rank number must decrease by 1 or 2. It can decrease by 2 when the pawn is in it’s starting position of rank 7.

A pawn captures on the diagonal: it will move into an adjacent file and forward one rank, replacing the piece that was there.

There is one origin for all but the opening pawn moves: one rank back on the file in which the pawn ended its move.

There is one origin for an opening pawn move that lands in rank 4 or 5: two ranks back on the file where the pawn ended its move.

There are two possible origins for any pawn capture (one position on a file adjacent to the one in which the pawn ended its move).

• Rook. A rook moves in ranks or files only, with no limit on distance. There are 16 possible origins for any rook move, including the entire rank or the entire file in which the rook ended its move.

• Knight. A knight makes an “L-shaped” move. It moves two spaces in one direction, turns 90-degrees and moves one more space. From g1, a knight can move to either f3 or h3. The rank changes by 2 and the file by 1; or the file changes by 2 and the rank changes by 1. There are 8 places a knight could start from relative to its final location.

• Bishop. A bishop moves diagonally. The amount of change in the rank must be the same as the change in the file. There are 16 places a bishop can start from on the two diagonals that intersect the final location.

• Queen. The queen’s move combines bishop and rook: any number of spaces diagonally, horizontally or vertically. There are 16 places on the diagonals, plus 16 more places on the horizontals and verticals where the queen could originate. Pawns that reach the opposite side of the board are often promoted to queens, meaning there can be multiple queens late in the game.

• King. The king is unique, there is only one. The king can only move one space hoizontally, vertically or diagonally.

• King and Rook. The king and a rook can also engage in a move called castling: both pieces move. When the king and the closest rook (the one in file h) castle, this is king side and annoted O-O . The king moves from file e to file g and the rook from fiel h to file f. When the king and the queen’s side rook (the one in file a) castle, this is annotated O-O-O . The king moves from file e to file c and the rook move from file a to file d.

Castling can only be accomplished if (a) neither piece has moved and (b) space between them is unoccupied by other pieces. Part a of this rule requires that the game remember when a king or rook moves, and eliminate that side from available castling moves. Moving the rook in file a eliminates queen-side castling; moving the rook in file h eliminates king-side castling. Moving the king eliminates all castling.

Summary and Examples. Here’s a suymmary of the algebraic notation symbols used for annotating chess games. This is followed by some examples.

 Symbol Meaning a-h file from white’s left to right 1-8 rank from white to black R, N, B, Q, K rook, knight, bishop, king, queen - move (non-SAN) x capture; the piece that was at this location is removed + check, a note that the king is threatened # checkmate, a note that this is the reason for the end of the game ++ checkmate (non-SAN), a note that this is the reason for the end of the game = promoted to; a pawn arriving at the opposite side of the board is promoted to another piece, often a queen. 0-0 castle on the king’s side; swap the king and the rook in positions e1 and h1 (if neither has moved before this point in the game) 0-0-0 castle on the queen’s side; swap the king and the rook in positions e1 and a1 (if neither has moved before this point in the game) e.p. en passant capture (non-SAN), a note that a pawn was taken by another pawn passing it. When a pawn’s first move is a two space move (from 7 to 5 for black or 2 to 4 for white) it can be captured my moving behind it to the 6th rank (white taking black) or 3rd rank (black taking white). ep en passant capture (non-SAN), see e.p. ?, ??, !, !! editorial comments (non-SAN), weak, very weak, strong, very strong

Here’s parts of an example game in abbreviated notation:

1. e4 e5. White pawn moves to e4 (search e3 and e2 for the pawn that could do this); black pawn moves to e5 (search e6 and e7 for a pawn that could do this)
2. Nf3 d6. White knight moves to f3 (search 8 positions: g1, h2, h4, g5, e5, d4, d2, e1 and g1 for the knight that could do this); black pawn moves to d6 (search d7 and d8 for the pawn).
3. d4 Bg4. White pawn moves from d4 (search d3 and d2 for the pawn); black bishop moves to g4 (search the four diagonals: f5, e6, d7, c8; h5; h3; f3, e3, and d3 for the bishop that could do this).
4. dxe5 Bxf3. A white pawn in d takes a piece at e5, the pawn must have been at d4, the black pawn at e5 is removed; a black bishop takes a piece at f3 (search the four radiating diagonals from f3: e4, d5, c6, b7, a8; g4, h5; g2, h1; e2, d1).
5. Qxf3 dxe5. The white queen takes the piece at f3; the black pawn in d4 takes the piece in e5.

Here’s a typical game in abbreviated notation:

```1.c4 e6 2.Nf3 d5 3.d4 Nf6 4.Nc3 Be7 5.Bg5 0-0 6.e3 h6 7.Bh4 b6
8.cxd5 Nxd5 9.Bxe7 Qxe7 10.Nxd5 exd5 11.Rc1 Be6 12.Qa4 c5
13.Qa3 Rc8 14.Bb5 a6 15.dxc5 bxc5 16.0-0 Ra7 17.Be2 Nd7
18.Nd4 Qf8 19.Nxe6 fxe6 20.e4 d4 21.f4 Qe7 22.e5 Rb8
23.Bc4 Kh8 24.Qh3 Nf8 25.b3 a5 26.f5 exf5 27.Rxf5 Nh7
28.Rcf1 Qd8 29.Qg3 Re7 30.h4 Rbb7 31.e6 Rbc7 32.Qe5 Qe8
33.a4 Qd8 34.R1f2 Qe8 35.R2f3 Qd8 36.Bd3 Qe8 37.Qe4 Nf6
38.Rxf6 gxf6 39.Rxf6 Kg8 40.Bc4 Kh8 41.Qf4 1-0```

Here’s a small game in full notation:

```1.f2-f4 e7-e5 2.f4xe5 d7-d6 3.e5xd6 Bf8xd6 4.g2-g3 Qd8-g5
5.Ng1-f3 Qg5xg3+ 6.h2xg3 Bd6xg3#```

## Algorithms for Resolving Moves¶

Algebraic notation is terse because it is focused on a human student of chess. It contains just enough information for a person to follow the game. Each individual move cannot be interpreted as a stand-alone (or “context-free” statement). Each move’s description only makes sense in the context of the game state established by all the moves that came before it. Therefore, in order to interpret a log of chess moves, we also need to maintain the state of the chess board.

Given that we have a model of the chess board, Algorithm G can locate the pieces and execute the moves an entire game.

Algorithm G

(Resolve chess moves in SAN notation, playing out the entire game.) We are given a block of text with a sequence of chess turns. Assume that line breaks have been removed and the game ending marker has been removed from the block of text.

1. Parse turn into moves. Locate move number, white move and black move. Lines that don’t have this form are some kind of commentary and can be ignored.
2. Parse each move. For each move, parse the move. Identify the piece (R, B, N, Q, K, pawn if none of these). Identify the optional file (a - h) or rank (1 - 8) for the source. Identifty the optional x for a capture. Identify the destination file (a - h) and rank (1 - 8 ). Identify any other material like + or # for checks, = x for promotions, or !, !! , ? , ?? for editorial comments.
3. Castling? If the move is simply 0-0 or O-O-O, move both the king (in file e) and the appropriate rook. For 0-0 it is the rook in file h moves to f, the king moves from e to g. For O-O-O it is the rook in file a moves to d, the king moves from e to c.
4. Fully specified from location? If a two-character from-position is given, this is the starting location. Execute the move with step 7.
5. Partially specified from location? If a one-character from-position is given (a-h or 1-8), restrict the search for the source to this rank for file. Use the piece-specific version of Algorithm S with rank or file restrictions for the search. After the starting location is found, execute the move with step 7.
6. Omitted from location? Search all possible origins for the from-position for this piece. Each piece has a unique search pattern based on the piece’s movement rules. Use the piece-specific version of Algorithm S with no restrictions for the search. After the starting location is found, execute the move with step 7.
7. Execute move. Move the piece, updating the state of the board, removing captured pieces. Periodically during game processing, print the board position. The board, by the way, is always oriented so that a1 is a dark square in the lower-left.
8. Next move. Loop to step 2, processing the black move after the white move in this turn. If the black move is omitted or is one of the ending strings, skip the black move.
9. Next turn. Loop to step 1, processing the next turn. If the turn number is omitted or is one of the ending strings, this is the end of the game.

We have to design six different kinds of searches for possible starting pieces. These searches include pawns, rooks, knights, bishops queens and the king. We’ll provide formal algorithms for pawns and rooks, and informal specifications for the other pieces.

Algorithm P

(Search for possible pawn starting locations.) Given a destination location, piece color, and optional restrictions on starting rank or starting file.

1. First move. If the destination is rank 4 (white) or rank 5 (black), search two spaces back for the first move of a pawn (from rank 7 or rank 2). If moving this piece will not put the king into check, this is the stating location.
2. Previous space. Search the previous space (rank -1 for white, rank +1 for black) for a move. If moving this piece will not put the same color king into check, this is the stating location.
3. Capture. Search the adjacent files one space back for a pawn which performed a capture. If moving this piece will not put the same color king into check, this is the stating location.
4. Error. If no source can be found, the game notation is incorrect.

Algorithm R

(Search for possible rook starting locations.) Given a destination location, piece color, and optional restrictions on starting rank or starting file.

1. To Right.
1. Initialize. Set .

2. Loop. Target position has file offset by r from destination.

3. On Board. If this is off the board, or a non-rook was found, break from this loop.

If moving this piece will not put the king into check, return this position as the stating location.

4. Loop. Increment r. Continue this loop.

1. To Left.

1. Initialize. Set .

2. Loop. Target position has file offset by r from destination.

3. On Board. If this is off the board, or a non-rook was found, break from this loop.

If moving this piece will not put the king into check, return this position as the stating location.

4. Loop. Decrement r. Continue this loop.

2. Toward Black.

1. Initialize. Set .

2. Loop. Target position has rank offset by r from destination.

3. On Board. If this is off the board, or a non-rook was found, break from this loop.

If moving this piece will not put the king into check, return this position as the stating location.

4. Loop. Increment r. Continue this loop.

3. Toward White.

1. Initialize. Set .

2. Loop. Target position has rank offset by r from destination.

3. On Board. If this is off the board, or a non-rook was found, break from this loop.

If moving this piece will not put the king into check, return this position as the stating location.

4. Loop. Decrement r. Continue this loop.

4. Error. If no source can be found, the game notation is incorrect.

Algorithm N

(Search for possible knight starting locations.) Given a destination location, piece color, and optional restrictions on starting rank or starting file.

There are as many as eight possible starting positions for a knight’s move.

1. Adjacent File. Four of the starting positions have a file offset of +1 or -1 and rank offsets of +2 or -2.

If the position is on the board, the piece is a knight, and moving this piece will not put the king into check, then this is the origin for this move.

2. Adjacent Rank. Four of the starting positions have file offsets of +2 or -2 and rank offsets of +1 or -1.

If the position is on the board, the piece is a knight, and moving this piece will not put the king into check, then this is the origin for this move.

Algorithm B

(Search for possible bishop starting locations.) Given a destination location, piece color, and optional restrictions on starting rank or starting file.

Search radially out the diagonals until edge of board or an intervening piece or the correct bishop was found.

This algorithm is similar to the rook algorithm, except the offsets apply to both rank and file. Applying +1 to both rank and file moves north-east; applying -1 to both rank and file moves south-west. Applying +1 to rank and -1 to file moves south east; applying -1 to rank and +1 to file moves north west.

When an opposing piece is found, the search along that diagonal ends.

If the position is on the board, the piece is a bishop, and moving this piece will not put the king into check, then this is the origin for this move.

Algorithm Q

Search for possible queen starting locations.) Given a destination location, piece color, and optional restrictions on starting rank or starting file.

Search radially out the ranks, files and diagonals until edge of board or an intervening piece or the correct queen was found. This combines the bishop and rook algorithms.

When an opposing piece is found, the search along that rank, file or diagonal ends.

If the position is on the board, the piece is a queen, and moving this piece will not put the king into check, then this is the origin for this move.

Algorithm K

Search for possible king starting locations.) Given a destination location, piece color, and optional restrictions on starting rank or starting file.

Search the 8 adjacent locations. These are all combinations of -1, 0, +1 for rank offset and -1, 0, +1 for file offset. Omit checking the combination with a 0 offset to rank and a 0 offset to file.

If the position is on the board, the piece is the king, this is the starting position.

## Descriptive Notation¶

Descriptive notation uses a different scheme for identifying locations on the board. Each file is named for the pieces at it’s top and bottom ends as the game begins. The board is divided into King’s side and Queen’s side. The files are KR, KKt, KB, K, Q, QB, QKt, QR. These are known as a, b, c, d, e, f, g, h in Algebraic notation.

The ranks are counted from the player’s point of view, from their back row to the far row. Consequently, white’s row 1 is black’s row 8. White’s Q1 is Black’s Q8; Black’s KB5 is White’s KB4.

The notation has the following format:

```piece [ (file rank) ] move [file rank] [note]
```

The symbol for the piece to be moved is one of p, B, N, R, Q, K.

If capturing, the move is x, followed by the symbol of the captured piece. Examples: pxp, NxQ. A search is required to determine which piece can be taken.

If not capturing, the move is -, followed by file rank to name the square moved to, from the perspective of whoever is moving, black or white

If 2 pieces are both be described by a move or capture, write the location of the intended piece in parentheses. Examples: p(Q4)xR means pawn at queen’s rook four takes Rook, N(KB3)-K5 means knight at KB3 moves to K5

Special moves include king’s side castling, designated O-O, Queen’s side castling, designated O-O-O.

Notes. If a pawn captures en passant or in passing it is designated ep in the note. A move resulting in a check of the king is followed by ch in the note. ! means good move; ? means bad move in the note .

If the pawn in front of the king is moved forward two spaces, it is described as P-K4. If the pawn in front of the queenside knight is moved forward one space, it is P-QN3. If a knight at K5 captures a rook on Q7, it would be NxR or if clarification is needed, NxR(Q7) or N(K5)xR.

## Game State¶

In order to process a log, a model of the chess board’s current position is essential. In addition to the basic 64 squares containing the pieces, several additional facts are necessary to capture the game state. The current state of a chess game is a 6-tuple of the following items:

Piece Placement. An 8-tuple shows pieces in each rank from 8 down to 1. Pieces are shown as single letters, upper case for white (PRNBQK), lower case for black (prnbqk). Pieces are coded P for pawn, R for rook, N for knight, B for bishop, Q for queen and K for king. Empty spaces are shown as the number of contiguous spaces.

The entire rank can be coded as a 1-8 character string. 8 means no pieces in this rank. 4p3 means four empty spaces (a-d), a black pawn in file e, and 3 empty spaces (f-h).

The entire 8-tuple of strings can be joined to make a string delimited by / ‘s. For example rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R.

Active Color. A value (w or b) showing who’s turn to move is next. This color is active because this player is contemplating their move. The starting position, for instance, has an active color of w, because white moves first.

Castling Availability. A string with 1 to 4 characters showing which castling moves are still allowed. If none are allowed, a - is shown. White codes are capital letters, black are lower case. When king side castling is possible, a K (white) or k (black) is included. When queen side caslting is possible a Q (white) or q (black) is included. At the start of the game, there are four characters: KQkq. As the game progress and kings castle or are moved for other reason, or rooks are moved, the string reduces in size to -.

En Passant Target. Either - or a square in rank 6 or 3. When a pawn’s first move advances two spaces (from 7 to 5 for black or 2 to 4 for white), the skipped-over space is named here on the next turn only. If an opposing pawn moves to this space, an En Passant capture has occured. If no En Passant vulnerability, a - is given.

Half Move Count. How many 1/2 moves since a pawn was advanced or a piece captures. This is zero after a pawn moves or a piece is captured. This is incremented after each 1/2 move (white or black) where no pawn moves and no piece is captured. When this reaches 50, the game is technically a draw.

Turn. This is the turn count, it increments from 1, by 1, after black’s move.

## PGN Processing Specifications¶

There are three parts to a PGN processing program: the parsing of a PGN input file, the resolution of moves, and maintenance of the game state. Each can be dealt with separately with suitable interfaces. Each of these modules can be built and tested in isolation.

First, some preliminaries. In order to resolve moves, the game state must be kept. This is a dictionary of locations and pieces, plus the five other items of information that characterize the game state: active color (w or b), castling availability, en passant target, half-move draw count and turn number. The board has an interface that accepts a move and executes that move, updating the various elements of board state.

Moves can use the Command design pattern to separate king-side castle, queen-side castle, moves, captures and promotions. The Board object will require a fully-specified move with source location and destination location. The source location is produced by the source resolution algorithm.

A well-defined Board object could be used either for a single-player game (against the computer) or as part of a chess game server for two-player games.

Second, the hard part. Resolution of short notation moves requires several algorithms to locate the piece that made the move. Based on input in algebraic notation, a move can be transformed from a string into a 7-tuple of color, piece, fromHint, moveType, toPosition, checkIndicator and promotionIndicator.

• The color is either w or b.
• The piece is omitted for pawns, or one of RNBQK for the other pieces.
• The fromHint is the from position, either a file and rank or a file alone or a rank alone. The various search algorithms are required to resolve the starting piece and location from an incomplete hint.
• The moveType is either omitted for a simple move or x for a capturing move.
• The toPosition is the rank and file at which the piece arrives.
• The checkIndicator is either nothing, + or #.
• The promotionIndicator is either nothing or a new piece name from QBRK.

This information is used by Algorithm G to resolve the full starting position information for the move, and then execute the move, updating the board position.

Finally, input parsing and reporting. A PGN file contains a series of games.

Each game begins with identification tags of the form [Label "value"]. The labels include names like Event, Site, Date, Round, White, Black, Result. Others labels may be present.

After the identification tags is a blank line followed by the text of the moves, called the “movetext”. The movetext is supposed to be SAN (short notation), but some files are LAN (long notation). The moves should end with the result ( 1-0, 0-1, *, or 1/2-1/2), followed by 1 or more blank lines.

Here’s a sample game from a recent tournament.

```[Event "25th European Club Cup 2009"]
[Date "2009.10.04"]
[Round "1"]
[Board "1"]
[White "Aronian, Levon"]
[Black "Docx, Stefan"]
[Result "1-0"]

1.d4 d6 2.Nf3 g6 3.e4 Bg7 4.Bc4 Nf6 5.Qe2 O-O 6.O-O Bg4 7.Rd1 Nc6 8.h3 Bxf3
9.Qxf3 Nd7 10.c3 e5 11.Be3 a6 12.Na3 exd4 13.cxd4 Qh4 14.Rac1 Nf6 15.Bd3 Rfe8
16.Rc4 Nd7 17.Bb1 Rad8 18.b4 Nb6 19.Rcc1 d5 20.e5 Qe7 21.Nc2 Nc4 22.a3 b5 23.Ba2
Nb8 24.Re1 c6 25.Qg3 Qf8 26.h4 Nd7 27.h5 Ra8 28.Rcd1 a5 29.Bc1 Qe7 30.Bb1 Qe6
31.hxg6 fxg6 32.Rd3 axb4 33.Nxb4 c5 34.Nc2 Qc6 35.f4 Qb6 36.Qf3 Rad8 37.Ne3 Nxe3
38.dxc5 Nxc5 39.Bxe3 Qa5 40.Bd2 Qb6 41.Qf2 Re6 42.Rh3```

Design Considerations. In order to handle various forms for the movetext, there have to be two move parsing classes with identical interfaces. These polymorphic classes implement long-notation and short-notation parsing. In the event that a short-notation parser object fails, then the long-notation parser object can be used instead. If both fail, the file is invalid.

A PGN processing program should be able to read in a file of games, execute the moves, print logs in various forms (SAN, LAN and Descriptive), print board positions in various forms. The program should also be able to convert files from LAN or Descriptive to SAN. Additionally, the processor should be able to validate logs, and produce error messages when the chess notation is invalid.

Additionally, once the basic PGN capabilities are in place, a program can be adapted to do analysis of games. For instance it should be able to report only games that have specific openings, piece counts at the end, promotions to queen, castling, checks, etc.