Baseball data bank

baseball analytics

Baseball Databank
Statement of Purpose

Desktop computers, powerful database programs and the Internet have
combined to fuel a revolution in baseball research. Legions now scour
the vast numerical legacy of the sport, finding new ways to look at
the game and how it has been played. Their efforts have resulted in
fundamental changes in the ways in which the game is analyzed, and
increasingly, how it is played on the field.

At the same time, software developers are slicing and dicing the
numbers, providing information freshly drawn from history's dusty
warrens for a market that craves the new, the clever, and the
useful. But there is a problem. Over the years baseball's numbers, its
arcane but integral history, have been compiled by a series of
organizations that didn't always maintain the strictest accuracy, and
didn't always see to it that mistakes were corrected
diligently. Errors introduced into the game's historical record years
ago remain, while other pertinent biographical and performance data
weren't collected at all.

The Baseball Databank (BDB) is dedicated to creating and maintaining a
comprehensive record of all baseball statistical data in a form that
makes them useful for researchers and product developers. This
databank, once it is fully normalized and proofed, will be the
standard source for those professionals creating new data products. By
providing this data to the public in a free and open format, the BDB
hopes to encourage the development of third-party applications,
including web sites, standalone query tools, games and simulations. Or
perhaps something completely different. The ultimate purpose is to
extend our understanding of the game of baseball.

The Baseball Databank's master file of names will include records for
all those who have played, coached, managed, umpired or worked as an
executive for a major league baseball team throughout history, with
biographical information and a comprehensive set of statistics
detailing their annual performances at all levels of baseball,
including but not limited to Major League Baseball, Minor League
Baseball, the Negro Leagues, Japanese Leagues, Mexican Leagues, NCAA
and other worthy and significant international leagues. Whenever
possible, the Baseball
Databank will also maintain a complete set of records, or allow others
to link their databases, for all players in all leagues, including
those who never played at the major league level. The BDB will also
summarize these individual seasonal totals on a team level, including
standings and post-season records.
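
As a rough illustration of the team-level summary described above, the
following Python sketch adds up individual seasonal batting lines into
one total per season and team. The record layout, the field names
(player_id, year, team_id, at_bats, hits, home_runs) and the sample
rows are assumptions made for this example; they are not the Baseball
Databank's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BattingSeason:
    # Hypothetical field names, chosen for illustration only.
    player_id: str
    year: int
    team_id: str
    at_bats: int
    hits: int
    home_runs: int


def team_totals(seasons):
    """Sum individual batting lines into one total per (year, team_id)."""
    totals = defaultdict(lambda: {"at_bats": 0, "hits": 0, "home_runs": 0})
    for s in seasons:
        row = totals[(s.year, s.team_id)]
        row["at_bats"] += s.at_bats
        row["hits"] += s.hits
        row["home_runs"] += s.home_runs
    return dict(totals)


# Made-up sample rows, not real statistics.
SAMPLE = [
    BattingSeason("player_a", 2003, "AAA", 500, 150, 20),
    BattingSeason("player_b", 2003, "AAA", 400, 100, 5),
    BattingSeason("player_a", 2004, "BBB", 450, 135, 18),
]

if __name__ == "__main__":
    for (year, team_id), row in sorted(team_totals(SAMPLE).items()):
        print(year, team_id, row)
```

Keying the totals on the (year, team_id) pair mirrors the idea of a
database organized around annual seasons: a player who changes teams
contributes to a different team total in each season.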

The Baseball Databank's database will be organized around the concept
of annual seasons and the BDB will maintain the annual stats. They
will be available to anyone who agrees to the Baseball Databank
license. The organization is staffed entirely by a volunteer group of
interested individuals who have compiled, designed and proofed the
most complete and accurate record of baseball history in
existence. The BDB will help people who want to add further
elements, such as game-by-game records or pitch-by-pitch accounts
that may be of interest to smaller audiences. But first and foremost,
the Baseball Databank is a library of authoritative baseball
statistics and information maintained in a simple-to-access format for
information providers and baseball researchers.

This task will be ongoing and never completed, but we will continually
strive to reach the goals listed above.

Written October 18, 2003
by Peter Kreutzer and Sean Forman (ed.)
