API documentation for the database interface

This API is thread safe (it uses a multiple reader, single writer locking scheme). You can access this API like this:

from calibre.library import db
db = db('Path to calibre library folder').new_api

If you are in a calibre plugin that is part of the main calibre GUI, you get access to it like this instead:

db = self.gui.current_db.new_api

class calibre.db.cache.Cache(backend)[source]

An in-memory cache of the metadata.db file from a calibre library. This class also serves as a threadsafe API for accessing the database. The in-memory cache is maintained in normal form for maximum performance.

SQLite is used simply as a robust way to read from and write to metadata.db. All table reading/sorting/searching/caching logic is re-implemented. This was necessary for maximum performance and flexibility.

add_books(books, add_duplicates=True, apply_import_tags=True, preserve_uuid=False, run_hooks=True, dbapi=None)[source]

Add the specified books to the library. Books should be an iterable of 2-tuples, each 2-tuple of the form (mi, format_map) where mi is a Metadata object and format_map is a dictionary of the form {fmt: path_or_stream}, for example: {'EPUB': '/path/to/file.epub'}.

Returns a pair of lists: ids, duplicates. ids contains the book ids for all newly created books in the database. duplicates contains the (mi, format_map) for all books that already exist in the database as per the simple duplicate detection heuristic used by has_book().
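
As a rough illustration of the input shape described above, the following sketch pairs each Metadata object with a format_map and unpacks the two returned lists. The helper name add_epubs and its (mi, path) input pairs are hypothetical and chosen only for this example; db is assumed to be a Cache instance obtained as shown at the top of this page.

```python
def add_epubs(db, items, add_duplicates=False):
    """Add EPUB files to the library.

    items is an iterable of (mi, path) pairs, where mi is a Metadata
    object and path points to an EPUB file (a hypothetical input shape
    used only in this sketch).
    """
    # add_books() expects 2-tuples of the form (mi, format_map).
    books = [(mi, {'EPUB': path}) for mi, path in items]
    ids, duplicates = db.add_books(books, add_duplicates=add_duplicates)
    return ids, duplicates
```

With add_duplicates=False, books matched by the duplicate heuristic should come back in duplicates rather than being added, so a caller can decide what to do with them.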

add_custom_book_data(name, val_map, delete_first=False)[source]

Add data for name where val_map is a map of book_ids to values. If delete_first is True, all previously stored data for name will be removed.

add_format(book_id, fmt, stream_or_path, replace=True, run_hooks=True, dbapi=None)[source]

Add a format to the specified book. Return True if the format was added successfully.

  • replace – If True replace existing format, otherwise if the format already exists, return False.
  • run_hooks – If True, file type plugins are run on the format before and after being added.
  • dbapi – Internal use only.

all_book_ids(type=<type 'frozenset'>)[source]

Frozen set of all known book ids.

all_field_for(field, book_ids, default_value=None)[source]

Same as field_for, except that it operates on multiple books at once.


Frozen set of ids for all values in the field name.


Frozen set of all field names (should only be used for many-one and many-many fields)


Return author data as a dictionary with keys: name, sort, link

If no authors with the specified ids are found an empty dictionary is returned. If author_ids is None, data for all authors is returned.

author_sort_from_authors(authors, key_func=<function lower>)[source]

Given a list of authors, return the author_sort string for the authors, preferring the author sort associated with the author over the computed string.

books_for_field(name, item_id)[source]

Return all the books associated with the item identified by item_id, where the item belongs to the field name.

Returned value is a set of book ids, or the empty set if the item or the field does not exist.

books_in_virtual_library(vl, search_restriction=None)[source]

Return the set of books in the specified virtual library

copy_cover_to(book_id, dest, use_hardlink=False, report_file_size=None)[source]

Copy the cover to the file-like object dest. Returns False if no cover exists or dest is the same file as the current cover. dest can also be a path, in which case the cover is copied to it if and only if the path is different from the current path (taking case sensitivity into account).

copy_format_to(book_id, fmt, dest, use_hardlink=False, report_file_size=None)[source]

Copy the format fmt to the file-like object dest. If the specified format does not exist, raises a NoSuchFormat error. dest can also be a path, in which case the format is copied to it if and only if the path is different from the current path (taking case sensitivity into account).
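
A minimal sketch of exporting a format through a file-like object, assuming db is a Cache instance. The helper name export_format is hypothetical. It checks has_format() first so that the NoSuchFormat error does not need to be caught; note that this check and the copy are two separate calls, so the format could in principle disappear between them.

```python
def export_format(db, book_id, fmt, dest_path):
    """Copy one format of a book to dest_path.

    Returns True if the format was copied, False if the book has no
    such format.
    """
    if not db.has_format(book_id, fmt):
        return False
    # Pass an open binary file object as dest.
    with open(dest_path, 'wb') as dest:
        db.copy_format_to(book_id, fmt, dest)
    return True
```

Going through copy_format_to() rather than a path from format_abspath() keeps the file access inside the API's locking.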

cover(book_id, as_file=False, as_image=False, as_path=False)[source]

Return the cover image or None. By default, returns the cover as a bytestring.

WARNING: Using as_path will copy the cover to a temp file and return the path to the temp file. You should delete the temp file when you are done with it.

  • as_file – If True return the image as an open file object (a SpooledTemporaryFile)
  • as_image – If True return the image as a QImage object
  • as_path – If True return the image as a path pointing to a temporary file
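
The warning about as_path can be handled with a small wrapper that always removes the temporary file. This is only a sketch: the helper name with_cover_path is hypothetical, db is assumed to be a Cache instance, and the code assumes that cover() returns None when the book has no cover, as the description above suggests.

```python
import os

def with_cover_path(db, book_id, func):
    """Call func(path) with a temporary copy of the cover, then delete it."""
    path = db.cover(book_id, as_path=True)
    if path is None:
        return None  # the book has no cover
    try:
        return func(path)
    finally:
        # The caller owns the temp file, so clean it up even on errors.
        os.remove(path)
```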

Return data that can be used to implement find_identical_books() in a worker process without access to the db. See db.utils for an implementation.


Return data suitable for use in has_book(). This can be used for an implementation of has_book() in a worker process without access to the db.

delete_custom_book_data(name, book_ids=())[source]

Delete data for name. By default, all data is deleted; to delete data for only some books, pass in a list of book ids.

embed_metadata(book_ids, only_fmts=None, report_error=None, report_progress=None)[source]

Update metadata in all formats of the specified book_ids to current metadata in the database.

fast_field_for(field_obj, book_id, default_value=None)[source]

Same as field_for, except that it avoids the extra lookup to get the field object.

field_for(name, book_id, default_value=None)[source]

Return the value of the field name for the book identified by book_id. If no such book exists or it has no defined value for the field name or no such field exists, then default_value is returned.

default_value is not used for title, title_sort, authors, author_sort and series_index. This is because these always have values in the db. default_value is used for all custom columns.

The returned values for is_multiple fields are always tuples, even when no values are found (in other words, default_value is ignored). The exception is identifiers, for which the returned value is always a dict. The returned tuples are always in link order, that is, the order in which they were created.
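
The return-value rules above can be seen in a short loop over the library. This is a sketch that assumes db is a Cache instance; the helper name titles_and_tags is hypothetical.

```python
def titles_and_tags(db):
    """Map each book id to a (title, tags) pair."""
    result = {}
    for book_id in db.all_book_ids():
        # title always has a value, so default_value is not needed.
        title = db.field_for('title', book_id)
        # tags is an is_multiple field: a tuple, empty if there are no tags.
        tags = db.field_for('tags', book_id)
        result[book_id] = (title, tags)
    return result
```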

field_ids_for(name, book_id)[source]

Return the ids (as a tuple) for the values that the field name has on the book identified by book_id. If there are no values, or no such book, or no such field, an empty tuple is returned.

find_identical_books(mi, search_restriction=u'', book_ids=None)[source]

Finds books that have a superset of the authors in mi and the same title (title is fuzzy matched). See also data_for_find_identical_books().

format(book_id, fmt, as_file=False, as_path=False, preserve_filename=False)[source]

Return the e-book format as a bytestring or None if the format doesn’t exist, or we don’t have permission to write to the e-book file.

  • as_file – If True the e-book format is returned as a file object. Note that the file object is a SpooledTemporaryFile, so if what you want to do is copy the format to another file, use copy_format_to() instead for performance.
  • as_path – Copies the format file to a temp file and returns the path to the temp file
  • preserve_filename – If True and returning a path the filename is the same as that used in the library. Note that using this means that repeated calls yield the same temp file (which is re-created each time)

format_abspath(book_id, fmt)[source]

Return the absolute path to the e-book file of format fmt. You should almost never use this, as it breaks the threadsafe promise of this API. Instead, use copy_format_to().

Currently used only in calibredb list, the viewer, edit book, compare_format to original format, open with and the catalogs (via get_data_as_dict()).

Apart from the viewer, open with and edit book, I don’t believe any of the others do any file write I/O with the results of this call.

format_hash(book_id, fmt)[source]

Return the hash of the specified format for the specified book. The kind of hash is backend dependent, but is usually SHA-256.

format_metadata(book_id, fmt, allow_cache=True, update_db=False)[source]

Return the path, size and mtime for the specified format for the specified book. You should not use path unless you absolutely have to, since accessing it directly breaks the threadsafe guarantees of this API. Instead use the copy_format_to() method.

  • allow_cache – If True cached values are used, otherwise a slow filesystem access is done. The cached values could be out of date if the filesystem was accessed outside of this API.
  • update_db – If True, the max_size field of the database is updated for this book.

formats(book_id, verify_formats=True)[source]

Return tuple of all formats for the specified book. If verify_formats is True, verifies that the files exist on disk.

get_categories(sort=u'name', book_ids=None, already_fixed=None, first_letter_sort=False)[source]

Used internally to implement the Tag Browser

get_custom_book_data(name, book_ids=(), default=None)[source]

Get data for name. By default, returns data for all book_ids; pass in a list of book ids if you only want some of the data. Returns a map of book_id to values. If a particular value could not be decoded, default is used for it.
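
A possible round trip through add_custom_book_data() and get_custom_book_data(), assuming db is a Cache instance. The helper name annotate_books and the storage key 'example_plugin_notes' are hypothetical, and the sketch assumes the stored values are simple ones such as strings.

```python
def annotate_books(db, notes, name='example_plugin_notes'):
    """Store one note per book under name, then read the notes back.

    notes maps book_id to a value; name is a hypothetical storage key.
    """
    # val_map is a map of book ids to values.
    db.add_custom_book_data(name, notes)
    # Restrict the read to the books that were just written.
    return db.get_custom_book_data(name, book_ids=list(notes), default=None)
```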


Return a mapping of id numbers to values for the specified field. The field must be a many-one or many-many field, otherwise a ValueError is raised.


Return the set of book ids for which name has data.

get_item_id(field, item_name)[source]

Return the item id for item_name (case-insensitive)

get_item_ids(field, item_names)[source]

Return the item ids for the names in item_names (case-insensitive)

get_item_name(field, item_id)[source]

Return the item name for the item specified by item_id in the specified field. See also get_id_map().

get_metadata(book_id, get_cover=False, get_user_categories=True, cover_as_data=False)[source]

Return metadata for the book identified by book_id as a calibre.ebooks.metadata.book.base.Metadata object. Note that the list of formats is not verified. If get_cover is True, the cover is returned, either as a path to a temp file in mi.cover or, if cover_as_data is True, as mi.cover_data.

get_next_series_num_for(series, field=u'series', current_indices=False)[source]

Return the next series index for the specified series, taking into account the various preferences that control next series number generation.

  • field – The series-like field (defaults to the builtin series column)
  • current_indices – If True, returns a mapping of book_id to current series_index value instead.

Like get_metadata() except that it returns a ProxyMetadata object that only reads values from the database on demand. This is much faster than get_metadata when only a small number of fields need to be accessed from the returned metadata object.


Return a mapping of id to usage count for all values of the specified field, which must be a many-one or many-many field.


Return True iff the database contains an entry with the same title as the passed in Metadata object. The comparison is case-insensitive. See also data_for_has_book().

has_format(book_id, fmt)[source]

Return True iff the format exists on disk


Return True iff the specified book_id exists in the db


Initialize this cache with data from the backend.

multisort(fields, ids_to_sort=None, virtual_fields=None)[source]

Return a list of sorted book ids. If ids_to_sort is None, all book ids are returned.

fields must be a list of 2-tuples of the form (field_name, ascending=True or False). The most significant field is the first 2-tuple.
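
For instance, a two-level sort could be requested as in the sketch below, which assumes db is a Cache instance. The helper name ids_by_author_then_title is hypothetical; the field names are ones mentioned elsewhere on this page.

```python
def ids_by_author_then_title(db, ids_to_sort=None):
    """Sort by author_sort ascending, then by title descending."""
    fields = [
        ('author_sort', True),   # most significant field comes first
        ('title', False),        # False means descending
    ]
    return db.multisort(fields, ids_to_sort=ids_to_sort)
```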

pref(name, default=None)[source]

Return the value for the specified preference or the value specified as default if the preference is not set.


Return the OPF metadata backup for the book as a bytestring or None if no such backup exists.

remove_books(book_ids, permanent=False)[source]

Remove the books specified by the book_ids from the database and delete their format files. If permanent is False, then the format files are placed in the recycle bin.

remove_formats(formats_map, db_only=False)[source]

Remove the specified formats from the specified books.

  • formats_map – A mapping of book_id to a list of formats to be removed from the book.
  • db_only – If True, only remove the record for the format from the db, do not delete the actual format file from the filesystem.
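
The formats_map argument could be built along these lines. This is a sketch only: db is assumed to be a Cache instance, the helper name drop_format is hypothetical, and fmt is assumed to be written in the same case that formats() returns.

```python
def drop_format(db, fmt, book_ids, db_only=False):
    """Remove fmt from every listed book that has it.

    Returns the formats_map that was passed to remove_formats().
    """
    formats_map = {}
    for book_id in book_ids:
        if fmt in db.formats(book_id, verify_formats=False):
            # Each book id maps to a list of formats to remove.
            formats_map[book_id] = [fmt]
    if formats_map:
        db.remove_formats(formats_map, db_only=db_only)
    return formats_map
```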

remove_items(field, item_ids, restrict_to_book_ids=None)[source]

Delete all items in the specified field with the specified ids. Returns the set of affected book ids. restrict_to_book_ids is an optional set of books ids. If specified the items will only be removed from those books.

rename_items(field, item_id_to_new_name_map, change_index=True, restrict_to_book_ids=None)[source]

Rename items from a many-one or many-many field such as tags or series.

  • change_index – When renaming in a series-like field also change the series_index values.
  • restrict_to_book_ids – An optional set of book ids for which the rename is to be performed, defaults to all books.

restore_book(book_id, mi, last_modified, path, formats)[source]

Restore the book entry in the database for a book that already exists on the filesystem

restore_original_format(book_id, original_fmt)[source]

Restore the specified format from the previously saved ORIGINAL_FORMAT, if any. Return True on success. The ORIGINAL_FORMAT is deleted after a successful restore.


A safe read lock is a lock that does nothing if the thread already has a write lock, otherwise it acquires a read lock. This is necessary to prevent DowngradeLockErrors, which can happen when updating the search cache in the presence of composite columns. Updating the search cache holds an exclusive lock, but searching a composite column involves reading field values via ProxyMetadata which tries to get a shared lock. There may be other scenarios that trigger this as well.

This property returns a new lock object on every access. This lock object is not recursive (for performance) and must only be used in a with statement, as in "with cache.safe_read_lock:", otherwise bad things will happen.

save_original_format(book_id, fmt)[source]

Save a copy of the specified format as ORIGINAL_FORMAT, overwriting any existing ORIGINAL_FORMAT.

search(query, restriction=u'', virtual_fields=None, book_ids=None)[source]

Search the database for the specified query, returning a set of matched book ids.

  • restriction – A restriction that is ANDed to the specified query. Note that restrictions are cached, therefore the search for a AND b will be slower than a with restriction b.
  • virtual_fields – Used internally (virtual fields such as on_device to search over).
  • book_ids – If not None, a set of book ids for which books will be searched instead of searching all books.
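
A sketch of combining search() with multisort(), assuming db is a Cache instance. The helper name search_sorted_by_title is hypothetical. Passing the narrowing condition as restriction rather than folding it into the query lets the cached restriction be reused, as noted above.

```python
def search_sorted_by_title(db, query, restriction=''):
    """Return the ids matching query, ordered by title."""
    # search() returns a set of matched book ids.
    matched = db.search(query, restriction=restriction)
    # Only sort the ids that matched.
    return db.multisort([('title', True)], ids_to_sort=matched)
```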

set_conversion_options(options, fmt=u'PIPE')[source]

options must be a map of the form {book_id:conversion_options}


Set the cover for this book. data can be either a QImage, QPixmap, file object or bytestring. It can also be None, in which case any existing cover is removed.

set_field(name, book_id_to_val_map, allow_case_change=True, do_path_update=True)[source]

Set the values of the field specified by name. Returns the set of all book ids that were affected by the change.

  • book_id_to_val_map – Mapping of book_ids to values that should be applied.
  • allow_case_change – If True, the case of many-one or many-many fields will be changed. For example, if a book has the tag tag1 and you set the tag for another book to Tag1 then both books will have the tag Tag1 if allow_case_change is True, otherwise they will both have the tag tag1.
  • do_path_update – Used internally, you should never change it.
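
As an illustration of book_id_to_val_map, the sketch below appends a tag to several books. It assumes db is a Cache instance and that set_field() accepts a tuple of tag names as the value for the tags field; the helper name add_tag is hypothetical.

```python
def add_tag(db, book_ids, tag):
    """Add tag to each book that does not already have it.

    Returns the set of book ids reported as affected by set_field().
    """
    book_id_to_val_map = {}
    for book_id in book_ids:
        current = db.field_for('tags', book_id)  # a tuple, possibly empty
        if tag not in current:
            book_id_to_val_map[book_id] = current + (tag,)
    if not book_id_to_val_map:
        return set()
    return db.set_field('tags', book_id_to_val_map)
```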

set_metadata(book_id, mi, ignore_errors=False, force_changes=False, set_title=True, set_authors=True, allow_case_change=False)[source]

Set metadata for the book identified by book_id from the Metadata object mi.

Setting force_changes=True will force set_metadata to update fields even if mi contains empty values. In this case, ‘None’ is distinguished from ‘empty’. If mi.XXX is None, the XXX is not replaced, otherwise it is. The tags, identifiers, and cover attributes are special cases. Tags and identifiers cannot be set to None so they will always be replaced if force_changes is true. You must ensure that mi contains the values you want the book to have. Covers are always changed if a new cover is provided, but are never deleted. Also note that force_changes has no effect on setting title or authors.

set_pref(name, val)[source]

Set the specified preference to the specified value. See also pref().

tags_older_than(tag, delta=None, must_have_tag=None, must_have_authors=None)[source]

Return the ids of all books having the tag tag that are older than the specified time. tag comparison is case insensitive.

  • delta – A timedelta object or None. If None, then all ids with the tag are returned.
  • must_have_tag – If not None the list of matches will be restricted to books that have this tag
  • must_have_authors – A list of authors. If not None the list of matches will be restricted to books that have these authors (case insensitive).
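
A small sketch of passing a timedelta as delta, assuming db is a Cache instance. The helper name stale_ids is hypothetical. The result is wrapped in set() so that the caller gets a concrete collection whatever kind of iterable the method returns.

```python
from datetime import timedelta

def stale_ids(db, tag, days):
    """Return the ids of books with tag that are older than days days."""
    delta = timedelta(days=days)
    return set(db.tags_older_than(tag, delta=delta))
```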

user_categories_for_books(book_ids, proxy_metadata_map=None)[source]

Return the user categories for the specified books. proxy_metadata_map is optional and is useful for a performance boost, in contexts where a ProxyMetadata object for the books already exists. It should be a mapping of book_ids to their corresponding ProxyMetadata objects.