The ebook editing tools consist of a
calibre.ebooks.oeb.polish.container.Container object that represents a
book as a collection of HTML + resource files, and various tools that can be
used to perform operations on the container. All the tools are in the form of
module level functions in the various
You obtain a container object for a book at a path like this:
    from calibre.ebooks.oeb.polish.container import get_container
    container = get_container('Path to book file', tweak_mode=True)
If you are writing a plugin for the ebook editor, you get the current container for the book being edited like this:
    from calibre.gui2.tweak_book import current_container
    container = current_container()
    if container is None:
        report_error  # No book has been opened yet
Container(rootpath, opfpath, log, clone_data=None)
A container represents an Open EBook as a directory full of files and an opf file. There are three important concepts:
- The root directory. This is the base of the ebook. All the ebook’s files are inside this directory or in its sub-directories.
- Names: These are paths to the book’s files relative to the root directory. They always contain POSIX separators and are unquoted. They can be thought of as canonical identifiers for files in the book. Most methods on the container object work with names. Names are always in the NFC unicode normal form.
- Clones: the container object supports efficient on-disk cloning, which is used to implement checkpoints in the ebook editor. In order to make this work, you should never access files on the filesystem directly. Instead, use open() to read/write to component files in the book.
When converting between hrefs and names, use the methods provided by this class; they assume all hrefs are quoted.
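The relationship between quoted hrefs and canonical names can be illustrated with a minimal standard-library sketch. This is not calibre's implementation: the function names `sketch_href_to_name` and `sketch_name_to_href` are hypothetical, fragments are simply dropped, and in real code you would call the container's own conversion methods instead.

```python
import posixpath
import unicodedata
from urllib.parse import quote, unquote


def sketch_href_to_name(href, base=None):
    """Turn a quoted href (relative to the file named base) into a name.

    Hypothetical helper for illustration only. base is a canonical name
    or None, in which case the href is taken as relative to the root.
    """
    # Drop any fragment, then undo URL quoting
    path = unquote(href.partition('#')[0])
    base_dir = posixpath.dirname(base) if base else ''
    # Resolve against the folder of base and collapse '..' segments
    name = posixpath.normpath(posixpath.join(base_dir, path))
    # Names are always in NFC unicode normal form
    return unicodedata.normalize('NFC', name)


def sketch_name_to_href(name, base=None):
    """Turn a canonical name into a quoted href relative to base."""
    base_dir = posixpath.dirname(base) if base else ''
    rel = posixpath.relpath(name, base_dir or '.')
    return quote(rel)
```

For instance, an href of `../images/my%20cover.png` found inside `text/ch1.html` corresponds to the name `images/my cover.png`, and converting that name back relative to the same file gives the quoted href again.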
Convert an absolute path to a canonical name relative to
Parameters: root – The base directory. By default the root for this container object is used.
add_file(name, data, media_type=None, spine_index=None, modify_name_if_needed=False, process_manifest_item=None)
Add a file to this container. Entries for the file are automatically created in the OPF manifest and spine (if the file is a text document)
Add an entry to the manifest for a file with the specified name. Returns the manifest id.
Ensure that the specified properties are set on only the manifest item identified by name. You can pass None as the name to remove the property from all items.
The type of book (epub for EPUB files and azw3 for AZW3 files)
Commit all dirtied parsed objects to the filesystem and write out the ebook file at outpath.
Commit a parsed object to disk (it is serialized and written to the
underlying file). If
keep_parsed is True the parsed representation
is retained in the cache. See also:
True iff a file/directory corresponding to the canonical name exists. Note
that this function suffers from the limitations of the underlying OS
filesystem, in particular case (in)sensitivity. So on a case
insensitive filesystem this will return True even if the case of name
is different from the case of the underlying filesystem file. See also
Return the size in bytes of the file represented by the specified
canonical name. Automatically handles dirtied parsed objects. See also:
generate_item(name, id_prefix=None, media_type=None, unique_href=True)
Add an item to the manifest with href derived from the given name. Ensures uniqueness of href and id automatically. Returns generated item.
Similar to open() except that it returns a file path, instead of an open file object.
Mapping of guide type to canonical name
Return True iff a file with the same canonical name as that specified exists. Unlike
exists() this method is always case-sensitive.
Convert an href (relative to base) to a name. base must be a name or None, in which case self.root is used.
insert_into_xml(parent, item, index=None)
Insert item into parent (or append if index is None), fixing indentation. Only works with self closing items.
If this container represents an unzipped book (a directory)
Iterate over all links in name. If get_line_numbers is True, this yields results of the form (link, line_number, offset), where line_number is the line number at which the link occurs and offset is the number of characters from the start of the line. Note that offset could actually encompass several lines if not zero.
Ensure that name does not already exist in this book. If it does, return a modified version that does not exist.
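One way to picture the "return a modified version that does not exist" behaviour is the sketch below. It is an illustration under assumptions, not calibre's code: the helper name `sketch_make_name_unique`, the `-N` suffix scheme, and the case-insensitive comparison are all choices made here for the example.

```python
import posixpath
from itertools import count


def sketch_make_name_unique(name, existing_names):
    """Return name unchanged if it is free, else a numbered variant.

    Hypothetical helper: existing_names is any iterable of canonical
    names already present in the book.
    """
    # Assumption: compare case-insensitively so the result is also safe
    # on case-insensitive filesystems
    taken = {n.lower() for n in existing_names}
    if name.lower() not in taken:
        return name
    stem, ext = posixpath.splitext(name)
    for i in count(1):
        candidate = f'{stem}-{i}{ext}'
        if candidate.lower() not in taken:
            return candidate
```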
Return True if the manifest has an entry corresponding to name
Mapping of manifest id to canonical names
The names of all manifest items whose media-type matches predicate. predicate can be a set, a list, a string or a function taking a single argument, which will be called with the media-type.
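The four accepted predicate forms (set, list, string, function) can all be reduced to a single callable. The sketch below shows that idea on a plain dictionary; the function name `sketch_items_of_type` and the `manifest` mapping of name to media-type are hypothetical stand-ins, and details such as case handling in calibre itself may differ.

```python
def sketch_items_of_type(manifest, predicate):
    """Return the names in manifest whose media-type matches predicate.

    manifest is a hypothetical dict of canonical name -> media-type.
    predicate may be a string, a set or list of strings, or a function
    that takes a media-type and returns True or False.
    """
    if isinstance(predicate, str):
        wanted = predicate

        def matches(media_type):
            return media_type == wanted
    elif callable(predicate):
        matches = predicate
    else:
        allowed = set(predicate)

        def matches(media_type):
            return media_type in allowed
    return [name for name, media_type in manifest.items() if matches(media_type)]
```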
All manifest items that have the specified property
Mapping of manifest media-type to list of canonical names of that media-type
The metadata of this book as a Metadata object. Note that this object is constructed on the fly every time this property is requested, so use it sparingly.
Convert a canonical name to an absolute OS-dependent path
Convert a name to a href relative to base, which must be a name or None in which case self.root is used as the base
Set of names that must never be renamed. Depends on the ebook file format.
Set of names that must never be deleted from the container. Depends on the ebook file format.
Set of names that are allowed to be missing from the manifest. Depends on the ebook file format.
Open the file pointed to by name for direct read/write. Note that this will commit the file if it is dirtied and remove it from the parse cache. You must finish with this file before accessing the parsed version of it again, or bad things will happen.
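The interplay between the parse cache, dirtied objects, and direct file access can be modelled with a toy class. Everything here is a simplified assumption for illustration: `SketchParseCache` is hypothetical, "parsing" is just splitting text into words, and the "filesystem" is an in-memory dictionary. It only shows why a parsed object obtained before a direct open should not be reused afterwards.

```python
class SketchParseCache:
    """Toy model of a container's parse cache (not calibre's code)."""

    def __init__(self, files):
        self.files = dict(files)   # name -> raw text, stands in for disk
        self.parsed_cache = {}     # name -> parsed object
        self.dirtied = set()       # names whose parsed object has changed

    def parsed(self, name):
        # Parse on first access, then serve the cached object
        if name not in self.parsed_cache:
            self.parsed_cache[name] = self.files[name].split()
        return self.parsed_cache[name]

    def dirty(self, name):
        # Record that the parsed object no longer matches the raw file
        self.dirtied.add(name)

    def commit_item(self, name, keep_parsed=False):
        # Serialize a dirtied parsed object back to the raw file
        if name in self.dirtied:
            self.files[name] = ' '.join(self.parsed_cache[name])
            self.dirtied.discard(name)
        if not keep_parsed:
            self.parsed_cache.pop(name, None)

    def open_raw(self, name):
        # Direct access: commit pending changes and evict from the cache
        self.commit_item(name)
        return self.files[name]
```

After `open_raw()` the previously returned parsed object is no longer tracked, so a later call to `parsed()` produces a fresh object.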
The parsed OPF file
Convenience method to either return the first XML element with the specified name or create it under the opf:package element and then return it, if it does not already exist.
The version set on the OPF’s <package> element
The version set on the OPF’s <package> element as a tuple of integers
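As a rough illustration of the tuple form, a version string can be split on dots and converted to integers. The helper name `sketch_parse_version` is hypothetical, and calibre's own parsing may pad the tuple or treat malformed values differently.

```python
def sketch_parse_version(version):
    """Convert a version string such as '3.0' into a tuple of integers.

    Hypothetical helper for illustration; it raises ValueError on
    non-numeric parts rather than guessing a fallback.
    """
    return tuple(int(part) for part in version.strip().split('.'))
```

A tuple compares element by element, which makes checks such as `sketch_parse_version(v) >= (3,)` straightforward.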
Convenience method to evaluate an XPath expression on the OPF file, has the opf: and dc: namespace prefixes pre-defined.
Return a parsed representation of the file specified by name. For
HTML and XML files an lxml tree is returned. For CSS files a cssutils
stylesheet is returned. Note that parsed objects are cached for
performance. If you make any changes to the parsed object, you must
call dirty() so that the container knows to update the cache. See also
raw_data(name, decode=True, normalize_to_nfc=True)
Return the raw data corresponding to the file specified by name
Convert an absolute path (with os separators) to a path relative to
base (defaults to self.root). The relative path is not a name. Use
abspath_to_name() for that.
Remove the specified items (by canonical name) from the spine. If
is True, the items are also deleted from the book, not just from the spine.
Removes item from parent, fixing indentation (works only with self closing items)
Remove the item identified by name from this container. This removes all references to the item in the OPF manifest, guide and spine as well as from any internal caches.
Renames a file from current_name to new_name. It automatically rebases all links inside the file if the directory the file is in changes. Note, however, that links are not updated in the other files that could reference this file. This is for performance; such updates should be done once, in bulk.
Replace the parsed object corresponding to name with obj, which must be a similar object, i.e. an lxml tree for HTML/XML or a cssutils stylesheet for a CSS file.
Replace all links in name using replace_func, which must be a
callable that accepts a URL and returns the replaced URL. It must also
have a ‘replaced’ attribute that is set to True if any actual
replacement is done. Convenient ways of creating such callables are
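A callable that satisfies the contract described here (accept a URL, return the replacement, and expose a `replaced` attribute) can be sketched as a small class. The class name `SketchLinkReplacer` and its `url_map` argument are hypothetical and are not one of calibre's own helpers.

```python
class SketchLinkReplacer:
    """Callable mapping old URLs to new ones, tracking whether any changed.

    Hypothetical example class: url_map is a dict of old URL -> new URL.
    """

    def __init__(self, url_map):
        self.url_map = url_map
        self.replaced = False  # becomes True once any URL is actually changed

    def __call__(self, url):
        new_url = self.url_map.get(url, url)
        if new_url != url:
            self.replaced = True
        return new_url
```

The `replaced` flag lets the caller skip marking a file as modified when none of its links were touched.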
Convert a parsed object (identified by canonical name) into a bytestring. See
Set the spine to be spine_items where spine_items is an iterable of the form (name, linear). Will raise an error if one of the names is not present in the manifest.
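The shape of spine_items and the manifest check can be illustrated with a short validation sketch. The function name `sketch_validate_spine` is hypothetical, and the choice of `ValueError` is an assumption: the text above only says that an error is raised.

```python
def sketch_validate_spine(spine_items, manifest_names):
    """Check an iterable of (name, linear) pairs against the manifest.

    Hypothetical helper. Returns the pairs as a list, or raises
    ValueError (an assumed error type) if a name is not in the manifest.
    """
    items = list(spine_items)
    missing = [name for name, linear in items if name not in manifest_names]
    if missing:
        raise ValueError('Not in manifest: ' + ', '.join(missing))
    return items
```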
An iterator that yields item, name, is_linear for every item in the
book’s spine. item is the lxml element, name is the canonical file name
and is_linear is True if the item is linear. See also:
Replace links to files in the container. Will iterate over all files in the container and change the specified links in them.
Rename files in the container, automatically updating all links to them.
Parameters: file_map – A mapping of old canonical name to new canonical name, for
Return the folders that are recommended for the given filenames. The recommendation is based on where the majority of files of the same type are located in the container. If no files of a particular type are present, the recommended folder is assumed to be the folder containing the OPF file.
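The majority rule described above can be sketched with a counter per file type. This is an illustration only: `sketch_recommended_folders` is a hypothetical name, the file extension is used as a stand-in for "type", and `opf_folder` is passed in explicitly rather than read from a container.

```python
import posixpath
from collections import Counter, defaultdict


def sketch_recommended_folders(existing_names, new_filenames, opf_folder=''):
    """Map each new filename to the folder holding most files of its type.

    Hypothetical helper. Assumption: the lowercased file extension
    identifies the type. Falls back to opf_folder when no file of that
    type exists yet.
    """
    folders_by_ext = defaultdict(Counter)
    for name in existing_names:
        ext = posixpath.splitext(name)[1].lower()
        folders_by_ext[ext][posixpath.dirname(name)] += 1

    recommendations = {}
    for filename in new_filenames:
        ext = posixpath.splitext(filename)[1].lower()
        counts = folders_by_ext.get(ext)
        if counts:
            recommendations[filename] = counts.most_common(1)[0][0]
        else:
            recommendations[filename] = opf_folder
    return recommendations
```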
Fix any parsing errors in the HTML represented as a string in raw. Fixing is done using the HTML5 parsing algorithm.
Fix any parsing errors in all HTML files in the container. Fixing is done using the HTML5 parsing algorithm.
pretty_html(container, name, raw)
Pretty print the HTML represented as a string in raw
pretty_css(container, name, raw)
Pretty print the CSS represented as a string in raw
pretty_xml(container, name, raw)
Pretty print the XML represented as a string in raw. If
name is the name of the OPF, extra OPF-specific prettying is performed.
Pretty print all HTML/CSS/XML files in the container
Remove an existing jacket, if any. Returns False if no existing jacket was found.
Either create a new jacket from the book’s metadata or replace an existing jacket. Returns True if an existing jacket was replaced.
split(container, name, loc_or_xpath, before=True, totals=None)
Split the file specified by name at the position specified by loc_or_xpath. Splitting automatically migrates all links and references to the affected files.
multisplit(container, name, xpath, before=True)
Split the specified file at multiple locations (all tags that match the specified XPath expression). See also:
Splitting automatically migrates all links and references to the affected
Parameters: before – If True the splits occur before the identified element otherwise after it.
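The effect of the before flag when splitting at several matching locations can be shown on a flat list standing in for a document's elements. This is a conceptual sketch only: `sketch_multisplit` and its `is_split_point` argument are hypothetical, and real splitting operates on HTML trees and also migrates links.

```python
def sketch_multisplit(items, is_split_point, before=True):
    """Split a flat list into parts at every item matching is_split_point.

    Hypothetical helper. With before=True each matching item starts a
    new part; with before=False each matching item ends the current part.
    """
    parts, current = [], []
    for item in items:
        matched = is_split_point(item)
        if matched and before and current:
            parts.append(current)
            current = []
        current.append(item)
        if matched and not before:
            parts.append(current)
            current = []
    if current:
        parts.append(current)
    return parts
```

Splitting `['h1', 'p', 'p', 'h1', 'p']` at every `'h1'` keeps each heading together with the paragraphs that follow it when before is True, while before=False leaves each heading at the end of its part.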
merge(container, category, names, master)
Merge the specified files into a single file, automatically migrating all links and references to the affected files. The files must all be either HTML or CSS files.
set_cover(container, cover_path, report=None, options=None)
Set the cover of the book to the image pointed to by cover_path.
Mark the specified image as the cover image.
mark_as_titlepage(container, name, move_to_start=True)
Mark the specified HTML file as the titlepage of the EPUB.
Parameters: move_to_start – If True the HTML file is moved to the start of the spine
change_font(container, old_name, new_name=None)
Change a font family from old_name to new_name. Changes all occurrences of the font family in stylesheets, style tags and style attributes. If the old_name refers to an embedded font, it is removed. You can set new_name to None to remove the font family instead of changing it.
remove_unused_css(container, report=None, remove_unused_classes=False, merge_rules=False)
Remove all unused CSS rules from the book. An unused CSS rule is one that does not match any actual content.
filter_css(container, properties, names=())
Remove the specified CSS properties from all CSS rules in the book.
Generate a Table of Contents from a list of XPath expressions. Each
expression in the list corresponds to a level of the generated ToC. For
example, ['//h:h1', '//h:h2', '//h:h3'] will generate a three-level
table of contents from the
Generate a Table of Contents from links in the book.
Generate a Table of Contents from files in the book.
Create an inline (HTML) Table of Contents from an existing NCX table of contents.
Parameters: title – The title for this table of contents.
The base class for individual tools in an Edit Book plugin. Useful members include:
Methods that must be overridden in sub classes:
Set this to a unique name; it will be used as a key.
If True the user can choose to place this tool in the plugins toolbar
If True the user can choose to place this tool in the plugins menu
The popup mode for the menu (if any) of the toolbar button. Possible values are ‘delayed’, ‘instant’, ‘button’
The main window of the user interface
Return the current
calibre.ebooks.oeb.polish.container.Container object that represents the book being edited.
register_shortcut(qaction, unique_name, default_keys=(), short_text=None, description=None, **extra_data)
Register a keyboard shortcut that will trigger the specified
qaction. This keyboard shortcut
will become automatically customizable by the user in the Keyboard section of the editor preferences.
Create a QAction that will be added to either the plugins toolbar or
the plugins menu depending on
for_toolbar. For example:
    def create_action(self, for_toolbar=True):
        ac = QAction(get_icons('myicon.png'), 'Do something')
        if for_toolbar:
            # We want the toolbar button to have a popup menu
            menu = QMenu()
            ac.setMenu(menu)
            menu.addAction('Do something else')
            subaction = menu.addAction('And another')

            # Register a keyboard shortcut for this toolbar action. Be
            # careful to do this for only one of the toolbar action or
            # the menu action, not both.
            self.register_shortcut(ac, 'some-unique-name', default_keys=('Ctrl+K',))
        return ac
The ebook editor’s user interface is controlled by a single global Boss object. This has many useful methods that can be used in plugin code to perform common tasks.
Create a restore checkpoint with the name specified as
Update all the components of the user interface to reflect the latest data in the current book container.
Parameters: mark_as_modified – If True, the book will be marked as modified, so the user will be prompted to save it when quitting.
Close the editor that is editing the file specified by
Commit any changes that the user has made to files open in editors to the container. You should call this method before performing any actions on the current container.
Return the name of the file being edited currently or None if no file is being edited
edit_file(name, syntax=None, use_template=None)
Open the file specified by name in an editor
open_book(path=None, edit_file=None, clear_notify_data=True, open_folder=False)
Open the ebook at
path for editing. Will show an error if the ebook is not in a supported format or the current book has unsaved changes.
Parameters: edit_file – The name of a file inside the newly opened book to start editing. Can also be a list of names.
Undo the previous creation of a restore checkpoint, useful if you create a checkpoint, then abort the operation with no changes
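The checkpoint-related methods described in this part of the page are typically combined in a fixed order: commit open editors, create a checkpoint, do the work, then either refresh the user interface or undo the checkpoint. The sketch below shows that order against a duck-typed `boss` object. The helper `run_with_checkpoint` is hypothetical, and the method names it calls (`commit_all_editors_to_container`, `add_savepoint`, `rewind_savepoint`, `apply_container_update_to_gui`) are given as recalled from calibre's Boss API rather than taken from the text above, so check them against your calibre version.

```python
def run_with_checkpoint(boss, checkpoint_name, operation):
    """Run operation() on the current book inside a restore checkpoint.

    Hypothetical helper. operation is a zero-argument callable that
    returns True if it changed the book. The boss method names are
    assumptions, see the note above.
    """
    # Make sure unsaved editor contents are present in the container
    boss.commit_all_editors_to_container()
    # Create a restore point so the user can undo everything we do
    boss.add_savepoint(checkpoint_name)
    try:
        changed = operation()
    except Exception:
        # The operation was aborted, so drop the checkpoint again
        boss.rewind_savepoint()
        raise
    if changed:
        # Refresh the user interface from the modified container
        boss.apply_container_update_to_gui()
    else:
        # Nothing changed, so the checkpoint is not needed
        boss.rewind_savepoint()
    return changed
```

In a plugin, `boss` would be the tool's reference to the global Boss object; the same function also works with a test double that records the calls.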
Save the book. Saving is performed in the background
Mark the book as having been modified
Show the changes to the book from its last checkpointed state
Show the editor that is editing the file specified by
Sync the position of the preview panel to the current cursor position in the current editor