Conversione E-book

Calibre ha un sistema di conversione progettato per essere molto semplice, generalmente basta aggiungere un e-book ed il sistema farà di tutto per generare un risultato più fedele possibile all’originale. Calibre accetta vari formati in ingresso ma alcuni sono più adatti di altri alla conversione, se hai questo tipo di file e desideri un maggior controllo sul risultato finale, Calibre dispone di molte opzioni per ottimizzare il processo. Ricorda che il sistema di conversione non può sostituire un editor di e-book dedicato. Per creare un e-book ti suggerisco di convertirlo prima in EPUB o AZW3 e quindi usare la funzione :guilabel:”Modifica Libro” per ottenere l’aspetto migliore, utilizza il file così generato per ottenere altri formati in Calibre.

Questo documento farà riferimento alle impostazioni che si trovano nella finestra di dialogo «Converti Libro» come nella foto qui sotto. Tutte le impostazioni sono disponibili anche da riga di comando come descritto su “ebook-convert”. In Calibre puoi ottenere aiuto tenendo il puntatore del mouse sopra ogni singola voce, si aprirà una finestra che descrive l’impostazione.

Finestra di conversione e-book

Introduzione

La prima cosa da comprendere è che la conversione è concepita come un sistema a cascata che schematicamente si presenta così:

Il processo di conversione

Il formato in ingresso viene inizialmente convertito in XHTML dall” Input plugin appropriato. Questo HTML è quindi trasformato. Nell’ultima fase l’XHTML elaborato viene convertito nel formato di uscita desiderato dall” Output plugin specifico. Alcuni formati in ingresso si convertono meglio di altri e questo può incidere notevolmente sui risultati, una lista dei migliori formati di input è disponibile quì.

Tutto il lavoro si svolge nella fase di modifica dell” XHTML in uscita. Ci sono varie operazioni da fare, ad esempio si possono inserire i metadata del libro come pagina iniziale, riconoscere l’intestazione dei capitoli e creare automaticamente una tavola dei contenuti «TOC», regolare la grandezza del font in modo proporzionale, ecc… E” importante ricordare che tutte le operazioni di modifica vengono effettuate dall”Input plugin sull” XHTML e non sul file d’origine. Per fare un esempio: se chiedi a Calibre di convertire un RTF in EPUB, questi convertirà prima l’RTF in XHTML internamente e quindi, dopo aver applicato le modifiche desiderate, verrà creato l’EPUB tramite l”Output plugin generando automaticamente i metadata, la TOC ecc..

You can see this process in action by using the debug option Icona di debug. Just specify the path to a folder for the debug output. During conversion, calibre will place the XHTML generated by the various stages of the conversion pipeline in different sub-folders. The four sub-folders are:

Fasi del processo di conversione

Cartella

Descrizione

input

Contiene l’HTML in uscita prodotto dal Plugin di Input. Usala per il debug del Plugin di Input.

parsed

Il risultato di pre-elaborazione e conversione in XHTML del risultato del Plugin di Input. Usalo per il debug del rilevamento struttura.

structure

Dopo il rilevamento della struttura ma prima dell’appiattimento del CSS e della conversione della dimensione carattere. Usala per il debug di dimensione carattere e trasformazioni CSS.

processed

Appena prima che l’e-book sia passato al Plugin di Output. Usala per il debug di quest’ultimo passo.

If you want to edit the input document a little before having calibre convert it, the best thing to do is edit the files in the input sub-folder, then zip it up, and use the ZIP file as the input format for subsequent conversions. To do this use the Edit meta information dialog to add the ZIP file as a format for the book and then, in the top left corner of the conversion dialog, select ZIP as the input format.

This document will deal mainly with the various transforms that operate on the intermediate XHTML and how to control them. At the end are some tips specific to each input/output format.

Aspetto

This group of options controls various aspects of the look and feel of the converted e-book.

Caratteri

One of the nicest features of the e-reading experience is the ability to easily adjust font sizes to suit individual needs and lighting conditions. calibre has sophisticated algorithms to ensure that all the books it outputs have a consistent font sizes, no matter what font sizes are specified in the input document.

The base font size of a document is the most common font size in that document, i.e., the size of the bulk of text in that document. When you specify a Base font size, calibre automatically rescales all font sizes in the document proportionately, so that the most common font size becomes the specified base font size and other font sizes are rescaled appropriately. By choosing a larger base font size, you can make the fonts in the document larger and vice versa. When you set the base font size, for best results, you should also set the font size key.

Normally, calibre will automatically choose a base font size appropriate to the output profile you have chosen (see Impostazioni di pagina). However, you can override this here in case the default is not suitable for you.

The Font size key option lets you control how non-base font sizes are rescaled. The font rescaling algorithm works using a font size key, which is simply a comma-separated list of font sizes. The font size key tells calibre how many «steps» bigger or smaller a given font size should be compared to the base font size. The idea is that there should be a limited number of font sizes in a document. For example, one size for the body text, a couple of sizes for different levels of headings and a couple of sizes for super/sub scripts and footnotes. The font size key allows calibre to compartmentalize the font sizes in the input documents into separate «bins» corresponding to the different logical font sizes.

Illustriamo con un esempio. Supponiamo che il documento di partenza che stiamo convertendo sia stato prodotto da una persona con una vista eccellente e che abbia una dimensione base dei caratteri di 8pt. Ciò significa che la maggior parte del testo del documento ha una dimensione di 8pt, mentre le intestazioni sono leggermente più grandi (ad esempio 10 e 12pt) e le note a piè di pagina sono leggermente più piccole (6pt). Ora, se utilizziamo le seguenti impostazioni:

Base font size : 12pt
Font size key  : 7, 8, 10, 12, 14, 16, 18, 20

Il documento di output avrà una dimensione di base di 12pt, titoli di 14 e 16pt e note a piè di pagina di 8pt. Supponiamo ora di voler far risaltare maggiormente il titolo più grande e di rendere un po” più grandi anche le note a piè di pagina. Per ottenere questo risultato, l’opzione del font deve essere cambiata in:

New font size key : 7, 9, 12, 14, 18, 20, 22

I titoli più grandi diventeranno ora di 18pt, mentre le note a piè di pagina diventeranno di 9pt. È possibile giocare con queste impostazioni per cercare di capire quale sia quella ottimale per voi utilizzando la procedura guidata di ridimensionamento dei caratteri, a cui si accede facendo clic sul piccolo pulsante accanto all’impostazione Tasto dimensione carattere.

È inoltre possibile disattivare tutti i ridimensionamenti delle dimensioni dei caratteri durante la conversione, se si desidera conservare le dimensioni dei caratteri nel documento di input.

A related setting is Line height. Line height controls the vertical height of lines. By default, (a line height of 0), no manipulation of line heights is performed. If you specify a non-default value, line heights will be set in all locations that don’t specify their own line heights. However, this is something of a blunt weapon and should be used sparingly. If you want to adjust the line heights for some section of the input, it’s better to use the Extra CSS.

In this section you can also tell calibre to embed any referenced fonts into the book. This will allow the fonts to work on reader devices even if they are not available on the device.

Testo

Text can be either justified or not. Justified text has extra spaces between words to give a smooth right margin. Some people prefer justified text, others do not. Normally, calibre will preserve the justification in the original document. If you want to override it you can use the Text justification option in this section.

You can also tell calibre to Smarten punctuation which will replace plain quotes, dashes and ellipses with their typographically correct alternatives. Note that this algorithm is not perfect so it is worth reviewing the results. The reverse, namely, Unsmarted punctuation is also available.

Finally, there is Input character encoding. Older documents sometimes don’t specify their character encoding. When converted, this can result in non-English characters or special characters like smart quotes being corrupted. calibre tries to auto-detect the character encoding of the source document, but it does not always succeed. You can force it to assume a particular character encoding by using this setting. cp1252 is a common encoding for documents produced using Windows software. You should also read Come posso convertire il mio file contenente caratteri non inglesi o virgolette intelligenti? for more on encoding issues.

Disposizione

Normally, paragraphs in XHTML are rendered with a blank line between them and no leading text indent. calibre has a couple of options to control this. Remove spacing between paragraphs forcefully ensure that all paragraphs have no inter paragraph spacing. It also sets the text indent to 1.5em (can be changed) to mark the start of every paragraph. Insert blank line does the opposite, guaranteeing that there is exactly one blank line between each pair of paragraphs. Both these options are very comprehensive, removing spacing, or inserting it for all paragraphs (technically <p> and <div> tags). This is so that you can just set the option and be sure that it performs as advertised, irrespective of how messy the input file is. The one exception is when the input file uses hard line breaks to implement inter-paragraph spacing.

If you want to remove the spacing between all paragraphs, except a select few, don’t use these options. Instead add the following CSS code to Extra CSS:

p, div { margin: 0pt; border: 0pt; text-indent: 1.5em }
.spacious { margin-bottom: 1em; text-indent: 0pt; }

Then, in your source document, mark the paragraphs that need spacing with class=»spacious». If your input document is not in HTML, use the Debug option, described in the Introduction to get HTML (use the input sub-folder).

Another useful options is Linearize tables. Some badly designed documents use tables to control the layout of text on the page. When converted these documents often have text that runs off the page and other artifacts. This option will extract the content from the tables and present it in a linear fashion. Note that this option linearizes all tables, so only use it if you are sure the input document does not use tables for legitimate purposes, like presenting tabular information.

Stile

The Extra CSS option allows you to specify arbitrary CSS that will be applied to all HTML files in the input. This CSS is applied with very high priority and so should override most CSS present in the input document itself. You can use this setting to fine tune the presentation/layout of your document. For example, if you want all paragraphs of class endnote to be right aligned, just add:

.endnote { text-align: right }

o se si vuole cambiare l’indentazione di tutti i paragrafi:

p { text-indent: 5mm; }

Extra CSS è un’opzione molto potente, ma è necessario conoscere il funzionamento dei CSS per poterla utilizzare al meglio. Si può usare l’opzione pipeline di debug descritta sopra per vedere quali CSS sono presenti nel documento di input.

A simpler option is to use Filter style information. This allows you to remove all CSS properties of the specified types from the document. For example, you can use it to remove all colors or fonts.

Trasforma stili

This is the most powerful styling related facility. You can use it to define rules that change styles based on various conditions. For example you can use it to change all green colors to blue, or remove all bold styling from the text or color all headings a certain color, etc.

Trasforma HTML

Similar to transform styles, but allows you to make changes to the HTML content of the book. You can replace one tag with another, add classes or other attributes to tags based on their content, etc.

Impostazioni di pagina

The Page setup options are for controlling screen layout, like margins and screen sizes. There are options to setup page margins, which will be used by the output plugin, if the selected output format supports page margins. In addition, you should choose an Input profile and an output profile. Both sets of profiles basically deal with how to interpret measurements in the input/output documents, screen sizes and default font rescaling keys.

If you know that the file you are converting was intended to be used on a particular device/software platform, choose the corresponding input profile, otherwise just choose the default input profile. If you know the files you are producing are meant for a particular device type, choose the corresponding output profile. Otherwise, choose one of the Generic output profiles. If you are converting to MOBI or AZW3 then you will almost always want to choose one of the Kindle output profiles. Otherwise, your best bet for modern E-book reading devices is to choose the Generic e-ink HD output profile.

The output profile also controls the screen size. This will cause, for example, images to be auto-resized to be fit to the screen in some output formats. So choose a profile of a device that has a screen size similar to your device.

Elaborazione euristica

Heuristic processing provides a variety of functions which can be used to try and detect and correct common problems in poorly formatted input documents. Use these functions if your input document suffers from poor formatting. Because these functions rely on common patterns, be aware that in some cases an option may lead to worse results, so use with care. As an example, several of these options will remove all non-breaking-space entities, or may include false positive matches relating to the function.

Abilita l’elaborazione euristica

This option activates calibre’s Heuristic processing stage of the conversion pipeline. This must be enabled in order for various sub-functions to be applied

Unwrap lines

Enabling this option will cause calibre to attempt to detect and correct hard line breaks that exist within a document using punctuation clues and line length. calibre will first attempt to detect whether hard line breaks exist, if they do not appear to exist calibre will not attempt to unwrap lines. The line-unwrap factor can be reduced if you want to “force” calibre to unwrap lines.

Line-unwrap factor

This option controls the algorithm calibre uses to remove hard line breaks. For example, if the value of this option is 0.4, that means calibre will remove hard line breaks from the end of lines whose lengths are less than the length of 40% of all lines in the document. If your document only has a few line breaks which need correction, then this value should be reduced to somewhere between 0.1 and 0.2.

Rileva e qualifica titoli e sottotitoli di capitolo non formattati

If your document does not have chapter headings and titles formatted differently from the rest of the text, calibre can use this option to attempt to detect them and surround them with heading tags. <h2> tags are used for chapter headings; <h3> tags are used for any titles that are detected.

This function will not create a TOC, but in many cases it will cause calibre’s default chapter detection settings to correctly detect chapters and build a TOC. Adjust the XPath under Structure detection if a TOC is not automatically created. If there are no other headings used in the document then setting «//h:h2» under Structure detection would be the easiest way to create a TOC for the document.

Le intestazioni inserite non sono formattate, per applicare la formattazione, utilizzare l’opzione Extra CSS nelle impostazioni di conversione Look and Feel. Per esempio, per centrare i tag di intestazione, usare il seguente:

h2, h3 { text-align: center }
Renumber sequences of <h1> or <h2> tags

Some publishers format chapter headings using multiple <h1> or <h2> tags sequentially. calibre’s default conversion settings will cause such titles to be split into two pieces. This option will re-number the heading tags to prevent splitting.

Elimina le righe vuote tra i paragrafi

This option will cause calibre to analyze blank lines included within the document. If every paragraph is interleaved with a blank line, then calibre will remove all those blank paragraphs. Sequences of multiple blank lines will be considered scene breaks and retained as a single paragraph. This option differs from the Remove paragraph spacing option under Look and Feel in that it actually modifies the HTML content, while the other option modifies the document styles. This option can also remove paragraphs which were inserted using calibre’s Insert blank line option.

Assicurati che i salti di scena siano formattati in modo coerente

With this option calibre will attempt to detect common scene-break markers and ensure that they are center aligned. “Soft” scene break markers, i.e. scene breaks only defined by extra white space, are styled to ensure that they will not be displayed in conjunction with page breaks.

Sostituisci i salti di scena

If this option is configured then calibre will replace scene break markers it finds with the replacement text specified by the user. Please note that some ornamental characters may not be supported across all reading devices.

In general you should avoid using HTML tags, calibre will discard any tags and use pre-defined markup. <hr /> tags, i.e. horizontal rules, and <img> tags are exceptions. Horizontal rules can optionally be specified with styles, if you choose to add your own style be sure to include the “width” setting, otherwise the style information will be discarded. Image tags can used, but calibre does not provide the ability to add the image during conversion, this must be done after the fact using the “Edit book” feature.

Example image tag (place the image within an “Images” folder inside the EPUB after conversion):

<img style=»width:10%» src=»../Images/scenebreak.png» />

Esempio di regola orizzontale con stili:

<hr style=»width:20%;padding-top: 1px;border-top: 2px ridge black;border-bottom: 2px groove black;»/>

Rimuovi trattini superflui

calibre will analyze all hyphenated content in the document when this option is enabled. The document itself is used as a dictionary for analysis. This allows calibre to accurately remove hyphens for any words in the document in any language, along with made-up and obscure scientific words. The primary drawback is words appearing only a single time in the document will not be changed. Analysis happens in two passes, the first pass analyzes line endings. Lines are only unwrapped if the word exists with or without a hyphen in the document. The second pass analyzes all hyphenated words throughout the document, hyphens are removed if the word exists elsewhere in the document without a match.

Applica il corsivo a parole e schemi comuni

When enabled, calibre will look for common words and patterns that denote italics and italicize them. Examples are common text conventions such as ~word~ or phrases that should generally be italicized, e.g. latin phrases like “etc.” or “et cetera”.

Sostituisci i rientri forzati con rientri CSS

Alcuni documenti seguono una convenzione che costruisce i rientri usando spazi unificatori (che non vanno a capo). Quando questa opzione è abilitata calibre proverà a rilevare questa formattazione e a convertirla in un rientro CSS del 3%.

Cerca e sostituisci

Queste opzioni sono utili soprattutto per la conversione di documenti PDF o per le conversioni OCR, ma possono anche essere utilizzate per risolvere molti problemi specifici del documento. Ad esempio, alcune conversioni possono lasciare intestazioni e piè di pagina nel testo. Queste opzioni utilizzano espressioni regolari per cercare di individuare intestazioni, piè di pagina o altro testo arbitrario e rimuoverli o sostituirli. Ricordate che operano sull’XHTML intermedio prodotto dalla pipeline di conversione. È disponibile una procedura guidata che consente di personalizzare le espressioni regolari per il documento. Fare clic sulla bacchetta magica accanto alla casella di espressione e fare clic sul pulsante “Prova” dopo aver composto l’espressione di ricerca. Le corrispondenze corrette saranno evidenziate in Giallo.

The search works by using a Python regular expression. All matched text is simply removed from the document or replaced using the replacement pattern. The replacement pattern is optional, if left blank then text matching the search pattern will be deleted from the document. You can learn more about regular expressions and their syntax at Tutto sull’utilizzo delle espressioni regolari in calibre.

Rilevazione struttura

Structure detection involves calibre trying its best to detect structural elements in the input document, when they are not properly specified. For example, chapters, page breaks, headers, footers, etc. As you can imagine, this process varies widely from book to book. Fortunately, calibre has very powerful options to control this. With power comes complexity, but if once you take the time to learn the complexity, you will find it well worth the effort.

Capitoli e interruzioni di pagina

calibre has two sets of options for chapter detection and inserting page breaks. This can sometimes be slightly confusing, as by default, calibre will insert page breaks before detected chapters as well as the locations detected by the page breaks option. The reason for this is that there are often location where page breaks should be inserted that are not chapter boundaries. Also, detected chapters can be optionally inserted into the auto generated Table of Contents.

calibre uses XPath, a powerful language to allow the user to specify chapter boundaries/page breaks. XPath can seem a little daunting to use at first, fortunately, there is a XPath tutorial in the User Manual. Remember that Structure detection operates on the intermediate XHTML produced by the conversion pipeline. Use the debug option described in the Introduzione to figure out the appropriate settings for your book. There is also a button for a XPath wizard to help with the generation of simple XPath expressions.

By default, calibre uses the following expression for detecting chapters:

//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\s+', 'i')) or @class = 'chapter']

This expression is rather complex, because it tries to handle a number of common cases simultaneously. What it means is that calibre will assume chapters start at either <h1> or <h2> tags that have any of the words (chapter, book, section or part) in them or that have the class=»chapter» attribute.

A related option is Chapter mark, which allows you to control what calibre does when it detects a chapter. By default, it will insert a page break before the chapter. You can have it insert a ruled line instead of, or in addition to the page break. You can also have it do nothing.

L’impostazione predefinita per il rilevamento delle interruzioni di pagina è:

//*[name()='h1' or name()='h2']

which means that calibre will insert page breaks before every <h1> and <h2> tag by default.

Nota

Le espressioni predefinite possono cambiare a seconda del formato di input che si sta convertendo.

Altro

In questa sezione sono presenti altre opzioni.

Insert metadata as page at start of book

One of the great things about calibre is that it allows you to maintain very complete metadata about all of your books, for example, a rating, tags, comments, etc. This option will create a single page with all this metadata and insert it into the converted e-book, typically just after the cover. Think of it as a way to create your own customised book jacket.

Rimuovi la prima immagine

Sometimes, the source document you are converting includes the cover as part of the book, instead of as a separate cover. If you also specify a cover in calibre, then the converted book will have two covers. This option will simply remove the first image from the source document, thereby ensuring that the converted book has only one cover, the one specified in calibre.

Indice

When the input document has a Table of Contents in its metadata, calibre will just use that. However, a number of older formats either do not support a metadata based Table of Contents, or individual documents do not have one. In these cases, the options in this section can help you automatically generate a Table of Contents in the converted e-book, based on the actual content in the input document.

Nota

Using these options can be a little challenging to get exactly right. If you prefer creating/editing the Table of Contents by hand, convert to the EPUB or AZW3 formats and select the checkbox at the bottom of the Table of Contents section of the conversion dialog that says Manually fine-tune the Table of Contents after conversion. This will launch the ToC Editor tool after the conversion. It allows you to create entries in the Table of Contents by simply clicking the place in the book where you want the entry to point. You can also use the ToC Editor by itself, without doing a conversion. Go to Preferences → Interface → Toolbars and add the ToC Editor to the main toolbar. Then just select the book you want to edit and click the ToC Editor button.

The first option is Force use of auto-generated Table of Contents. By checking this option you can have calibre override any Table of Contents found in the metadata of the input document with the auto generated one.

The default way that the creation of the auto generated Table of Contents works is that, calibre will first try to add any detected chapters to the generated table of contents. You can learn how to customize the detection of chapters in the Rilevazione struttura section above. If you do not want to include detected chapters in the generated table of contents, check the Do not add detected chapters option.

If less than the Chapter threshold number of chapters were detected, calibre will then add any hyperlinks it finds in the input document to the Table of Contents. This often works well: many input documents include a hyperlinked Table of Contents right at the start. The Number of links option can be used to control this behavior. If set to zero, no links are added. If set to a number greater than zero, at most that number of links is added.

calibre will automatically filter duplicates from the generated Table of Contents. However, if there are some additional undesirable entries, you can filter them using the TOC Filter option. This is a regular expression that will match the title of entries in the generated table of contents. Whenever a match is found, it will be removed. For example, to remove all entries titles «Next» or «Previous» use:

Next|Previous

Le opzioni Level 1,2,3 TOC consentono di creare un sofisticato indice multilivello. Sono espressioni XPath che corrispondono a tag nell’XHTML intermedio prodotto dalla pipeline di conversione. Si veda Introduzione per sapere come accedere a questo XHTML. Leggere anche Guida di XPath, per imparare a costruire le espressioni XPath. Accanto a ogni opzione c’è un pulsante che lancia una procedura guidata per aiutare a creare le espressioni XPath di base. Il semplice esempio che segue illustra l’uso di queste opzioni.

Supponiamo di avere un documento di input che risulta in XHTML come questo:

<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>Sample document</title>
    </head>
    <body>
        <h1>Chapter 1</h1>
        ...
        <h2>Section 1.1</h2>
        ...
        <h2>Section 1.2</h2>
        ...
        <h1>Chapter 2</h1>
        ...
        <h2>Section 2.1</h2>
        ...
    </body>
</html>

Quindi, impostiamo le opzioni come:

Level 1 TOC : //h:h1
Level 2 TOC : //h:h2

Si otterrà così un indice a due livelli, generato automaticamente, che avrà l’aspetto di:

Chapter 1
    Section 1.1
    Section 1.2
Chapter 2
    Section 2.1

Avvertimento

Not all output formats support a multi level Table of Contents. You should first try with EPUB output. If that works, then try your format of choice.

Usare immagini come titolo di capitolo nella conversione di documenti sorgente HTML

Suppose you want to use an image as your chapter title, but still want calibre to be able to automatically generate a Table of Contents for you from the chapter titles. Use the following HTML markup to achieve this:

<html>
    <body>
        <h2>Chapter 1</h2>
        <p>chapter 1 text...</p>
        <h2 title="Chapter 2"><img src="chapter2.jpg" /></h2>
        <p>chapter 2 text...</p>
    </body>
</html>

Set the Level 1 TOC setting to //h:h2. Then, for chapter two, calibre will take the title from the value of the title attribute on the <h2> tag, since the tag has no text.

Usare gli attributi dei tag per fornire il testo delle voci dell’Indice

If you have particularly long chapter titles and want shortened versions in the Table of Contents, you can use the title attribute to achieve this, for example:

<html>
    <body>
        <h2 title="Chapter 1">Chapter 1: Some very long title</h2>
        <p>chapter 1 text...</p>
        <h2 title="Chapter 2">Chapter 2: Some other very long title</h2>
        <p>chapter 2 text...</p>
    </body>
</html>

Set the Level 1 TOC setting to //h:h2/@title. Then calibre will take the title from the value of the title attribute on the <h2> tags, instead of using the text inside the tag. Note the trailing /@title on the XPath expression, you can use this form to tell calibre to get the text from any attribute you like.

Come vengono impostate e salvate le opzioni di conversione

There are two places where conversion options can be set in calibre. The first is in Preferences->Conversion. These settings are the defaults for the conversion options. Whenever you try to convert a new book, the settings set here will be used by default.

You can also change settings in the conversion dialog for each book conversion. When you convert a book, calibre remembers the settings you used for that book, so that if you convert it again, the saved settings for the individual book will take precedence over the defaults set in Preferences. You can restore the individual settings to defaults by using the Restore defaults button in the individual book conversion dialog. You can remove the saved settings for a group of books by selecting all the books and then clicking the Edit metadata button to bring up the bulk metadata edit dialog, near the bottom of the dialog is an option to remove stored conversion settings.

When you bulk convert a set of books, settings are taken in the following order (last one wins):

  • From the defaults set in Preferences->Conversion

  • From the saved conversion settings for each book being converted (if any). This can be turned off by the option in the top left corner of the Bulk conversion dialog.

  • From the settings set in the Bulk conversion dialog

Note that the final settings for each book in a Bulk conversion will be saved and re-used if the book is converted again. Since the highest priority in Bulk Conversion is given to the settings in the Bulk conversion dialog, these will override any book specific settings. So you should only bulk convert books together that need similar settings. The exceptions are metadata and input format specific settings. Since the Bulk conversion dialog does not have settings for these two categories, they will be taken from book specific settings (if any) or the defaults.

Nota

You can see the actual settings used during any conversion by clicking the rotating icon in the lower right corner and then double clicking the individual conversion job. This will bring up a conversion log that will contain the actual settings used, near the top.

Suggerimenti per formati specifici

Here you will find tips specific to the conversion of particular formats. Options specific to particular format, whether input or output are available in the conversion dialog under their own section, for example TXT input or EPUB output.

Conversione di documenti Microsoft Word

calibre can automatically convert .docx files created by Microsoft Word 2007 and newer. Just add the file to calibre and click convert.

Nota

There is a demo .docx file that demonstrates the capabilities of the calibre conversion engine. Just download it and convert it to EPUB or AZW3 to see what calibre can do.

calibre will automatically generate a Table of Contents based on headings if you mark your headings with the Heading 1, Heading 2, etc. styles in Microsoft Word. Open the output e-book in the calibre E-book viewer and click the Table of Contents button to view the generated Table of Contents.

Vecchi file .doc

For older .doc files, you can save the document as HTML with Microsoft Word and then convert the resulting HTML file with calibre. When saving as HTML, be sure to use the «Save as Web Page, Filtered» option as this will produce clean HTML that will convert well. Note that Word produces really messy HTML, converting it can take a long time, so be patient. If you have a newer version of Word available, you can directly save it as .docx as well.

Un’altra alternativa è quella di usare il programma gratuito LibreOffice. Apri il tuo file .doc in LibreOffice e salvalo come .docx, che può essere convertito direttamente in calibre.

Conversione di documenti TXT

TXT documents have no well defined way to specify formatting like bold, italics, etc, or document structure like paragraphs, headings, sections and so on, but there are a variety of conventions commonly used. By default calibre attempts automatic detection of the correct formatting and markup based on those conventions.

TXT input supports a number of options to differentiate how paragraphs are detected.

Paragraph style: Auto

Analyzes the text file and attempts to automatically determine how paragraphs are defined. This option will generally work fine, if you achieve undesirable results try one of the manual options.

Paragraph style: Block

Assumes one or more blank lines are a paragraph boundary:

This is the first.

This is the
second paragraph.
Paragraph style: Single

Assume che ogni riga sia un paragrafo:

This is the first.
This is the second.
This is the third.
Paragraph style: Print

Assumes that every paragraph starts with an indent (either a tab or 2+ spaces). Paragraphs end when the next line that starts with an indent is reached:

  This is the
first.
  This is the second.

  This is the
third.
Paragraph style: Unformatted

Assumes that the document has no formatting, but does use hard line breaks. Punctuation and median line length are used to attempt to re-create paragraphs.

Formatting style: Auto

Attempts to detect the type of formatting markup being used. If no markup is used then heuristic formatting will be applied.

Formatting style: Heuristic

Analyzes the document for common chapter headings, scene breaks, and italicized words and applies the appropriate HTML markup during conversion.

Formatting style: Markdown

calibre also supports running TXT input though a transformation preprocessor known as Markdown. Markdown allows for basic formatting to be added to TXT documents, such as bold, italics, section headings, tables, lists, a Table of Contents, etc. Marking chapter headings with a leading # and setting the chapter XPath detection expression to «//h:h1» is the easiest way to have a proper table of contents generated from a TXT document. You can learn more about the Markdown syntax at daringfireball.

Formatting style: None

Applies no special formatting to the text, the document is converted to HTML with no other changes.

Conversione di documenti PDF

PDF documents are one of the worst formats to convert from. They are a fixed page size and text placement format. Meaning, it is very difficult to determine where one paragraph ends and another begins. calibre will try to unwrap paragraphs using a configurable, Line un-wrapping factor. This is a scale used to determine the length at which a line should be unwrapped. Valid values are a decimal between 0 and 1. The default is 0.45, just under the median line length. Lower this value to include more text in the unwrapping. Increase to include less. You can adjust this value in the conversion settings under PDF Input.

Also, they often have headers and footers as part of the document that will become included with the text. Use the Search and replace panel to remove headers and footers to mitigate this issue. If the headers and footers are not removed from the text it can throw off the paragraph unwrapping. To learn how to use the header and footer removal options, read Tutto sull’utilizzo delle espressioni regolari in calibre.

Some limitations of PDF input are:

  • Complex, multi-column, and image based documents are not supported.

  • Neanche l’estrazione di immagini vettoriali e tabelle da dentro il documento è supportata.

  • Some PDFs use special glyphs to represent ll or ff or fi, etc. Conversion of these may or may not work depending on just how they are represented internally in the PDF.

  • Links and Tables of Contents are not supported

  • PDFs that use embedded non-Unicode fonts to represent non-English characters will result in garbled output for those characters

  • Some PDFs are made up of photographs of the page with OCRed text behind them. In such cases calibre uses the OCRed text, which can be very different from what you see when you view the PDF file

  • PDFs that are used to display complex text, like right to left languages and math typesetting will not convert correctly

To re-iterate PDF is a really, really bad format to use as input. If you absolutely must use PDF, then be prepared for an output ranging anywhere from decent to unusable, depending on the input PDF.

Collezioni di fumetti

A comic book collection is a .cbc file. A .cbc file is a ZIP file that contains other CBZ/CBR files. In addition the .cbc file must contain a simple text file called comics.txt, encoded in UTF-8. The comics.txt file must contain a list of the comics files inside the .cbc file, in the form filename:title, as shown below:

one.cbz:Chapter One
two.cbz:Chapter Two
three.cbz:Chapter Three

Il file .cbc conterrà:

comics.txt
one.cbz
two.cbz
three.cbz

calibre will automatically convert this .cbc file into a e-book with a Table of Contents pointing to each entry in comics.txt.

Esempio di formattazione avanzata EPUB

Various advanced formatting for EPUB files is demonstrated in this demo file. The file was created from hand coded HTML using calibre and is meant to be used as a template for your own EPUB creation efforts.

The source HTML it was created from is available demo.zip. The settings used to create the EPUB from the ZIP file are:

ebook-convert demo.zip .epub -vv --authors "Kovid Goyal" --language en --level1-toc '//*[@class="title"]' --disable-font-rescaling --page-breaks-before / --no-default-epub-cover

Note that because this file explores the potential of EPUB, most of the advanced formatting is not going to work on readers less capable than calibre’s built-in EPUB viewer.

Conversione di documenti ODT

calibre can directly convert ODT (OpenDocument Text) files. You should use styles to format your document and minimize the use of direct formatting. When inserting images into your document you need to anchor them to the paragraph, images anchored to a page will all end up in the front of the conversion.

To enable automatic detection of chapters, you need to mark them with the built-in styles called Heading 1, Heading 2, …, Heading 6 (Heading 1 equates to the HTML tag <h1>, Heading 2 to <h2>, etc). When you convert in calibre you can enter which style you used into the Detect chapters at box. Example:

  • If you mark Chapters with style Heading 2, you have to set the “Detect chapters at” box to //h:h2

  • For a nested TOC with Sections marked with Heading 2 and the Chapters marked with Heading 3 you need to enter //h:h2|//h:h3. On the Convert - TOC page set the Level 1 TOC box to //h:h2 and the Level 2 TOC box to //h:h3.

Well-known document properties (Title, Keywords, Description, Creator) are recognized and calibre will use the first image (not to small, and with good aspect-ratio) as the cover image.

There is also an advanced property conversion mode, which is activated by setting the custom property opf.metadata (“Yes or No” type) to Yes in your ODT document (File->Properties->Custom Properties). If this property is detected by calibre, the following custom properties are recognized (opf.authors overrides document creator):

opf.titlesort
opf.authors
opf.authorsort
opf.publisher
opf.pubdate
opf.isbn
opf.language
opf.series
opf.seriesindex

In addition to this, you can specify the picture to use as the cover by naming it opf.cover (right click, Picture->Options->Name) in the ODT. If no picture with this name is found, the “smart” method is used. As the cover detection might result in double covers in certain output formats, the process will remove the paragraph (only if the only content is the cover!) from the document. But this works only with the named picture!

To disable cover detection you can set the custom property opf.nocover (“Yes or No” type) to Yes in advanced mode.

Conversione in PDF

The first, most important, setting to decide on when converting to PDF is the page size. By default, calibre uses a page size of «U.S. Letter». You can change this to another standard page size or a completely custom size in the PDF Output section of the conversion dialog. If you are generating a PDF to be used on a specific device, you can turn on the option to use the page size from the output profile instead. So if your output profile is set to Kindle, calibre will create a PDF with page size suitable for viewing on the small Kindle screen.

Intestazioni e piè di pagina

You can insert arbitrary headers and footers on each page of the PDF by specifying header and footer templates. Templates are just snippets of HTML code that get rendered in the header and footer locations. For example, to display page numbers centered at the bottom of every page, in green, use the following footer template:

<footer><div style="margin: auto; color: green">_PAGENUM_</div></footer>

calibre will automatically replace _PAGENUM_ with the current page number. You can even put different content on even and odd pages, for example the following header template will show the title on odd pages and the author on even pages:

<header style="justify-content: flex-end">
    <div class="even-page">_AUTHOR_</div>
    <div class="odd-page"><i>_TITLE_</i></div>
</header>

calibre will automatically replace _TITLE_ and _AUTHOR_ with the title and author of the document being converted. Setting justify-content to flex-end will cause the text to be right aligned.

You can also display text at the left and right edges and change the font size, as demonstrated with this header template:

<header style="justify-content: space-between; font-size: smaller">
    <div>_TITLE_</div>
    <div>_AUTHOR_</div>
</header>

This will display the title at the left and the author at the right, in a font size smaller than the main text.

You can also use the current section in templates, as shown below:

<header><div>_SECTION_</div></header>

_SECTION_ is replaced by whatever the name of the current section is. These names are taken from the metadata Table of Contents in the document (the PDF Outline). If the document has no table of contents then it will be replaced by empty text. If a single PDF page has multiple sections, the first section on the page will be used. Similarly, there is a variable named _TOP_LEVEL_SECTION_ that can be used to get the name of the current top-level section.

You can even use JavaScript inside the header and footer templates, for example, the following template will cause page numbers to start at 4 instead of 1:

<footer>
    <div></div>
    <script>document.currentScript.parentNode.querySelector("div").innerHTML = "" + (_PAGENUM_ + 3)</script>
</footer>

In addition there are some more variables you can use in the headers and footers, documented below:

  • _TOTAL_PAGES_ - total number of pages in the PDF file, useful for implementing a progress counter, for example.

  • _TOP_LEVEL_SECTION_PAGES_ - total number of pages in the current top level section

  • _TOP_LEVEL_SECTION_PAGENUM_ - the page number of the current page within the current top level section

Nota

When adding headers and footers make sure you set the page top and bottom margins to large enough values, under the PDF Output section of the conversion dialog.

Printable Table of Contents

You can also insert a printable Table of Contents at the end of the PDF that lists the page numbers for every section. This is very useful if you intend to print out the PDF to paper. If you wish to use the PDF on an electronic device, then the PDF Outline provides this functionality and is generated by default.

You can customize the look of the generated Table of contents by using the Extra CSS conversion setting under the Look & feel part of the conversion dialog. The default CSS used is listed below, simply copy it and make whatever changes you like.

.calibre-pdf-toc table { width: 100%% }

.calibre-pdf-toc table tr td:last-of-type { text-align: right }

.calibre-pdf-toc .level-0 {
    font-size: larger;
}

.calibre-pdf-toc .level-1 td:first-of-type { padding-left: 1.4em }
.calibre-pdf-toc .level-2 td:first-of-type { padding-left: 2.8em }

Custom page margins for individual HTML files

If you are converting an EPUB or AZW3 file with multiple individual HTML files inside it and you want to change the page margins for a particular HTML file you can add the following style block to the HTML file using the calibre E-book editor:

<style>
@page {
    margin-left: 10pt;
    margin-right: 10pt;
    margin-top: 10pt;
    margin-bottom: 10pt;
}
</style>

Then, in the PDF output section of the conversion dialog, turn on the option to Use page margins from the document being converted. Now all pages generated from this HTML file will have 10pt margins.