Preprocessors
Remove header
Ignores repeating elements at the tops of pages.
Sensible recognizes headers in one of two ways:
- (Default) Sensible searches for repeated text at the top of the page. For more information about automatic recognition, see Notes.
- (Configurable) To bypass automatic recognition, for example to recognize header text that varies slightly, configure a text match. Sensible removes all text above the top boundary of the matched text. The preprocessor removes text on pages in which it finds the match, and ignores pages missing the match.
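As a rough illustration of the configurable approach, the following sketch places a text match on the preprocessor. The header text `Acme Insurance statement` is hypothetical, and the `startsWith` Match object shape and the top-level `preprocessors` array are assumptions about the config layout rather than something this page states. The comments assume the config editor accepts JSON5-style comments; strip them otherwise.

```json
{
  "preprocessors": [
    {
      "type": "removeHeader",
      /* hypothetical header text; replace with text from your own document */
      "match": {
        "type": "startsWith",
        "text": "Acme Insurance statement"
      }
    }
  ]
}
```

With a config along these lines, pages that contain the matched line lose all text above its top boundary, and pages without the match are left untouched.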
Parameters
key | value | description |
---|---|---|
type (required) | removeHeader | For an example, see the Examples section. |
startsOnPage | integer. default: 1 | The first page number on which to start checking for repeated elements. Note this is the page number, not the page’s zero-based index in the pages array. To filter out end pages that lack a repeating element, use the Page Range preprocessor to define an End Page parameter. |
match | Match object or array of Match objects | Bypasses automatic header recognition. Removes all text on the page above the top boundary of the matched line. If Sensible doesn’t find the match, it doesn’t perform header removal. |
offsetY | number in inches. default: 0 | Bypasses automatic header recognition. Defines a point at which to start text removal. Positive values offset down the page, negative values offset up the page. If used with no Match parameter defined, offsets from the top of the page. If used with the Match parameter, offsets from the top boundary of the matched line. |
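The following sketch shows one way the `offsetY` parameter might be used on its own, for a header of known, fixed height. The value `1.5` is a hypothetical measurement, and the top-level `preprocessors` array is an assumption about the config layout. As the table reads, with no Match parameter the removal point sits 1.5 inches below the top of each page. The comment assumes JSON5-style comments are accepted.

```json
{
  "preprocessors": [
    {
      "type": "removeHeader",
      /* hypothetical value: removal point 1.5 inches below the top of the page */
      "offsetY": 1.5
    }
  ]
}
```

Adding a Match parameter to the same object would instead measure the offset from the top boundary of the matched line, so a negative value would move the removal point up the page from that line.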
Examples
The following example shows:
- A repeating header with an incrementing page number. Sensible removes this.
- A repeating sidebar that overlaps the y-extent of both repeating and variable elements:
  - Where it overlaps a repeating element, Sensible treats it as repeating and removes it.
  - Where it overlaps variable text, Sensible treats it as nonrepeating and retains it.
Config
JSON
Example document
The following images show the example document used with this example config:
Output
JSON
Notes
Automatic header recognition
To recognize a header, this preprocessor starts at the top of the page and moves down the page, stopping as soon as it finds a nonrepeating element.
Sensible recognizes these elements as “repeating”:
- Elements whose y-extent doesn’t overlap with any variable element
- Positively incrementing page numbers
These elements aren’t recognized as “repeating”:
- Elements that change their alignment on alternate pages (for example, page numbers aligned alternately left and right, as in a book)
- Repeating elements that are missing from even one page (for example, from an intentionally blank page)