ITextractBlock

Interface in AWS.Textract

Interface for Block.

Properties

BlockType

property BlockType: string

Specifies the type of text item recognized.

Valid values for text detection operations:

- PAGE: Contains a list of the LINE Block objects that are detected on a document page.
- WORD: A word detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
- LINE: A string of tab-delimited, contiguous words that are detected on a document page.

Valid values for text analysis operations:

- PAGE: Contains a list of child Block objects that are detected on a document page.
- KEY_VALUE_SET: Stores the KEY and VALUE Block objects for linked text that's detected on a document page. Use the EntityTypes field to determine if a KEY_VALUE_SET object is a KEY or VALUE Block object.
- WORD: A word detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
- LINE: A string of tab-delimited, contiguous words that are detected on a document page.
- TABLE: A table detected on a document page. A table is grid-based information with two or more rows or columns, with a cell span of one row and one column each.
- CELL: A cell within a detected table. The cell is the parent of the block that contains the text in the cell.
- SELECTION_ELEMENT: A selection element such as an option button (radio button) or a check box that's detected on a document page. Use the value of SelectionStatus to determine the status of the selection element.
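Because BlockType is a plain string, callers usually branch on it before reading other properties. The following is a minimal, language-neutral sketch in Python rather than Delphi. It assumes blocks are represented as dictionaries keyed by the property names on this page; the sample data and the helper names `count_by_type` and `blocks_of_type` are hypothetical.

```python
from collections import Counter

# Hypothetical sample blocks; keys mirror the property names documented here.
blocks = [
    {"Id": "p1", "BlockType": "PAGE"},
    {"Id": "l1", "BlockType": "LINE", "Text": "Hello world"},
    {"Id": "w1", "BlockType": "WORD", "Text": "Hello"},
    {"Id": "w2", "BlockType": "WORD", "Text": "world"},
]


def count_by_type(blocks):
    """Return a Counter mapping each BlockType value to the number of blocks."""
    return Counter(block["BlockType"] for block in blocks)


def blocks_of_type(blocks, block_type):
    """Return only the blocks whose BlockType equals block_type."""
    return [block for block in blocks if block["BlockType"] == block_type]


type_counts = count_by_type(blocks)
line_texts = [block["Text"] for block in blocks_of_type(blocks, "LINE")]
```

Filtering by type first keeps later code from reading properties, such as RowIndex, that only apply to some block types.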

ColumnIndex

property ColumnIndex: TOptional<Integer>

Specifies the column in which a table cell appears.

The first column position is 1.

ColumnSpan

property ColumnSpan: TOptional<Integer>

Specifies the number of columns the cell spans.

Currently, this value is always 1, even if the cell spans more than one column.

Confidence

property Confidence: TOptional<Single>

Specifies the confidence score that Amazon Textract has in the accuracy of the recognized text and the accuracy of the geometry points around the recognized text.
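Since Confidence is optional, code that filters on it has to handle blocks where it is absent. The sketch below is illustrative Python over dictionary-shaped blocks, not the Delphi interface. It assumes the score is reported as a percentage from 0 to 100; the threshold, the sample values, and the helper name `confident_blocks` are hypothetical.

```python
# Hypothetical sample blocks; the PAGE block deliberately has no Confidence.
blocks = [
    {"Id": "p1", "BlockType": "PAGE"},
    {"Id": "w1", "BlockType": "WORD", "Text": "Invoice", "Confidence": 99.2},
    {"Id": "w2", "BlockType": "WORD", "Text": "lnvoice", "Confidence": 41.5},
]


def confident_blocks(blocks, min_confidence):
    """Keep blocks that report a Confidence of at least min_confidence.

    Blocks without a Confidence value are skipped rather than treated as 0.
    """
    kept = []
    for block in blocks:
        score = block.get("Confidence")
        if score is not None and score >= min_confidence:
            kept.append(block)
    return kept


kept_ids = [block["Id"] for block in confident_blocks(blocks, 90.0)]
```

Whether to drop or keep blocks with no score is a design choice; this sketch drops them so that the result contains only blocks with an explicit score.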

EntityTypes

property EntityTypes: TList<string>

Specifies the type of the entity.

Valid values:

- KEY: An identifier for a field on the document.
- VALUE: The field's text.
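EntityTypes is a list, so checking for KEY or VALUE is a membership test rather than a string comparison. The sketch below shows one way to split KEY_VALUE_SET blocks into keys and values. It is illustrative Python over dictionary-shaped blocks; the sample data and the helper name `split_key_value_blocks` are hypothetical.

```python
# Hypothetical sample blocks; keys mirror the property names documented here.
blocks = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Name:"},
]


def split_key_value_blocks(blocks):
    """Return (keys, values): the KEY_VALUE_SET blocks split by EntityTypes."""
    keys, values = [], []
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue  # EntityTypes is only meaningful for KEY_VALUE_SET blocks
        entity_types = block.get("EntityTypes", [])
        if "KEY" in entity_types:
            keys.append(block)
        elif "VALUE" in entity_types:
            values.append(block)
    return keys, values


key_blocks, value_blocks = split_key_value_blocks(blocks)
```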

Geometry

property Geometry: ITextractGeometry

Type: ITextractGeometry

Specifies the location of the recognized text on the image.

Id

property Id: string

Specifies an identifier for the recognized text.

The identifier is only unique for a single operation.

Page

property Page: TOptional<Integer>

Specifies the page on which the block was detected.

Page is returned by asynchronous operations. Page values greater than 1 are only returned for multipage documents that are in PDF or TIFF format.

Query

property Query: ITextractQuery

Type: ITextractQuery

Specifies the query associated with this block.

Relationships

property Relationships: TList<ITextractRelationship>

Type: ITextractRelationship

Specifies a list of child blocks of the current block.

For example, a LINE object has child blocks for each WORD block that's part of the line of text.
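The LINE-to-WORD example can be sketched as a lookup by Id. The sketch assumes each relationship carries a `Type` (such as `CHILD`) and a list of `Ids`, as in the Textract service's JSON output; the members of ITextractRelationship are not shown on this page, so treat those names as assumptions. It is illustrative Python over dictionaries, and the sample data and helper names are hypothetical.

```python
# Hypothetical sample blocks: one LINE whose children are two WORD blocks.
blocks = [
    {
        "Id": "l1",
        "BlockType": "LINE",
        "Text": "Hello world",
        "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2"]}],
    },
    {"Id": "w1", "BlockType": "WORD", "Text": "Hello"},
    {"Id": "w2", "BlockType": "WORD", "Text": "world"},
]

# Index blocks by Id once so that each child lookup is a dictionary access.
blocks_by_id = {block["Id"]: block for block in blocks}


def child_ids(block):
    """Collect the Ids listed in the block's CHILD relationships, in order."""
    ids = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] == "CHILD":
            ids.extend(relationship["Ids"])
    return ids


def child_words(block, blocks_by_id):
    """Return the Text of each WORD block that is a child of block."""
    children = [blocks_by_id[block_id] for block_id in child_ids(block)]
    return [child["Text"] for child in children if child["BlockType"] == "WORD"]


words = child_words(blocks_by_id["l1"], blocks_by_id)
```

Because Id is only unique within a single operation, an index like `blocks_by_id` should be rebuilt for each response rather than shared across calls.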

RowIndex

property RowIndex: TOptional<Integer>

Specifies the row in which a table cell is located.

The first row position is 1.
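Because RowIndex and ColumnIndex both start at 1, placing cells into a zero-based grid means subtracting 1 from each. The sketch below rebuilds a small table from CELL blocks. It is illustrative Python over dictionary-shaped blocks and assumes each cell lists its WORD children in a `CHILD` relationship with `Ids`, as in the service's JSON output; the sample data and helper names are hypothetical.

```python
# Hypothetical 2x2 table: each CELL points at one WORD child.
blocks = [
    {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "c2", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"Id": "c3", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "c4", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["w4"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Name"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Qty"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Apple"},
    {"Id": "w4", "BlockType": "WORD", "Text": "3"},
]

blocks_by_id = {block["Id"]: block for block in blocks}


def cell_text(cell, blocks_by_id):
    """Join the Text of the WORD children of a CELL block with spaces."""
    words = []
    for relationship in cell.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for block_id in relationship["Ids"]:
            child = blocks_by_id[block_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def build_grid(cells, blocks_by_id):
    """Return a list of rows; positions with no cell stay as empty strings."""
    row_count = max(cell["RowIndex"] for cell in cells)
    column_count = max(cell["ColumnIndex"] for cell in cells)
    grid = [["" for _ in range(column_count)] for _ in range(row_count)]
    for cell in cells:
        # Indexes are 1-based in the block, 0-based in the Python list.
        grid[cell["RowIndex"] - 1][cell["ColumnIndex"] - 1] = cell_text(cell, blocks_by_id)
    return grid


cells = [block for block in blocks if block["BlockType"] == "CELL"]
grid = build_grid(cells, blocks_by_id)
```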

RowSpan

property RowSpan: TOptional<Integer>

Specifies the number of rows that the table cell spans.

Currently, this value is always 1.

SelectionStatus

property SelectionStatus: string

Specifies the selection status of a selection element, such as an option button or check box.

Valid values: SELECTED | NOT_SELECTED.

Text

property Text: string

Specifies the word or line of text that's recognized by Amazon Textract.

TextType

property TextType: string

Specifies the kind of text that Amazon Textract has detected.

Valid values: HANDWRITING | PRINTED.