This API provides services for searching the Haufe Content for documents and retrieving them.
Note: This API only supports the OAuth2 Client Credentials flow. This means that you log in using only your application's client ID and client secret, without actually authenticating a user. The /token endpoint of any of the authentication methods listed at the top of this page (e.g. "Username and Password (local)", i.e. logging in with a local username and password) can be used for that.
The first thing you need is a contact person at Haufe-Lexware (e.g. _HL_Panama@haufe-lexware.com) who knows your use case for ContentHub and can decide which scope is the right one for you (see below).
Once this is clear, you need to register an application.
After that, the admin at Haufe-Lexware will choose a scope for your use case and inform you about it. You can also view the assigned scopes at any time by opening an application under the "Applications" tab and pressing the "Select" button to choose the subscription whose scopes you want to see.
Now you are ready to use the API. There are two important URLs: the API URL and the token endpoint URL. You can find both at the top of this page; the token endpoint URL is shown when you unfold the section with the greenish background (entitled "Username and Password (local)").
Now you need to use this command to request a token:
curl --location --request POST '<token-endpoint-url>' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'grant_type=client_credentials' \
--data-urlencode 'client_id=...' \
--data-urlencode 'client_secret=...' \
--data-urlencode 'scope=...'
The response to this request consists of a small JSON object that contains the token in the access_token
property.
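As a sketch of what the curl command does, the following Python helpers build the same form body and pull the token out of the response JSON. The helper names and the credential/scope values are illustrative placeholders, and no HTTP call is made here:

```python
import urllib.parse

def build_token_request(client_id, client_secret, scope):
    """Build the form body for the client-credentials token request."""
    return urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })

def extract_token(response_json):
    """Pull the bearer token out of the token endpoint's JSON response."""
    return response_json["access_token"]

# placeholder credentials -- POST `body` to your token endpoint in practice
body = build_token_request("my-app", "s3cret", "some-scope")
token = extract_token({"access_token": "abc123", "token_type": "Bearer"})
```

In a real client you would POST `body` to the token endpoint URL with Content-Type `application/x-www-form-urlencoded` and pass the extracted token in the `Authorization: Bearer` header.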
To make a simple search request you need to pass the token as part of the Authorization
header like this:
curl --location --request GET '<api-url>/search?q=dienstreise' \
--header 'Authorization: Bearer <access-token>'
The `q` parameter contains the query string; you can find all the details about it in the next sections.
The response contains a list of search results, each entry having an ID in the form of a URI starting with
contenthub://
. You can retrieve the so-called baseline content, which is an XHTML document, using a
request like this:
curl --location --request GET '<api-url>/retrieval/baseline?contentHubId=...' \
--header 'Authorization: Bearer <access-token>'
Congratulations! Now you have succeeded with the first steps of using the ContentHub!
If you want to know more about the details of searching and retrieving content, just read on...
Parameter | Mandatory | Description |
---|---|---|
q | yes | query – Search expression (syntax described further below). The request may contain an arbitrary number of q parameters |
field | no | defines weighted field references for use in (full-text) search expressions; see Local Field References |
offset | no | Index of the first result to return; allows paging through result lists |
limit | no | Maximum number of results to return; allows paging through result lists |
preview | no | indicates if the preview element of the original documents appears in the result entries |
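The parameters above can be combined into a search URL. The following sketch (hypothetical helper, Python standard library only) shows how repeated `q` parameters and the paging parameters are encoded:

```python
from urllib.parse import urlencode

def build_search_url(api_url, queries, offset=None, limit=None):
    """Assemble a search URL; q may repeat, so params is a list of tuples."""
    params = [("q", q) for q in queries]
    if offset is not None:
        params.append(("offset", str(offset)))
    if limit is not None:
        params.append(("limit", str(limit)))
    return f"{api_url}/search?{urlencode(params)}"

url = build_search_url("https://api.contenthub.haufe.io/content/v1",
                       ["dienstreise", "documentType:News"], offset=0, limit=10)
```

Note that `urlencode` percent-encodes characters such as `:` in scope expressions, which is what you want when issuing the request over HTTP.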
At the core of any search is a query which indicates the criteria matching documents should satisfy. A query can be as simple as a single word (`income`) or a (potentially complex) composition of several search expressions (`(("income" OR "revenue") -(tags:"revoked") title:"*statement*")`).
Search expressions are applied to the fields specified in the search request. If no fields have been specified the default is the document's title and the document's baselineContent and baselineSearchableText.
The list of supported search expressions is as follows:

Word: written as a word without any quotation marks, for instance `income`. Matching is case-insensitive (`Income` or `INCOME` will be found as well) and stemmed (`incomes` will be found as well).

Phrase: one or many words enclosed in double quotation marks, for instance `"income tax"`. Like words, phrases are matched case-insensitively; however, `"income tax"` will only match documents containing both words `income` and `tax` in that sequence.

Not (Negation): denoted by a minus sign `-` in front of a sub-expression. For instance `-tax` will match documents not containing the word `tax`.
And (Conjunction): denoted by whitespace between adjacent search expressions. For instance `tax income` will only match documents containing both words `tax` and `income`. Contrary to the similar Phrase expression, neither the order of the two words nor the distance between them in the matching document is relevant.
Or (Disjunction): denoted by the keyword OR between adjacent search expressions. For instance `tax OR income` will match documents containing either the word `tax` or the word `income`, or both.
Group: denoted by a pair of parentheses around a sequence of sub-expressions. Expression grouping will mostly happen implicitly according to the Expression Precedence Rules stated below. However, sometimes explicit grouping is required to get the correct result, namely when implicit grouping and precedence would produce a different query than intended.

Compare: `tax OR income statement` will match documents containing either both words `tax` and `statement` or both words `income` and `statement`. To match documents containing the word `tax` (whether or not containing the word `statement`) or both words `income` and `statement`, the query would have to be written as `tax OR (income statement)`.
Scope: denoted by a scope name and a colon `:` in front of a sub-expression. Examples:

- `documentType:News` will match documents having a metadata field documentType with value News,
- `documentType:(News OR Article)` will match documents having a metadata field documentType with a value of either News or Article,
- `documentType:News application:portals` will match documents having a metadata field documentType with the value News and a metadata field application with the value portals.

However, scope expressions must not be nested; e.g. `title:(author:someone)` will be flagged as invalid.

The list of valid scope names is given in Local Field References and covers metadata fields (and thus search facets) as well as local field references prefixed by `field` and separated by a single period. Additional scopes may be added in the future.
As a general rule, fields which belong to the ContentHub namespace
(http://contenthub.haufe-lexware.com/haufe-document
) are considered
"well known" and have no special prefix. Fields which have been introduced by
a content-producer (and are only used by that content-producer) are prefixed
with the corresponding application id, separated by period (e.g.,
portals.category
).
Please check the Search fields section to see which fields allow scoping with an exact value, a range expression or a search expression.
Range: denoted either by

- `<=`, `>=`, `<` or `>` as prefix in front of a word, or
- `...` or `..` as infix between two words.

For instance:

- `sortDate:<=2014-12-31` will match all documents created or updated before or at the end of the year 2014, while
- `sortDate:2014...2015-12-31` will match all documents created or updated between the beginning of the year 2014 and the end of the year 2015.

Range expressions can only be used as expressions in a scope expression. Using a range expression outside of a scope expression will be rejected.
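A small helper can make the two range forms concrete. The function name is hypothetical; it simply assembles the scoped range syntax described above:

```python
def range_scope(field, start=None, end=None):
    """Build a scoped range expression such as sortDate:2014...2015-12-31.

    Uses the infix form when both bounds are given, otherwise the
    <= / >= prefix form.
    """
    if start is not None and end is not None:
        return f"{field}:{start}...{end}"
    if end is not None:
        return f"{field}:<={end}"
    return f"{field}:>={start}"
```

For example, `range_scope("sortDate", "2014", "2015-12-31")` yields the second example above.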
Dates can be given in an absolute manner or a relative one. While an absolute date is a fixed point in time a relative date is always calculated by using the current time.
The syntax for absolute dates is
absoluteDate : year ( '-' month ( '-' day ( 'T' ( time | any ) )? )? )? ;
year : Digit Digit Digit Digit ;
month : Digit Digit ;
day : Digit Digit ;
time : hours ( ':' minutes ( ':' seconds ( '.' milliseconds )? )? )? timezone? ;
hours : Digit Digit ;
minutes : Digit Digit ;
seconds : Digit Digit ;
milliseconds : Digit+ ;
timezone : 'Z' | ( plusOrMinus? Digit+ ':' Digit+ ) ;
any : '*' ;
plusOrMinus : ( '+' | '-' ) ;
Examples:

- `2014`: the first millisecond in 2014
- `2020-10`: the first millisecond in October 2020
- `2021-12-31T23:59:50Z`: 10 seconds before New Year's Eve 2022 in London

Relative dates are calculated in relation to "now" (the current time). The syntax for relative dates is
relativeDate : dateOffset ( 'T' timeOffset )? ;
dateOffset : plusOrMinus? ( yearOffset | monthOffset | weekOffset | dayOffset )+ ;
yearOffset : Digit+ 'y' ;
monthOffset : Digit+ 'm' ;
weekOffset : Digit+ 'w' ;
dayOffset : Digit+ 'd' ;
timeOffset : hours ( ':' minutes ( ':' seconds )? )? ;
hours : Digit Digit ;
minutes : Digit Digit ;
seconds : Digit Digit ;
plusOrMinus : ( '+' | '-' ) ;
Examples:

- `-1d`: exactly 24 hours ago
- `+0dT06:30`: 6 hours and 30 minutes from now

Search all documents that have been ingested in the past 2 hours:

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=ingestionDate:>-0dT02:00' \
--header 'Authorization: Bearer <token>'
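The relative-date grammar can be sketched as a small parser. This is an illustrative approximation: the grammar does not define calendar lengths for the `y` and `m` offsets, so the sketch assumes 365 and 30 days respectively:

```python
import re
from datetime import timedelta

def parse_relative_date(expr):
    """Turn a relative-date expression like '-1d' or '+0dT06:30' into a timedelta."""
    m = re.fullmatch(
        r"([+-]?)((?:\d+[ymwd])+)(?:T(\d{2})(?::(\d{2})(?::(\d{2}))?)?)?", expr
    )
    if not m:
        raise ValueError(f"not a relative date: {expr}")
    sign = -1 if m.group(1) == "-" else 1
    days = 0
    for amount, unit in re.findall(r"(\d+)([ymwd])", m.group(2)):
        # assumption: approximate years/months as fixed day counts
        days += int(amount) * {"y": 365, "m": 30, "w": 7, "d": 1}[unit]
    hours = int(m.group(3) or 0)
    minutes = int(m.group(4) or 0)
    seconds = int(m.group(5) or 0)
    return sign * timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
```

Adding the result to the current time gives the absolute instant the expression refers to.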
Fulltext search

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=onboarding' \
--header 'Authorization: Bearer <token>'

Search with multiple terms

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=onboarding prozess plattform' \
--header 'Authorization: Bearer <token>'

Filtering for application (content source)

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=onboarding application:transformation' \
--header 'Authorization: Bearer <token>'

Filtering for documentType

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=onboarding documentType:News' \
--header 'Authorization: Bearer <token>'

Filter by one tag

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=tag:Organisationsentwicklung' \
--header 'Authorization: Bearer <token>'

Filter by multiple tags - enumeration

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=tag:(Organisationsentwicklung OR Digitalisierung OR "New Work")' \
--header 'Authorization: Bearer <token>'

Filter by multiple tags - wildcard

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=tag:*rganisation*' \
--header 'Authorization: Bearer <token>'
While parsing queries the following expression precedence will be used (highest to lowest):
Level | Expression |
---|---|
6 | Group |
5 | Phrase, Range |
4 | Scope |
3 | Not (Negation) |
2 | OR (Disjunction) |
1 | AND (Conjunction) |
While most of the time expression precedence will follow intuition there are some possibly surprising scenarios, mostly around the combined use of AND and OR expressions.
E.g. `income OR tax statement OR fine` will be parsed equivalent to `(income OR tax) (statement OR fine)`. The surprise may be that OR expressions take precedence over AND expressions, contrary to e.g. Boolean logic. However, this choice was made because it more closely aligns with the common use of natural language, as reflected by e.g. `left OR right handed`.
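The precedence rule can be demonstrated with a toy regrouping function that makes the implicit grouping explicit for flat queries consisting of words and OR keywords. This is a sketch for illustration, not the actual query parser:

```python
def explicit_grouping(query):
    """Show how a flat query groups: OR binds tighter than the implicit AND."""
    tokens = query.split()
    groups, current = [], []
    i = 0
    while i < len(tokens):
        current.append(tokens[i])
        if i + 1 < len(tokens) and tokens[i + 1] == "OR":
            # OR chains the next word into the current group
            current.append("OR")
            i += 2
        else:
            # whitespace (implicit AND) closes the group
            groups.append(current)
            current = []
            i += 1

    def fmt(group):
        return f"({' '.join(group)})" if len(group) > 1 else group[0]

    return " ".join(fmt(g) for g in groups)
```

For example, `explicit_grouping("income OR tax statement OR fine")` reproduces the grouping described above.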
There are two sets of fields that can be used in query expressions: fields defined by ContentHub, which reference elements defined in the ContentHub document schema, and fields that reference elements defined by the content providers. The custom fields from content providers are prefixed by the application name of their content.
- `title`: the title of the documents. Note that title elements can occur multiple times with several types. This field will point to all elements which are read as title (such as ch:title or ch:title[name='compoundTitle']). This means searches/constraints will apply to the text of all such elements of a document. Included in the default search fields with a weight of 2.
- `baseline`: a combination of the baselineContent and the baselineSearchableText elements of the documents. Included in the default search fields with a weight of 1.
- `native`: referencing the nativeSearchableText element.
- `appDocId`: referencing the appDocId element.
- `application`: referencing the application element. Can be used as a search facet.
- `documentType`: referencing the documentType element. Can be used as a search facet.
- `tag`: referencing the tag elements and searching in all of them. Can be used as a search facet.
- `quickSearchPhrase`: referencing the quickSearchPhrase elements and searching in all of them.
- `packageId`: referencing the packageId element and searching in all of them. Can be used as a search facet.
- `publisher`: referencing the publisher elements and searching in all of them.
- `creator`: referencing the creator elements and searching in all of them.
- `ingestionDate`: date when this document was ingested into ContentHub.
- `sortDate`: referencing the chronologicalSortDate element.
- `revisionDate`: referencing the revisionDate element.
- `publicationDate`: referencing the publicationDate element.
- `visible`: referencing the visible element.
- `fingerprint`: referencing the fingerprint element.
- `idesk.documentType`: referencing the idesk-specific documentType. Can be used as a search facet.
- `idesk.documentCategory`: referencing the idesk-specific documentCategory. Can be used as a search facet.
- `idesk.documentSubcategory`: referencing the idesk-specific documentSubcategory. Can be used as a search facet.
- `idesk.documentClassification`: referencing the idesk-specific documentClassification. Can be used as a search facet.
- `idesk.subjectAreaId`: referencing the idesk-specific subjectAreaId.
- `idesk.isRoot`: referencing the idesk-specific isRoot element. Can be used as a search facet.
- `idesk.quickSearchField`
- `idesk.rootId`: referencing the idesk-specific rootId.
- `portals.category`: referencing the portals-specific category element. Can be used as a search facet.
- `portals.subcategory`: referencing the portals-specific subcategory element. Can be used as a search facet.
- `portals.visibleInSuite`
- `academy.subjectArea`: referencing the academy-specific subjectArea element. Can be used as a search facet.
- `haufeshop.mediaType`: referencing the haufeshop-specific mediaType element. Can be used as a search facet.
- `haufeshop.topProduct`
- `hot.category_title`
- `hot.category_id`
- `hot.parent_category_id`
- `hot.sold_out`
Local field references are used to bind names to locations within content hub documents and to specify the relative relevance weight associated with these locations, respectively. These names can then be used to specify scopes of full-text queries in Search Expressions.
Each occurrence of the query parameter field
defines a single local
field reference. (Note that a search request can include an arbitrary
number of field
query parameters!)
The syntax of a local field reference is as follows (in EBNF syntax)
local-field-reference = reference-id ':' field-spec { sep field-spec } ; <1>
reference-id = ref-start-char ref-char* ; <2> <3>
ref-start-char = letter | '_' ;
ref-char = digit | '-' | ref-start-char ;
field-spec = field-name { ',' qualification } ; <4>
field-name = ? database field name, in practice an NCName ? ; <5>
qualification = weight ;
weight = 'weight:' decimal ; <6> <7>
decimal = [ '+' | '-' ] ( '0' | ( non-zero-digit digit* )) [ '.' digit* ] ;
sep = ( ' ' | '+' ) { sep } ;
letter = ? any of the characters 'A' - 'Z' and 'a' - 'z' ? ;
digit = '0' | non-zero-digit ;
non-zero-digit = '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
If more than one field specification is given, then the scope defined by this local field reference is the union of all specified fields (with their respective weights taken into account for the computation of relevance scores in case of matches).
The reference id is used to refer to this local field reference in a field query scope. If, say, the reference id is `tags`, then the search expression `field.tags:term` will search for occurrences of `term` in the fields specified by the local field reference with id `tags`.
Reference identifiers must be unique within each search request;
i.e., there must not be two field
query parameters that define
references with the same id.
If no qualification is specified, then weight:1.0
is assumed.
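A sketch of validating a reference id against the grammar above and assembling a `field` parameter value. The field names `keywords` and `labels` are hypothetical placeholders, not documented ContentHub fields:

```python
import re

# reference-id per the EBNF: letter or '_' first, then letters, digits, '-' or '_'
REFERENCE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*$")

def is_valid_reference_id(ref_id):
    """Check a local field reference id against the grammar."""
    return REFERENCE_ID.match(ref_id) is not None

def field_parameter(ref_id, field_specs):
    """Build a field parameter value like 'tags:keywords,weight:2 labels,weight:0.5'.

    field_specs is a list of (field-name, weight-or-None) tuples.
    """
    if not is_valid_reference_id(ref_id):
        raise ValueError(f"invalid reference id: {ref_id}")
    specs = " ".join(
        name if weight is None else f"{name},weight:{weight}"
        for name, weight in field_specs
    )
    return f"{ref_id}:{specs}"
```

The resulting string is what you would pass as one `field` query parameter; the reference id can then be used in a `field.<id>:...` scope expression.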
So far, the list of applicable field names is static and available from the content hub team. Once additional fields can be defined more or less on the fly, there will be an API to fetch this list.
A weight equal to 0.0
means that matches in the respective field
do not contribute to the relevance score.
The current implementation cuts off values that fall outside the interval [-16, 64]. Values with an absolute value smaller than 1/16 = 0.0625 are rounded to 0.
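The cut-off and rounding rules can be written down directly. This is a sketch of the documented behavior, not the actual implementation:

```python
def effective_weight(weight):
    """Apply the documented cut-off to [-16, 64] and round tiny weights to 0."""
    clamped = max(-16.0, min(64.0, float(weight)))
    # values with absolute value below 1/16 = 0.0625 are rounded to 0
    return 0.0 if abs(clamped) < 1 / 16 else clamped
```

So a requested weight of 100 is treated as 64, and a weight of 0.05 contributes nothing to the relevance score.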
By default, documents within search results are ordered by decreasing relevance. Relevance is scored by an algorithm that takes things like search term position and frequency, document type and age, and various other attributes into account, and generally provides good results.
However, at some occasions, users might want to have tighter control
about the result sorting and therefore an additional parameter sorting
can be passed to a search request. Pass a field name here to sort the
result set ascending by the values of this field. In order to sort
descending prefix the field with a hyphen -
.
All the fields from the ContentHub fields
section can be used for sorting
e.g. title, documentType
- in these cases the documents will be sorted alphabetically (by the specified field).
In order to sort for the date that content providers set as the date
to sort by for their documents use the field sortDate. So in order to
get the most recent documents on top use -sortDate
.
Concrete example (with curl)
Search for onboarding
and sort the results in alphabetical order by title
.
curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=onboarding&sorting=title' \
--header 'Authorization: Bearer <token>'
Search for onboarding
and sort by date (most recent documents on top).
curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=onboarding&sorting=-sortDate' \
--header 'Authorization: Bearer <token>'
Sometimes it is desired to prefer certain kinds of documents over others (e.g. recent News over Textbooks), assuming both match the actual query. This process of rule based adjustment of document relevance is referred to as boosting.
To enable boosting, search requests may contain a list of boost parameters. Each boost parameter associates a number (the boost weight) combined with a search expression, stating: if a matching document also matches the boost's search expression it should gain the associated extra weight in relevance.
The syntax of a boost rule is as follows (in EBNF syntax)
boost = 'x' boost-weight ',' search-expression ;
boost-weight = ? double using '.' as decimal separator, in the range (0, 64] ? <1>
search-expression = ? a search expression ? <2>
{{host}}/content/v1/search?q=something&boost=x5,documentType:News
will assign an extra boost of factor 5 to any matching document that happens to
contain the value News in its documentType metadata field
{{host}}/content/v1/search?q=dog&boost=x0.5,"shepherd"
will down-boost all documents containing the phrase "shepherd"
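A minimal helper for assembling a boost parameter value per the grammar above. The helper name is illustrative; the range check mirrors the documented (0, 64] interval:

```python
def boost_parameter(weight, expression):
    """Build a boost parameter value like 'x5,documentType:News'."""
    if not 0 < weight <= 64:
        raise ValueError("boost weight must be in (0, 64]")
    # format whole numbers without a trailing '.0'
    w = int(weight) if float(weight).is_integer() else weight
    return f"x{w},{expression}"
```

Weights above 1 boost matching documents up; weights below 1 (such as the 0.5 in the example above) boost them down.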
Finding the right search expressions tends to be an iterative process. During this process, users typically don’t want to read matching documents front-to-end but instead just need to get a quick glimpse at which parts of a document matched a certain search term and an extract of content surrounding that match. This information, document extracts with an indication which search term produced a match, is typically referred to as snippets.
The Search API provides a snippets parameter that allows controlling the number, the complexity and the location within documents of these snippets. Its syntax is as follows (in EBNF syntax)
snippets = count {',' qualification} ;
count = 'count:' ( '0' <1>
| posint <2>
) ;
qualification = numberOfTokens
| location ;
numberOfTokens = 'numberOfTokens:' posint ; <3>
location = 'location:' ( locationDef | '(' locationDef ( ',' locationDef )* ')' ); <4>
locationDef = locationName ( '(count:' count ')' )? ; <5>
posint = ? positive integer ? ;
locationName = ( 'title' | 'baselineContent' ) ;
Defining a count of 0
disables snippet generation and thus can improve
response times.
The count given will be the maximum number of snippets returned in the search response. The actual number of snippets may be lower if not enough hits in the document were found. A snippet still may have one or many highlights.
The numberOfTokens defines the maximum number of tokens (typically words) contained in each snippet. A snippet may contain fewer tokens if the content does not have enough. The highlighted token is included in that count.
With `location`, the locations from which snippets are generated can be defined. As visible in the `locationName` definition, only `title` and `baselineContent` are supported. The order in which the locations are given defines the order in which the snippets appear in the search result.
A `locationDef` must specify a `locationName`. Additionally defining a count is possible: the count given in the `locationDef` defines the maximum number of snippets generated for this location.
By default, up to 3 snippets with maximum 10 tokens (words) for the locations
title
and baselineContent
are generated. Each snippet may contain up to 2
highlights each.
count:0
disables snippetting for this request (will improve response times)
count:5
Returns up to 5 snippets.
count:5,numberOfTokens:25
Returns up to 5 snippets with a maximum number of 25 words.
location:(title,baselineContent)
Returns matches only in title
and baselineContent
.
count:3,location:(title,baselineContent)
Returns up to 3 snippets, preferring matches from the document’s title
,
followed by matches from the document’s baselineContent
section.
count:5,location:(title(count:1),baselineContent)
Returns up to 5 snippets, matches from the document’s title
first,
but limited to only one snippet, the rest of snippets will be created
from the document’s baselineContent
section.
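The snippets parameter values used in the examples above can be assembled programmatically. The following is a sketch (hypothetical helper) following the EBNF:

```python
def snippets_parameter(count, number_of_tokens=None, locations=None):
    """Build a snippets parameter value, e.g.
    'count:5,numberOfTokens:25,location:(title,baselineContent)'.

    locations is a list of (locationName, per-location-count-or-None) tuples.
    """
    parts = [f"count:{count}"]
    if number_of_tokens is not None:
        parts.append(f"numberOfTokens:{number_of_tokens}")
    if locations:
        defs = ",".join(
            name if cnt is None else f"{name}(count:{cnt})"
            for name, cnt in locations
        )
        parts.append(f"location:({defs})")
    return ",".join(parts)
```

Passing `count=0` yields `count:0`, which disables snippet generation entirely.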
A common question surrounding the use of constraints when formulating a search request is how to obtain a list of valid or reasonable constraint values. One way of obtaining such values is facet listing.
For this, search requests may additionally contain instructions about
which constraint values should be listed and how this listing should
occur. More specifically, search requests support an arbitrary list of
facet
parameters (e.g. facet
query parameters when using HTTP GET).
The content of each facet
parameter has to follow EBNF syntax of
facet = constraint-name {',' qualification} ; <1>
qualification = sorting
| limit
| drill-in
| match-count ;
sorting = 'sort:' ( 'count' | '-count' | 'value' | '-value' ) ; <2>
limit = 'limit:' posint ;
drill-in = 'drill:' ( xname | '(' xname {','xname} ')' ) ; <3>
match-count = 'count:' ( '>=' posint <4>
| '<=' posint <5>
| posint '...' posint <6>
) ;
posint = ? positive integer ? ;
xname = ? expression name ? ; <7>
Indicates the constraint whose values should be listed. If the name does not match any of the defined constraints, the whole facet parameter will be ignored.
By default, facet values are listed with decreasing relevance
(equivalent to sort:-count
).
For further details on using facet drill-in please refer to Advanced Facet Browsing.
Allows constraining facet values to those occurring in e.g. at least 5 documents
Allows constraining facet values to those occurring in e.g. at most 10 documents
Allows constraining facet values to those occurring in e.g. more than 5 but less than 10 documents
Refers to a named expression within the list of search expressions.
documentType
Requests a listing of documentType
values accepting a possibly defined
default sorting and limit.
documentType,limit:20
Requests a listing of up to 20 documentType values (first 20 values according to default sorting).
documentType,sort:value
Requests a listing of documentType
values, to be retrieved in
increasing value order.
documentType,sort:-value
Requests a listing of documentType
values, to be retrieved in
decreasing value order.
documentType,limit:20,sort:-count
Requests a listing of the 20 most common documentType
values (up to 20
values in order of decreasing relevance).
documentType,count:>=3
Requests a listing of documentType values, skipping the long tail of values that occur in only two or fewer documents.
Facet concrete example (with curl)
In order to find possible values for filtering you can explore the data using the faceting feature. Here for example we query a facet for tags and sort descending by the count of documents. That gives us the most frequently used tags in the set of documents we can access. In the result the buckets of the tag facet will hold the values of tags.
curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?facet=tag,limit:50,sort:-count' \
--header 'Authorization: Bearer <token>'
facet=documentType
Requests a listing of documentType
values accepting a possibly defined
default sorting and limit.
facet=documentType,drill:documentType combined with q:documentType=documentType:ENTSCH

Returns hits only where the documentType is ENTSCH, and each bucket has two count attributes:

- count: the number of hits including the documentType:ENTSCH constraint
- countNonDrill: the number of hits ignoring the documentType:ENTSCH constraint

This can be useful when displaying facet results on a web page: when clicking on a particular facet (documentType), the result of the search with that documentType constraint is displayed on the right side of the page, but, by using the countNonDrill, the numbers will not change for the categories (the facets) on the left side of the page. The drill feature can also be used to get the number of hits per category and subcategory.
Facet with drill concrete example (with curl)
curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=steuer&q:docTypeString=documentType:BEITRAG&facet=packageId,limit:100,sort:-count,drill:docTypeString' \
--header 'Authorization: Bearer <token>'
The buckets from the response of a request like this will again have two count values:

- one counting hits per packageId (i.e. category), and
- one counting hits per packageId and documentType (i.e. subcategory).

Stemming maps a word to its common lemma (stem). For example, Kinder stems to the noun Kind and gearbeitet stems to the verb arbeiten.
An unstemmed search matches only the word form you’re searching for. For example, searching for Kind will not match a document containing Kinder. With stemming, the search matches the exact term, plus words with the same stem. Thus, a search for arbeiten will also match documents containing arbeitend or gearbeitet because they all share the stem arbeiten in German.
Use the aggregation
keyword to group search results and return the
most relevant hits per group.
aggregation = group-by {',' match-count ',' discriminator} ; <1>
match-count = 'sample:' posint ; <2>
discriminator = 'discriminator:' QName ; <3>
Aggregation is a group-by clause where the most relevant hit per discriminator is selected.
The grouped sample size is set to 1 at the moment (`sample:1` is the only implemented value).
The discriminator specifies the QName of the element by which the hits will be grouped. Valid values are:
{http://idesk.haufe-lexware.com/document-meta}rootId
{http://contenthub.haufe-lexware.com/haufe-document}documentType
Example:
group-by,sample:1,discriminator:{http://idesk.haufe-lexware.com/document-meta}rootId
Invisible documents are all documents with the metadata field visible
set to "false".
By default, ContentHub will exclude invisible documents, unless the used search expression
explicitly contains a scoped expression for the field visible
.
Examples:
curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=Steuer' \
--header 'Authorization: Bearer <token>'

Searches for documents containing "Steuer". Only visible documents will be returned.

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=linklist visible:false' \
--header 'Authorization: Bearer <token>'

Searches explicitly for invisible documents containing "linklist".

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/search?q=Arbeit visible:(true OR false)' \
--header 'Authorization: Bearer <token>'
Searches for visible and invisible documents containing "Arbeit".
The Retrieval API allows consumers to retrieve full documents (including
attachments) or specific parts of documents. The root context of the
following RESTful paths is /content/v1
.
Issue GET
on the /retrieval
URI with the following parameters
query parameter contentHubId
to identify the document to be
retrieved
optional query parameter withBlobs
to additionally retrieve any
images or blobs embedded in the document. Values are true
or
false
with default false.
Issue GET
on the following URIs (query parameters are exactly as
before unless specified)
/retrieval/meta
retrieves the document meta data section
/retrieval/baseline
retrieves the baseline content section. A
document teaser can be additionally requested using the query
parameter teaser
. This parameter is of type int and specifies
number of characters in the final teaser.
/retrieval/native
retrieves the native content section
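The retrieval URLs above can be assembled with a small helper. The helper name is hypothetical; only the parameters documented above are used:

```python
from urllib.parse import urlencode

def retrieval_url(api_url, content_hub_id, part=None, with_blobs=False):
    """Build a retrieval URL; part selects /retrieval/meta, /retrieval/baseline
    or /retrieval/native, while part=None retrieves the full document."""
    path = "/retrieval" + (f"/{part}" if part else "")
    params = {"contentHubId": content_hub_id}
    if with_blobs:
        params["withBlobs"] = "true"
    return f"{api_url}{path}?{urlencode(params)}"
```

Note that the `contenthub://` id is percent-encoded when placed in the query string.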
Blobs in this context refers to any kind of binary attachment to a main document. The parameter contentHubId must be specified with the following paths in order to identify the main document.
/retrieval/blobs
retrieves all binary attachments in one multipart
response
/retrieval/blobs/{blobId}
retrieves the named binary attachment
where blobId
is a path variable and represents the id of the
attachment.
/retrieval/blobs/{blobId}/meta
retrieves the meta data of the
named binary attachment
/content/v1/retrieval?contentHubId=contenthub://portals/content/215516&withBlobs=true
: retrieve a full document and all its binary attachments.
/content/v1/retrieval/baseline?contentHubId=contenthub://portals/content/215516&teaser=100
: retrieve the baseline Content of a document and a 100 character
teaser.
/content/v1/retrieval/blobs/2214.pdf?contentHubId=contenthub://portals/content/215516
: retrieve blob 2214.pdf from the given document
Concrete examples (with curl)
Full document retrieval

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/retrieval?contentHubId=contenthub://idesk/HI13538112' \
--header 'Authorization: Bearer <token>'

Baseline content retrieval

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/retrieval/baseline?contentHubId=contenthub://idesk/HI13538112' \
--header 'Authorization: Bearer <token>'

Baseline content with teaser retrieval

curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/retrieval/baseline?contentHubId=contenthub://idesk/HI1856776&teaser=500' \
--header 'Authorization: Bearer <token>'
This operation returns an Atom collection of all ContentHub documents of a compound document.
Issue GET
on the /retrieval/compound/{documentPart}
URI with the following
parameters
path parameter documentPart
is the part that should be returned.
This can be either:
query parameter contentHubId
to identify the document to be retrieved
optional query parameter withBlobs
to additionally retrieve any
images or blobs embedded in the document. Values are true
or
false
with default false.
optional query parameter constrainingQuery
defines an additional query that all returned document parts have to
match. This can be used to filter the returned compound document for
those parts that should be visible to the user.
Concrete example (with curl)
curl --location --request GET 'https://api.contenthub.haufe.io/content/v1/retrieval/compound/full?contentHubId=contenthub://idesk/LI7635254' \
--header 'Authorization: Bearer <token>'
This operation registers a new job that will be processed in the background. It returns a job descriptor containing the properties of the job.
Issue POST
on the /retrieval/bulk/job
URI with a JSON object that may contain
a search descriptor specifying the documents to be included in the export
a retention period that defines how long the export will be kept available
whether to include the searchable text of the documents in the export
This operation returns a list of all existing export jobs.
Issue GET
on the /retrieval/bulk/job
URI to get a JSON list of job descriptors.
This operation returns the descriptor of a bulk export job to examine whether it has been finished already and can be fetched.
Issue GET
on the /retrieval/bulk/job/{bulkExportJobId}
URI with the
bulkExportJobId
variable referring to the UUID of the export job.
If the status
property in the response object is set to FINISHED
the export can
be downloaded from the link given in the fetchUrl
property.
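The polling decision can be sketched as follows; `status` and `fetchUrl` are the documented properties of the job descriptor, while the helper name is illustrative:

```python
def export_download_url(job_descriptor):
    """Return the fetchUrl once a bulk export job is FINISHED, else None.

    A client would poll GET /retrieval/bulk/job/{bulkExportJobId} and call
    this on the returned JSON until it yields a URL.
    """
    if job_descriptor.get("status") == "FINISHED":
        return job_descriptor["fetchUrl"]
    return None
```

Any non-FINISHED status means the export is not yet ready and the client should keep polling.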
This operation schedules an export job for cancellation. The job will be canceled asynchronously, usually after the operation has returned.
Issue DELETE
on the /retrieval/bulk/job/{bulkExportJobId}
URI with the
bulkExportJobId
variable referring to the UUID of the export job.