The Content Search endpoint allows you to quickly and easily search for content from any of your enabled content sources.
This allows you to:
- Search for content by query string
- Retrieve content published within a given date range
- Search for content by publisher or content type
- Search for content mentioning a specific security, keyword, or topic
Prerequisites
Before calling the content APIs, ensure that you:
- Have your API access_token. Find out how to get your token in Authentication
- Have enabled your content sources by following our configuration guide
Examples
Date Range for filtering content or scrolling through content
To query the content API for a specific date range, call the endpoint with fromdate and todate date strings in the format YYYY-mm-ddTHH:MM:SS, as demonstrated here:
{
"maxresults": 1,
"fromdate": "2023-08-12T01:01:01",
"todate": "2023-08-15T01:01:01"
}
{
"resultsfound": 2032,
"resultsreturned": 1,
"results": [
{
"fxcid": "4407e4f460c243b1a951016a1bf363c8",
...,
"datetimepublished": "2023-08-02T08:27:15+00:00",
"components": [
{
"role": "story",
...,
"content": "Piper Sandler analyst Brent Bracelin maintains Freshworks (NASDAQ:<a class=\"ticker\" href=\"https://www.benzinga.com/stock/FRSH#NASDAQ\">FRSH</a>) with a Overweight and raises the price target from $22 to $27."
}
],
...
}
],
"timetaken": 5,
"searchinfo": 0
}
The above shows just a snippet of the response. The full anatomy of the content response objects is given in Anatomy of a Content Object below.
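As a minimal sketch of building this request body in Python, the helper below formats two datetime values into the YYYY-mm-ddTHH:MM:SS form used above. The function name build_date_range_body is hypothetical and not part of the API; only the field names maxresults, fromdate, and todate come from the example request.

```python
from datetime import datetime

# Date format documented above: YYYY-mm-ddTHH:MM:SS
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def build_date_range_body(from_dt: datetime, to_dt: datetime, max_results: int) -> dict:
    """Build a request body limited to a date range (hypothetical helper)."""
    if from_dt > to_dt:
        raise ValueError("from_dt must not be later than to_dt")
    return {
        "maxresults": max_results,
        "fromdate": from_dt.strftime(DATE_FORMAT),
        "todate": to_dt.strftime(DATE_FORMAT),
    }


# Reproduces the example request body shown above.
body = build_date_range_body(
    datetime(2023, 8, 12, 1, 1, 1),
    datetime(2023, 8, 15, 1, 1, 1),
    max_results=1,
)
```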
Paging through results
- We can utilize the date range arguments to page through our results.
- To achieve this, take the last item from the results list returned and retrieve the datetimepublished value (shown above)
- Then feed this back into the todate argument of the subsequent API call.
- Repeat as necessary until you have paged through all the results you need
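The steps above can be sketched in Python as follows. This is a sketch under stated assumptions rather than a definitive client: fetch_page is a hypothetical caller-supplied function that posts a request body to the content search endpoint and returns the parsed JSON response; results are assumed to arrive newest first; the UTC offset on datetimepublished is trimmed so the value matches the YYYY-mm-ddTHH:MM:SS input format; and items are de-duplicated by fxcid in case the boundary item is returned again.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[dict], dict], body: dict, max_pages: int = 10) -> Iterator[dict]:
    """Yield content items page by page, moving `todate` back after each page."""
    request = dict(body)  # copy so the caller's body is not mutated
    seen = set()          # fxcid values already yielded
    for _ in range(max_pages):
        results = fetch_page(request).get("results", [])
        fresh = [item for item in results if item["fxcid"] not in seen]
        if not fresh:
            break  # nothing new came back, so paging is finished
        for item in fresh:
            seen.add(item["fxcid"])
            yield item
        # Assumption: the last item is the oldest on the page. Keep only the
        # first 19 characters ("YYYY-mm-ddTHH:MM:SS") to drop the "+00:00" offset.
        request["todate"] = results[-1]["datetimepublished"][:19]
```

Capping the loop with max_pages is a design choice that keeps a misbehaving fetch_page from looping forever.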
Filtering content by text match
The simplest and quickest use case for the endpoint is to search for content referencing a particular query string. Below, we search for content mentioning "Tesla" that was published within a defined date range:
{
"apikey": "<access_token>",
"contentquery": "Tesla",
"maxresults": 2,
"fromdate": "2023-08-12",
"todate": "2023-08-15"
}
{
"resultsfound": 171,
"resultsreturned": 2,
"results": [
{
"fxcid": "5f3f1961e9234129a299a9a03ef0db71",
"publisherid": "6f956d5af2d94bc095081c5916a13df6",
"contenttypeid": "01a65965e78d44ddbc508729f7c22b63",
"publishercontentid": "1657884957941944320",
"version": "v1",
"type": "Text",
"datetimepublished": "2023-05-14T23:05:50+00:00",
"unixdatetimepublished": 1684105550,
"datetimecreated": "2023-05-14T23:05:50+00:00",
"unixdatetimecreated": 1684105550,
"datetimeupdated": null,
"unixdatetimeupdated": null,
"entities": [],
"topics": [],
"components": [
{
"role": "Tweet",
"contentmetadata": {
"language": "en",
"slugline": null,
"headline": null,
"description": null,
"keywords": [
{
"value": "sspencer_smb",
"type": "User"
},
{
"value": "WholeMarsBlog",
"type": "User"
},
{
"value": "twitter.com/WholeMarsBlog/…",
"type": "url"
}
],
"authors": [
{
"name": "Steven Spencer",
"username": "sspencer_smb",
"role": "TwitterUser",
"profileImageurl": "https://pbs.twimg.com/profile_images/897435372044648449/2v0EJLku_normal.jpg"
}
],
"securities": []
},
"content": "\"The strategy of getting an ordinary Tesla to drive itself with just computer vision. It wasn't so crazy after all.\" $TSLA https://t.co/Z5OpLK4z7R"
}
],
"images": [],
"timestampprocessed": "2023-05-14T23:05:56.9817902+00:00",
"unixtimestampprocessed": 1684105556
}, ...
],
"timetaken": 323,
"searchinfo": 0
}
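A hedged Python sketch of sending this request with the standard library follows. SEARCH_URL is a hypothetical placeholder, since the content search URL is not given in this guide, and the use of POST with a JSON body is an assumption based on the curl example for the List Publishers endpoint. The helper names are illustrative only.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the real content search URL.
SEARCH_URL = "https://example.invalid/v1/content/search"


def build_search_request(url: str, body: dict) -> urllib.request.Request:
    """Wrap a JSON body in a POST request. Nothing is sent until urlopen is called."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_content(url: str, body: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_search_request(url, body), timeout=timeout) as resp:
        return json.load(resp)


def content_texts(response: dict) -> list:
    """Collect the `content` string of every component in every result."""
    return [
        component["content"]
        for item in response.get("results", [])
        for component in item.get("components", [])
        if component.get("content")
    ]


request = build_search_request(SEARCH_URL, {
    "apikey": "<access_token>",
    "contentquery": "Tesla",
    "maxresults": 2,
    "fromdate": "2023-08-12",
    "todate": "2023-08-15",
})
```

Calling content_texts(search_content(SEARCH_URL, body)) would then return the text of each matching item, once SEARCH_URL and the token are replaced with real values.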
Retrieve content tagged with a specific security
To return only content tagged with a specific security value, add that value to the securitynames or securityidentifiers fields, like so:
{
"apikey": "<access_token>",
"securitynames": ["AAPL"],
"securityidentifiers": ["AAPL"],
"maxresults": 2,
"fromdate": "2023-08-12",
"todate": "2023-08-15"
}
Note
Only content sources that contain the “securities” field in their response will be returned when filtering by security.
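To check on the client side which securities a returned item is tagged with, a sketch like the following could be used. It assumes the securities list sits in components[].contentmetadata.securities and that each entry carries an identifier field, as described in Content MetaData - Notable values. The helper names and the sample item are made up for illustration.

```python
def tagged_identifiers(item: dict) -> set:
    """Return the security identifiers tagged on one content item."""
    found = set()
    for component in item.get("components", []):
        metadata = component.get("contentmetadata") or {}
        for security in metadata.get("securities") or []:
            identifier = security.get("identifier")
            if identifier:
                found.add(identifier)
    return found


def mentions_security(item: dict, identifier: str) -> bool:
    """True if the item is tagged with the given identifier."""
    return identifier in tagged_identifiers(item)


# Made-up sample item, shaped like the content objects in this guide.
sample_item = {
    "components": [
        {
            "contentmetadata": {
                "securities": [
                    {"name": "AAPL", "identifier": "AAPL", "identifiertype": "Symbol", "exchange": "NASDAQ"}
                ]
            }
        }
    ]
}
```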
Filter by a specific publisher (data source)
To retrieve content from one or more specific publishers (data sources), use the publisherids field.
To do this, first retrieve the available publishers using the List Publishers endpoint, then pass their IDs into the publisherids field of the news feed endpoint.
Getting a list of publishers
The List Publishers endpoint returns a list of publishers and content types that are enabled for your API key. The publisherid field includes the ID that can be used to filter content by publisher.
curl --location 'https://personafinai.azurewebsites.net/v1/content/publishers/list/enabled' \
--header 'Content-Type: application/json' \
--header 'Cookie: ARRAffinity=628356fae902f3f844f9e9113bb6432b5013900ff654c4981f9460b163e412d2; ARRAffinitySameSite=628356fae902f3f844f9e9113bb6432b5013900ff654c4981f9460b163e412d2' \
--data '{ "apikey": "<access_token>" }'
[
{
"publisher": "Benzinga Inc",
"publisherid": "6299a0723d954934980299568664f74a",
"owner": "Benzinga Inc",
"contenttype": "Story",
"datasetname": "Benzinga News",
"contenttypeid": "daa7632ac81a4cb5a96f4bcee228e527"
},
{
"publisher": "Benzinga Inc",
"publisherid": "6299a0723d954934980299568664f74a",
"owner": "Benzinga Inc",
"contenttype": "Story",
"datasetname": "Why is it Moving",
"contenttypeid": "036df6e258e1427aa41f79945f8b7a16"
}
]
Once you have the ID of the publisher you wish to target, you can filter the results. For example, the following request body, passed to the content search endpoint, returns the latest 10 news items from the Twitter data source (publisher):
{
"apikey": "<access_token>",
"publisherids": [
"6f956d5af2d94bc095081c5916a13df6"
]
}
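The lookup step can be sketched in Python as below. The helper publisher_ids is hypothetical; it only assumes that each listing entry has the publisher and publisherid fields shown in the List Publishers response. The sample data is copied from that response.

```python
def publisher_ids(listing: list, publisher_name: str) -> list:
    """Return the distinct publisherid values for a publisher name, in listing order."""
    ids = []
    for entry in listing:
        if entry["publisher"] == publisher_name and entry["publisherid"] not in ids:
            ids.append(entry["publisherid"])
    return ids


# Entries as returned by the List Publishers endpoint example.
listing = [
    {"publisher": "Benzinga Inc", "publisherid": "6299a0723d954934980299568664f74a",
     "datasetname": "Benzinga News", "contenttypeid": "daa7632ac81a4cb5a96f4bcee228e527"},
    {"publisher": "Benzinga Inc", "publisherid": "6299a0723d954934980299568664f74a",
     "datasetname": "Why is it Moving", "contenttypeid": "036df6e258e1427aa41f79945f8b7a16"},
]

# Request body for the content search endpoint, filtered to one publisher.
body = {"apikey": "<access_token>", "publisherids": publisher_ids(listing, "Benzinga Inc")}
```

A publisher appears once per content type in the listing, which is why the helper de-duplicates the IDs.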
Filter by one or more content types
You may want to go one step further than filtering down to a particular publisher and instead filter down to one or more content sets (types) provided by that publisher. For example, you may want to populate a Benzinga "Why Is It Moving" feed that excludes content from other Benzinga content sets and from other publishers.
To do this, first retrieve the available content types using the List Publishers endpoint, then pass their IDs into the contenttypeids field of the news feed endpoint.
Getting a list of content types
The List Publishers endpoint returns a list of publishers and content types that are enabled for your API key. The contenttypeid field includes the ID that can be used to filter content by content type.
curl --location 'https://personafinai.azurewebsites.net/v1/content/publishers/list/enabled' \
--header 'Content-Type: application/json' \
--header 'Cookie: ARRAffinity=628356fae902f3f844f9e9113bb6432b5013900ff654c4981f9460b163e412d2; ARRAffinitySameSite=628356fae902f3f844f9e9113bb6432b5013900ff654c4981f9460b163e412d2' \
--data '{ "apikey": "<access_token>" }'
[
{
"publisher": "Benzinga Inc",
"publisherid": "6299a0723d954934980299568664f74a",
"owner": "Benzinga Inc",
"contenttype": "Story",
"datasetname": "Benzinga News",
"contenttypeid": "daa7632ac81a4cb5a96f4bcee228e527"
},
{
"publisher": "Benzinga Inc",
"publisherid": "6299a0723d954934980299568664f74a",
"owner": "Benzinga Inc",
"contenttype": "Story",
"datasetname": "Why is it Moving",
"contenttypeid": "036df6e258e1427aa41f79945f8b7a16"
}
]
Once you've identified the content type IDs you wish to filter down to, include them in the contenttypeids field to retrieve the content of interest. For example, the request below returns only the Benzinga "Why is it Moving" content set.
{
"apikey": "<access_token>",
"contenttypeids": [
"036df6e258e1427aa41f79945f8b7a16"
]
}
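Selecting content type IDs by dataset name can be sketched as follows. The helper content_type_ids is hypothetical and assumes each listing entry has the datasetname and contenttypeid fields shown in the List Publishers response. The match is case-insensitive because this guide spells the dataset both "Why Is It Moving" and "Why is it Moving".

```python
def content_type_ids(listing: list, dataset_names: list) -> list:
    """Return the contenttypeid of every listing entry whose datasetname matches."""
    wanted = {name.lower() for name in dataset_names}
    return [
        entry["contenttypeid"]
        for entry in listing
        if entry["datasetname"].lower() in wanted
    ]


# Entries as returned by the List Publishers endpoint example.
listing = [
    {"publisher": "Benzinga Inc", "datasetname": "Benzinga News",
     "contenttypeid": "daa7632ac81a4cb5a96f4bcee228e527"},
    {"publisher": "Benzinga Inc", "datasetname": "Why is it Moving",
     "contenttypeid": "036df6e258e1427aa41f79945f8b7a16"},
]

# Request body for the content search endpoint, filtered to one content set.
body = {
    "apikey": "<access_token>",
    "contenttypeids": content_type_ids(listing, ["Why Is It Moving"]),
}
```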
Search for a specific piece of content by publisher content id
Often you will want to retrieve specific pieces of content to display on a frontend. For example, if you retrieve a list of popular content items using the Popular Content API, you can pass that list of content IDs into the publishercontentid argument to retrieve the full body of those articles.
{
"apikey": "<access_token>",
"publishercontentid": ["33084401", "33089399", "33091769"],
"maxresults": 2
}
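A small Python sketch of this lookup is shown below. Both helpers are hypothetical. Setting maxresults to the number of requested IDs rests on the assumption that maxresults caps how many items come back, and re-ordering the results assumes each returned item carries the publishercontentid field listed in Anatomy of a Content Object.

```python
def build_lookup_body(access_token: str, publisher_content_ids: list) -> dict:
    """Build a request body that asks for specific pieces of content by ID."""
    ids = list(dict.fromkeys(publisher_content_ids))  # de-duplicate, keep order
    return {
        "apikey": access_token,
        "publishercontentid": ids,
        "maxresults": len(ids),  # assumption: one result per requested ID
    }


def order_by_requested_ids(response: dict, publisher_content_ids: list) -> list:
    """Return the results in the order the IDs were requested, skipping missing ones."""
    by_id = {item["publishercontentid"]: item for item in response.get("results", [])}
    return [by_id[cid] for cid in publisher_content_ids if cid in by_id]


body = build_lookup_body("<access_token>", ["33084401", "33089399", "33091769"])
```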
Anatomy of a Content Object
The content objects returned by the endpoint can have varying components depending on the source (publisher) of the content and the structure of the content on ingestion. However, they all share certain common elements.
Response Object
The API itself will respond with the following major elements
Field | Data type | Description |
---|---|---|
resultsfound | int | Number of content items found given filter arguments |
resultsreturned | int | Number of content items returned in this response |
results | list | List of content item objects returned by the api. See Anatomy of a Content Object for details on content item objects |
timetaken | int | Time for the API to generate response, in milliseconds |
Common elements - Content Object
All content items returned by the content API contain the following common elements
Field | Data Type | Description |
---|---|---|
fxcid | guid | Internal GUID assigned to the piece of content |
publisherid | guid | Internal GUID assigned to the publisher (data source) of the content |
publishercontentid | string | ID assigned to the content item by the publisher |
version | string | |
type | string | Type of content. Currently supports: “Text” |
datetimepublished | datetime | Datetime that the content was published. In the format YYYY-mm-ddTHH:MM:SS |
unixdatetimepublished | int | Datetime that the content was published. Unix epoch value. |
datetimecreated | datetime | Datetime that the content was added to the system. In the format YYYY-mm-ddTHH:MM:SS |
unixdatetimecreated | int | Datetime that the content was added to the system. Unix epoch value. |
datetimeupdated | datetime | Datetime that the content was updated in the system. In the format YYYY-mm-ddTHH:MM:SS |
unixdatetimeupdated | int | Datetime that the content was updated in the system. Unix epoch value. |
entities | list | Financial entities that the system has associated with the content. Not currently supported |
topics | list | Financial topics that the system has associated with the content. Not currently supported |
components | list | Contains details of the content returned itself |
components.role | string | Object category, that can be used to distinguish between content types. Examples include: “Tweet”, “story” |
components.contentmetadata | object | Metadata included in the content object on ingestion. These values can differ depending on content source. For notable elements, see Content MetaData - Notable values below |
components.content | string | The text body or HTML form of the returned content. Key field for displaying content |
images | list | Images included in the content |
timestampprocessed | datetime | |
unixtimestampprocessed | int | |
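
As an illustration of working with the paired datetime and Unix fields in the table above, the Python sketch below parses datetimepublished and checks it against unixdatetimepublished. The helper names are hypothetical, and treating a value without an offset as UTC is an assumption. The sample values are taken from the "Tesla" response earlier in this guide.

```python
from datetime import datetime, timezone


def published_at(item: dict) -> datetime:
    """Parse datetimepublished into a timezone-aware datetime."""
    parsed = datetime.fromisoformat(item["datetimepublished"])
    if parsed.tzinfo is None:
        # Assumption: values without an explicit offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def timestamps_agree(item: dict) -> bool:
    """True if the ISO datetime and the Unix epoch value describe the same instant."""
    return int(published_at(item).timestamp()) == item["unixdatetimepublished"]


item = {
    "datetimepublished": "2023-05-14T23:05:50+00:00",
    "unixdatetimepublished": 1684105550,
}
```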
Content MetaData - Notable values
The content object holds the original payload from the publisher (data source) in its components.contentmetadata element.
These elements differ between data sources.
Some notable values contained in the metadata include:
Field | Data Type | Description |
---|---|---|
language | string | Language code for the content |
headline | string | The headline of the article |
keywords | list | List of keyword objects, each with a value and a type |
authors | list | List of author objects, each including name (str), username (str), role (str), and profileImageurl (url). Note: for tweets, profileImageurl can be used to extract the author profile image |
securities | list | List of security symbols tagged in the content by the source. Each entry includes name (str): the name of the security or the identifier; identifier (str): the symbol tagged in the content; identifiertype (str): the type of identifier (e.g. Symbol); exchange (str): the symbol exchange (if known) |
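
A hedged sketch of pulling display fields out of one component's metadata follows. The helper summarize_component and the keys of the dictionary it returns are illustrative only. The sketch assumes the metadata fields described in the table above and falls back to the content text when headline is null, as it is for tweets in the earlier example.

```python
def summarize_component(component: dict) -> dict:
    """Extract a few display fields from one entry of a content item's components list."""
    metadata = component.get("contentmetadata") or {}
    authors = metadata.get("authors") or []
    first_author = authors[0] if authors else {}
    return {
        # Tweets have no headline, so fall back to the content text.
        "title": metadata.get("headline") or component.get("content", ""),
        "language": metadata.get("language"),
        "author": first_author.get("name"),
        "avatar": first_author.get("profileImageurl"),
        # Keyword objects are value/type pairs; keep only the "User" mentions.
        "mentions": [
            keyword["value"]
            for keyword in metadata.get("keywords") or []
            if keyword.get("type") == "User"
        ],
    }


# Trimmed copy of the tweet component from the "Tesla" response.
tweet = {
    "role": "Tweet",
    "contentmetadata": {
        "language": "en",
        "headline": None,
        "keywords": [
            {"value": "sspencer_smb", "type": "User"},
            {"value": "WholeMarsBlog", "type": "User"},
            {"value": "twitter.com/WholeMarsBlog/…", "type": "url"},
        ],
        "authors": [{"name": "Steven Spencer", "username": "sspencer_smb", "role": "TwitterUser"}],
        "securities": [],
    },
    "content": "It wasn't so crazy after all. $TSLA",
}

summary = summarize_component(tweet)
```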