July 2024
AssemblyAI Integration for Audio Transcriptions
We are excited to announce that Carbon now supports multiple audio transcription services. In addition to our existing integration with Deepgram, we have added support for AssemblyAI, providing our users with more options and flexibility when transcribing audio files.
To accommodate the new transcription service, we have updated the following endpoints to accept a new parameter, transcription_service, that lets you specify which service to use. Valid values are deepgram and assemblyai. If no value is specified, Deepgram will be used as the default transcription service.
For local files, the endpoints are:
/uploadfile
/upload_file_from_url
For external files, transcription_service is set within the file_sync_config parameter, under:
/integrations/oauth_url
/integrations/connect
/integrations/files/sync
Similar to files transcribed by Deepgram, files transcribed by AssemblyAI also have an additional saved file containing the full JSON response from the AssemblyAI service. To access the transcription response, query the files using the user_files_v2 endpoint with the include_additional_files parameter set to true.
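As a rough illustration of the parameter placement described above, the sketch below builds the two payload shapes: a top-level transcription_service for local files, and one nested under file_sync_config for external files. The helper names are hypothetical, and whether the parameter travels in the query string or the request body is an assumption not confirmed here.

```python
# Values and default are taken from the changelog text above.
VALID_TRANSCRIPTION_SERVICES = {"deepgram", "assemblyai"}
DEFAULT_TRANSCRIPTION_SERVICE = "deepgram"


def resolve_transcription_service(value=None):
    """Return the service to use, falling back to the documented default."""
    if value is None:
        return DEFAULT_TRANSCRIPTION_SERVICE
    if value not in VALID_TRANSCRIPTION_SERVICES:
        raise ValueError(f"unknown transcription_service: {value!r}")
    return value


def local_file_params(transcription_service=None):
    """Hypothetical params for /uploadfile or /upload_file_from_url."""
    return {"transcription_service": resolve_transcription_service(transcription_service)}


def external_file_payload(transcription_service=None):
    """Hypothetical payload fragment for the /integrations/* endpoints."""
    return {
        "file_sync_config": {
            "transcription_service": resolve_transcription_service(transcription_service)
        }
    }


print(local_file_params())
print(external_file_payload("assemblyai"))
```

Validating the value client-side is a design choice of this sketch only; the changelog does not say how the API responds to an unrecognized value.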
Carbon Webhook Libraries
We have released our official webhook libraries for handling the verification of webhook signatures. You can find our updated documentation here, and access our libraries on GitHub here.
Zendesk Auto-Sync Update
We are thrilled to announce that the Zendesk connector now supports auto-sync.
Carbon can now sync any new articles with auto-sync enabled.
Help Center Categories are now synced into Carbon as files, and Help Center Categories and articles form a parent-child relationship.
Reconnecting Existing Zendesk Connections:
If you have existing Zendesk connections in Carbon, please note that you will need to reconnect them to enable the updates above.
Organization Connector Settings
The /organization endpoint now includes connector_settings in the response, providing additional information about the organization’s connector configurations, starting with permitted file formats.
The /organization/update endpoint has been updated to accept the data_source_config parameter, allowing customers to configure permitted file formats for organization users. The data_source_config parameter should be provided in the following format:
{
  "data_source_configs": {
    "GOOGLE_DRIVE": {
      "allowed_file_formats": ["PDF", "DOCX"]
    },
    "DROPBOX": {
      "allowed_file_formats": ["XLSX", "CSV"]
    },
    "DEFAULT": {
      "allowed_file_formats": ["PDF", "DOCX", "XLSX", "NOTION"]
    }
  }
}
DEFAULT is applied to all data sources that do not have configs defined.
If the data_source_config parameter includes file formats that are not supported by Carbon, those formats will be ignored, and only the supported formats from each data source will be synced.
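The two rules above (DEFAULT as a fallback, unsupported formats ignored) can be sketched as a small lookup. This is an illustration of the described behavior, not Carbon's implementation: SUPPORTED_FORMATS is a hypothetical stand-in for the real supported list, and the "FOO" entry is added only to show an unsupported format being dropped.

```python
# Hypothetical set of formats Carbon supports; the real list is not given here.
SUPPORTED_FORMATS = {"PDF", "DOCX", "XLSX", "CSV", "NOTION"}

EXAMPLE_CONFIG = {
    "data_source_configs": {
        "GOOGLE_DRIVE": {"allowed_file_formats": ["PDF", "DOCX"]},
        # "FOO" is an invented unsupported format, for illustration only.
        "DROPBOX": {"allowed_file_formats": ["XLSX", "CSV", "FOO"]},
        "DEFAULT": {"allowed_file_formats": ["PDF", "DOCX", "XLSX", "NOTION"]},
    }
}


def effective_formats(config, data_source, supported=SUPPORTED_FORMATS):
    """Return the formats that would apply to data_source under the rules above."""
    configs = config.get("data_source_configs", {})
    entry = configs.get(data_source)
    if entry is None:
        # No explicit config for this source: fall back to DEFAULT.
        entry = configs.get("DEFAULT")
    if entry is None:
        # Neither a source entry nor DEFAULT exists; the changelog does not
        # specify this case, so the sketch signals "nothing configured".
        return None
    # Unsupported formats are ignored rather than rejected.
    return [fmt for fmt in entry.get("allowed_file_formats", []) if fmt in supported]


print(effective_formats(EXAMPLE_CONFIG, "DROPBOX"))
print(effective_formats(EXAMPLE_CONFIG, "SOME_OTHER_SOURCE"))
```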
Carbon Self-Hosting on AWS
Starting today, customers have the option to host a Carbon instance on their own cloud, with full access to all features of our managed solution, including data connectors, hybrid search, and more.
We’re launching on Microsoft Azure and Google Cloud later next month!
Book a demo if you’re interested in learning more: https://cal.com/carbon-ai/30min
Confluence Enhancements
We’ve made improvements to the Confluence Connector related to the following:
Auto-Sync Improvements
The auto-sync process will now index new pages that are added to a previously synced parent page. If a user syncs their entire Confluence account, then the space will be the top-most file.
If pages are deleted from a synced parent page in Confluence, the scheduled sync will remove them from the synced content.
File Metadata Enhancements
The file_metadata property now includes additional information about the type of Confluence item each file represents (spaces and pages).
The file_metadata property will also record the external_id of the file’s parent and root, providing better context and hierarchy information.
To take advantage of these updates, users will need to reconnect their Confluence account and re-sync their Confluence files.
Reranker Models for Search
We are excited to introduce native support for reranker models. With this release, customers now have the option to rerank search result chunks to provide more relevant and accurate results.
How it works:
When making a search query via the embeddings endpoint, customers can control the reranking behavior by setting the rerank parameter in the payload.
If rerank is set to "JINA_MULTILINGUAL_BASE_V2", the search result chunks will be reranked using the Jina reranking algorithm.
If rerank is set to "COHERE_RERANK_MULTILINGUAL_V3", the search result chunks will be reranked using the Cohere reranking algorithm.
If the rerank parameter is not specified or set to any other value, the default ranking will be used.
The response format from the embeddings endpoint remains consistent regardless of whether rerank is enabled.
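The rerank rules above amount to a lookup with a default. The sketch below shows one way to model them when building a search payload; the "query" field name and the helper functions are assumptions for illustration, and only the rerank key and its two values come from the text.

```python
# The two rerank values named in the changelog, mapped to their provider.
KNOWN_RERANKERS = {
    "JINA_MULTILINGUAL_BASE_V2": "Jina",
    "COHERE_RERANK_MULTILINGUAL_V3": "Cohere",
}


def build_search_payload(query, rerank=None):
    """Build a hypothetical embeddings search payload; "query" is an assumed field name."""
    payload = {"query": query}
    if rerank is not None:
        payload["rerank"] = rerank
    return payload


def ranking_used(payload):
    """Return which ranking applies: a known reranker, or the default otherwise."""
    return KNOWN_RERANKERS.get(payload.get("rerank"), "default")


print(ranking_used(build_search_payload("pricing", rerank="JINA_MULTILINGUAL_BASE_V2")))
print(ranking_used(build_search_payload("pricing")))
```

Because an unrecognized value falls back to the default ranking rather than failing, a typo in the rerank value would go unnoticed in the response, so it may be worth checking the value against the known names before sending.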
We’ll be adding support for more reranker models in the weeks to come!
New Webhook: WEBSCRAPE_URLS_READY
We’ve added a new webhook named WEBSCRAPE_URLS_READY that triggers each time a specific web page from a web scrape request is finished processing.
Introducing Carbon Connect 3.0
We’re thrilled to announce the beta release of Carbon Connect 3.0, packed with exciting updates and improvements based on customer feedback.
Key Features and Improvements
1. Seamless File and Folder Uploads
Carbon Connect 3.0 now supports both file and folder uploads by default, eliminating the need for the filePickerMode property. Uploading entire folder directories is now a breeze with our new drag-and-drop functionality.
2. Carbon’s In-House File Picker
Carbon’s in-house file picker is now available for all connectors except Slack, Gmail, and Outlook (currently in development). To use Carbon’s file picker instead of the source’s file picker, simply set the new useCarbonFilePicker property to true.
3. Enhanced In-Modal Notifications
We’ve completely replaced toast notifications with in-modal notifications, providing a more cohesive and user-friendly experience. As a result, the enableToasts property has been removed.
4. Customizable Theme Options
Personalize your Carbon Connect experience with our new theme options. Use the theme property to set the application’s theme to light, dark, or auto (default). When set to auto, Carbon Connect will automatically adapt to your system’s theme.
5. Simplified File Limit Control
Limiting the number of files is now easier than ever. Simply set the maxFilesCount property to 1 to restrict uploads to a single file. The allowMultipleFiles property has been removed for a more straightforward approach.
Upcoming Enhancements
We’re continuously working to improve Carbon Connect and have exciting plans for the near future:
1. Enhanced Customization Options
We’re working on bringing back customization options from Carbon Connect 2.0, including loadingIconColor, primaryBackgroundColor, primaryTextColor, secondaryBackgroundColor, and secondaryTextColor.
2. Expanded In-House File Pickers
In the coming weeks, we’ll be launching Carbon’s in-house file pickers for Outlook, Slack, and Gmail, providing a consistent and seamless experience across all connectors.
Installation
You can install the new component for testing via the command npm install carbon-connect@beta. We plan to bring 3.0 out of beta by the end of the month!
Here’s a Loom video providing a quick walkthrough of the new modal: https://www.loom.com/share/b7b241fa5e5e4d0a92fb5e748d3d6ec3
External URLs Filter
A new external_urls filter has been added to the user_files_v2 endpoint. This filter allows you to refine the results returned by the endpoint based on a list of external_urls passed.
File Deletion Enhancements
When a customer deletes a file from Carbon (via delete_files_v2), they can control whether the file row in the database is preserved or marked as deleted.
This behavior is managed by the preserve_file_record flag. If preserve_file_record is set to true, then we delete the files stored in our S3/GCS while keeping the file record and metadata to allow for re-syncs and auto-syncs.
We also added a file_contents_deleted field to the user_files_v2 endpoint. If the field is returned as true, then the file record still exists, but the stored file content is deleted.
Find more details here.
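To make the relationship between the two fields concrete, here is a minimal sketch of a delete request that sets preserve_file_record, plus a helper that interprets file_contents_deleted on a returned file record. The "file_ids" key and both function names are hypothetical; only preserve_file_record and file_contents_deleted are named in the text above.

```python
def build_delete_request(file_ids, preserve_file_record):
    """Build a hypothetical delete_files_v2 payload; "file_ids" is an assumed key."""
    return {
        "file_ids": list(file_ids),
        "preserve_file_record": bool(preserve_file_record),
    }


def describe_file_state(file_record):
    """Interpret the file_contents_deleted field of a user_files_v2 result."""
    if file_record.get("file_contents_deleted") is True:
        # The row and its metadata remain, so re-syncs and auto-syncs can still run.
        return "record kept, stored contents deleted"
    return "record and stored contents present"


print(build_delete_request([101, 102], preserve_file_record=True))
print(describe_file_state({"id": 101, "file_contents_deleted": True}))
```

The flag is a required argument in this sketch because the changelog does not state its default value.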
High Accuracy Mode
We’ve introduced a new optional boolean parameter to the /embeddings endpoint called high_accuracy. If set to true, then vector search may give more accurate results at a slight performance penalty. By default, it’s false.
Find more details here.
To and From Filters for Outlook and Gmail
We added 2 more filters for syncing emails from Outlook and Gmail:
to: Supports an email address (email@address.com) as a string, matching the address the email was sent to.
from: Supports an email address (email@address.com) as a string, matching the address the email was sent from.
Note: Outlook only supports the from filter.
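The per-source support described above can be checked before a sync request is built. The sketch below uses a flat dictionary for the filters and the source names "GMAIL" and "OUTLOOK"; both are assumptions for illustration, and the actual filter structure expected by the sync endpoints may differ.

```python
# Which filters each source accepts, per the note above.
# The source identifiers are hypothetical labels.
FILTERS_BY_SOURCE = {
    "GMAIL": {"to", "from"},
    "OUTLOOK": {"from"},
}


def build_email_filters(source, to=None, sender=None):
    """Return a flat filter dict, rejecting filters the source does not support."""
    supported = FILTERS_BY_SOURCE.get(source)
    if supported is None:
        raise ValueError(f"unknown source: {source!r}")
    requested = {}
    if to is not None:
        requested["to"] = to
    if sender is not None:
        # "from" is a Python keyword, so the argument is named sender.
        requested["from"] = sender
    unsupported = sorted(set(requested) - supported)
    if unsupported:
        raise ValueError(f"{source} does not support filters: {unsupported}")
    return requested


print(build_email_filters("GMAIL", to="email@address.com", sender="email@address.com"))
print(build_email_filters("OUTLOOK", sender="email@address.com"))
```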
Intercom Auto-Sync Update
We are thrilled to announce 2 updates to our Intercom connector:
Carbon can now sync multiple Intercom Help Centers:
Help Centers are now synced into Carbon as files, and Help Center and articles form a parent-child relationship.
Just as only published articles are synced, only activated Help Centers will be synced.
Carbon can now sync any new published articles when auto-sync is enabled.
Reconnecting Existing Intercom Connections:
If you have existing Intercom connections in Carbon, please note that you will need to reconnect them to enable the updates above.
CARBON
Data Connectors for LLMs
COPYRIGHT @ 2024 JCDT DBA CARBON