July 2024

AssemblyAI Integration for Audio Transcriptions

  • We are excited to announce that Carbon now supports multiple audio transcription services. In addition to our existing integration with Deepgram, we have added support for AssemblyAI, providing our users with more options and flexibility when transcribing audio files.

  • To accommodate the new transcription service, we have updated the following endpoints to accept a new parameter, transcription_service, that allows you to specify which service to use. Valid values are deepgram and assemblyai. If no value is specified, Deepgram will be used as the default transcription service.

  • For local files, the endpoints are:

    • /uploadfile

    • /upload_file_from_url

  • For external files, transcription_service is set within the file_sync_config parameter, under:

    • /integrations/oauth_url

    • /integrations/connect

    • /integrations/files/sync

  • Similar to files transcribed by Deepgram, files transcribed by AssemblyAI also have an additional saved file containing the full JSON response from the AssemblyAI service. To access the transcription response, query the files using the user_files_v2 endpoint with the include_additional_files parameter set to true.
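As a rough sketch of where the parameter goes in each case, the helpers below build request bodies without sending them. Only transcription_service, file_sync_config, and the two valid values come from the notes above; the helper names, the url field, and the shape of the base payload are assumptions made for illustration.

```python
# Hypothetical helpers: only `transcription_service`, `file_sync_config`,
# and the two valid values are taken from the release notes.
VALID_SERVICES = {"deepgram", "assemblyai"}


def _checked(service):
    """Return the service unchanged, rejecting values the notes do not list."""
    if service is not None and service not in VALID_SERVICES:
        raise ValueError(f"unsupported transcription service: {service!r}")
    return service


def local_file_payload(url, service=None):
    """Body for a local-file endpoint such as /upload_file_from_url.

    The `url` field name is an assumption. Omitting `service` leaves the
    choice to Carbon, which defaults to Deepgram.
    """
    payload = {"url": url}
    if _checked(service):
        payload["transcription_service"] = service  # top-level parameter
    return payload


def external_file_payload(base, service=None):
    """Body for the /integrations/* endpoints.

    For external files the value is nested inside `file_sync_config`.
    """
    payload = dict(base)
    if _checked(service):
        sync_config = dict(payload.get("file_sync_config", {}))
        sync_config["transcription_service"] = service
        payload["file_sync_config"] = sync_config
    return payload
```

Leaving the parameter out entirely, rather than sending an empty value, keeps the default behaviour described above.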

Carbon Webhook Libraries

  • We have released our official webhook libraries for handling the verification of webhook signatures. You can find our updated documentation here, and access our libraries on GitHub here.

Zendesk Auto-Sync Update

We are thrilled to announce that the Zendesk connector now supports auto-sync.

  • Carbon can now sync any new articles when auto-sync is enabled.

    • Help Center Categories are now synced into Carbon as files, and Help Center Categories and articles form a parent-child relationship.

  • Reconnecting Existing Zendesk Connections:

    • If you have existing Zendesk connections in Carbon, please note that you will need to reconnect them to enable the updates above.

Organization Connector Settings

  • The /organization endpoint now includes connector_settings in the response, providing additional information about the organization’s connector configurations, starting with permitted file formats.

  • The /organization/update endpoint has been updated to accept the data_source_config parameter, allowing customers to configure permitted file formats for organization users. The data_source_config parameter should be provided in the following format:

    {
      "data_source_configs": {
        "GOOGLE_DRIVE": {
          "allowed_file_formats": ["PDF", "DOCX"]
        },
        "DROPBOX": {
          "allowed_file_formats": ["XLSX", "CSV"]
        },
        "DEFAULT": {
          "allowed_file_formats": ["PDF", "DOCX", "XLSX", "NOTION"]
        }
      }
    }

  • DEFAULT is applied to all data sources that do not have configs defined.

  • If the data_source_config parameter includes file formats that are not supported by Carbon, those formats will be ignored, and only the supported formats from each data source will be synced.
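The DEFAULT fallback and the handling of unsupported formats can be modelled locally as follows. This is a sketch of the behaviour described above, not Carbon's implementation; the SUPPORTED_FORMATS set is a placeholder rather than Carbon's actual list, and the function name is hypothetical.

```python
# Placeholder set for illustration; not Carbon's real list of supported formats.
SUPPORTED_FORMATS = {"PDF", "DOCX", "XLSX", "CSV", "NOTION"}


def allowed_formats(configs, data_source, supported=SUPPORTED_FORMATS):
    """Resolve the permitted formats for one data source.

    Falls back to the DEFAULT entry when the source has no config of its
    own, and drops any requested format that is not in `supported`.
    """
    entry = configs.get(data_source, configs.get("DEFAULT", {}))
    requested = entry.get("allowed_file_formats", [])
    return [fmt for fmt in requested if fmt in supported]


configs = {
    "GOOGLE_DRIVE": {"allowed_file_formats": ["PDF", "DOCX"]},
    "DROPBOX": {"allowed_file_formats": ["XLSX", "CSV"]},
    "DEFAULT": {"allowed_file_formats": ["PDF", "DOCX", "XLSX", "NOTION"]},
}
```

With this config, a source with no entry of its own resolves to the DEFAULT list, while GOOGLE_DRIVE and DROPBOX keep their own lists.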

Carbon Self-Hosting on AWS

  • Starting today, customers have the option to host a Carbon instance on their own cloud, with full access to all features of our managed solution, including data connectors, hybrid search, and more.

  • We’re launching on Microsoft Azure and Google Cloud next month!

  • Book a demo if you’re interested in learning more: https://cal.com/carbon-ai/30min

Confluence Enhancements

We’ve made improvements to the Confluence Connector related to the following:

  • Auto-Sync Improvements

    • The auto-sync process will now index new pages that are added to a previously synced parent page. If a user syncs their entire Confluence account, then the space will be the top-most file.

    • If pages are deleted from a synced parent page in Confluence, the scheduled sync will remove them from the synced content.

  • File Metadata Enhancements

    • The file_metadata property now includes additional information about the type of Confluence item each file represents (spaces and pages).

    • The file_metadata property will also record the external_id of the file’s parent and root, providing better context and hierarchy information.

  • To take advantage of these updates, users will need to reconnect their Confluence account and re-sync their Confluence files.
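One way the parent information could be used is to rebuild the space and page hierarchy on the client side. The sketch below assumes each file carries its parent's external_id under a hypothetical parent_external_id key inside file_metadata, and an item type under a hypothetical type key; the actual key names are not given in these notes.

```python
from collections import defaultdict


def group_by_parent(files):
    """Map each parent external_id to the external_ids of its direct children.

    Assumes each file dict has `external_id` and a `file_metadata` dict with
    a hypothetical `parent_external_id` key (None for the top-most file).
    """
    children = defaultdict(list)
    for f in files:
        parent = f.get("file_metadata", {}).get("parent_external_id")
        children[parent].append(f["external_id"])
    return dict(children)


# Made-up records for illustration only.
files = [
    {"external_id": "space-1",
     "file_metadata": {"type": "space", "parent_external_id": None}},
    {"external_id": "page-1",
     "file_metadata": {"type": "page", "parent_external_id": "space-1"}},
    {"external_id": "page-2",
     "file_metadata": {"type": "page", "parent_external_id": "page-1"}},
]
```

Files grouped under None are the top-most entries, which for a full-account sync would be the spaces.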

Reranker Models for Search

We are excited to introduce native support for reranker models. With this release, customers now have the option to rerank search result chunks to provide more relevant and accurate results.
How it works:

  • When making a search query via the embeddings endpoint, customers can control the reranking behavior by setting the rerank parameter in the payload.

    • If rerank is set to "JINA_MULTILINGUAL_BASE_V2", the search result chunks will be reranked using the Jina reranking algorithm.

    • If rerank is set to "COHERE_RERANK_MULTILINGUAL_V3", the search result chunks will be reranked using the Cohere reranking algorithm.

    • If the rerank parameter is not specified or set to any other value, the default ranking will be used.

  • The response format from the embeddings endpoint remains consistent regardless of whether rerank is enabled or not.
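The rules above can be sketched as a small payload builder. The rerank parameter and its two values come from this release; the query and k field names and the function name are assumptions for illustration, so check the API reference for the exact request shape.

```python
# The two reranker values named in the release notes.
RERANKERS = {"JINA_MULTILINGUAL_BASE_V2", "COHERE_RERANK_MULTILINGUAL_V3"}


def search_payload(query, k=5, rerank=None):
    """Build a request body for the embeddings endpoint.

    `query` and `k` are assumed field names. `rerank` is only included when
    it names a known reranker; leaving it out keeps the default ranking.
    """
    payload = {"query": query, "k": k}
    if rerank in RERANKERS:
        payload["rerank"] = rerank
    return payload
```

Because the response format does not change, code that reads the results should work the same with or without the rerank field.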

We’ll be adding support for more reranker models in the weeks to come!

New Webhook: WEBSCRAPE_URLS_READY

We’ve added a new webhook named WEBSCRAPE_URLS_READY that triggers each time a specific web page from a web scrape request is finished processing.

Introducing Carbon Connect 3.0

We’re thrilled to announce the beta release of Carbon Connect 3.0, packed with exciting updates and improvements based on customer feedback.

Key Features and Improvements

1. Seamless File and Folder Uploads
Carbon Connect 3.0 now supports both file and folder uploads by default, eliminating the need for the filePickerMode property. Uploading entire folder directories is now a breeze with our new drag-and-drop functionality.

2. Carbon’s In-House File Picker
Carbon’s in-house file picker is now available for all connectors except Slack, Gmail, and Outlook (currently in development). To use Carbon’s file picker instead of the source’s file picker, simply set the new useCarbonFilePicker property to true.

3. Enhanced In-Modal Notifications
We’ve completely replaced toast notifications with in-modal notifications, providing a more cohesive and user-friendly experience. As a result, the enableToasts property has been removed.

4. Customizable Theme Options
Personalize your Carbon Connect experience with our new theme options. Use the theme property to set the application’s theme to light, dark, or auto (default). When set to auto, Carbon Connect will automatically adapt to your system’s theme.

5. Simplified File Limit Control
Limiting the number of files is now easier than ever. Simply set the maxFilesCount property to 1 to restrict uploads to a single file. The allowMultipleFiles property has been removed for a more straightforward approach.

Upcoming Enhancements
We’re continuously working to improve Carbon Connect and have exciting plans for the near future:

1. Enhanced Customization Options
We’re working on bringing back customization options from Carbon Connect 2.0, including loadingIconColor, primaryBackgroundColor, primaryTextColor, secondaryBackgroundColor, and secondaryTextColor.

2. Expanded In-House File Pickers
In the coming weeks, we’ll be launching Carbon’s in-house file pickers for Outlook, Slack, and Gmail, providing a consistent and seamless experience across all connectors.

Installation
You can install the new component for testing via the command npm install carbon-connect@beta. We plan to bring 3.0 out of beta by the end of the month!

Here’s a Loom video providing a quick walkthrough of the new modal: https://www.loom.com/share/b7b241fa5e5e4d0a92fb5e748d3d6ec3

External URLs Filter

A new external_urls filter has been added to the user_files_v2 endpoint. This filter allows you to refine the results returned by the endpoint based on a list of external_urls passed.
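A minimal sketch of a request body using the new filter is shown below. It assumes filters are passed under a top-level filters key, which is not stated in these notes; the function name is hypothetical.

```python
def files_query(external_urls):
    """Build a user_files_v2 body that filters by a list of external URLs.

    Nesting under `filters` is an assumption. Duplicate URLs are removed
    while keeping the original order.
    """
    urls = list(dict.fromkeys(external_urls))
    return {"filters": {"external_urls": urls}}
```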

File Deletion Enhancements 

  • When a customer deletes a file from Carbon (via delete_files_v2), they can now control whether the file row in the database is preserved or marked as deleted.

    • This behavior is managed by the preserve_file_record flag. If preserve_file_record is set to true, then we delete the files stored in our S3/GCS while keeping the file record and metadata to allow for re-syncs and auto-syncs.

    • We also added a file_contents_deleted field to the user_files_v2 endpoint. If the field is returned as true, then the file record still exists, but the stored file content is deleted.

  • Find more details here.
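The two pieces above fit together roughly as follows: a delete request that keeps the record, and a check on records returned later by user_files_v2. The preserve_file_record and file_contents_deleted names come from these notes; the ids filter shape and the function names are assumptions for illustration.

```python
def delete_payload(file_ids, preserve_file_record=False):
    """Build a delete_files_v2 body; the `filters.ids` shape is an assumption.

    With preserve_file_record=True the stored content is removed while the
    record and metadata are kept for re-syncs and auto-syncs.
    """
    return {
        "filters": {"ids": list(file_ids)},
        "preserve_file_record": preserve_file_record,
    }


def content_removed(file_record):
    """True when the record still exists but its stored content was deleted."""
    return bool(file_record.get("file_contents_deleted", False))
```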

High Accuracy Mode 

  • We’ve introduced a new optional boolean parameter to the /embeddings endpoint called high_accuracy. If set to true, vector search may give more accurate results at a slight performance penalty. It defaults to false.

  • Find more details here.

To And From Filters for Outlook and Gmail

  • We added 2 more filters for syncing emails from Outlook and Gmail: to and from.

  • Note: Outlook only supports from filters.

Intercom Auto-Sync Update

  • We are thrilled to announce 2 updates to our Intercom connector:

    • Carbon can now sync multiple Intercom Help Centers:

      • Help Centers are now synced into Carbon as files, and Help Center and articles form a parent-child relationship.

      • Just as only published articles are synced, only activated Help Centers will be synced.

    • Carbon can now sync any new published articles when auto-sync is enabled.

  • Reconnecting Existing Intercom Connections:

    • If you have existing Intercom connections in Carbon, please note that you will need to reconnect them to enable the updates above.


COPYRIGHT @ 2024 JCDT DBA CARBON