September 2024

Support for .eml and .msg Files

  • We’ve added support for .eml and .msg files for both local and third-party file uploads.

Return Document Chunks without Embeddings

  • We added a new generate_chunks_only flag. For third-party connectors it is set under file_sync_config (as generate_chunks_only); for web scrapes, file uploads, and raw text it is set at the top level (as generateChunksOnly).

  • When this flag is set to true, documents are chunked without generating embeddings, and the /list_chunks_and_embeddings endpoint returns chunks only.

  • Setting generate_chunks_only to true overrides skip_embedding_generation: embeddings are not generated, regardless of the value passed for skip_embedding_generation.
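
As a rough illustration, the two placements of the flag could be built like this in Python. Only the flag names and where they sit come from the notes above; the helper names are hypothetical, and every other request field is left out.

```python
def connector_sync_options(generate_chunks_only: bool) -> dict:
    """Third-party connectors: the flag is nested under file_sync_config (snake_case)."""
    return {"file_sync_config": {"generate_chunks_only": generate_chunks_only}}


def upload_options(generate_chunks_only: bool) -> dict:
    """Web scrapes, file uploads, and raw text: the flag is top-level (camelCase)."""
    return {"generateChunksOnly": generate_chunks_only}


# Hypothetical usage: merge these into whatever request body you already send.
print(connector_sync_options(True))
print(upload_options(True))
```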

ServiceNow Connector

  • The ServiceNow connector allows customers to synchronize incidents and attachments from their accounts, and support for knowledge articles and catalogs will be added soon!

  • Carbon Connect support is coming tomorrow. The enabledIntegration will be SERVICENOW.

  • You can find more details here.

Carbon Connect Enhancements

  • If a synced file in the “Synced File” list view is in ERROR status, an error message will be displayed when hovering over the Error status label.

  • If a file is re-synced via the “Synced File” list view, a success or error message will be provided based on the outcome.

  • The ServiceNow connector has been added to CCv3. The slug for the enabledIntegration is SERVICENOW.

Gong Connector

  • We just launched our Gong connector for syncing Gong calls and retrieving call transcripts.

    • CCv3 support for the Gong connector will be added later this week; the enabledIntegration slug will be GONG.

  • By default, the Gong connector will sync all of your workspaces and calls. However, you can customize this behavior:

    • To turn off automatic syncing of all workspaces and calls, set the sync_files_on_connection parameter to false when configuring the connector.

    • To manually sync specific workspaces or calls, use the global endpoints (/integrations/items/list and /integrations/files/sync).

  • To include speaker names and emails (when available), set the include_speaker_labels flag under file_sync_config to true.

  • New calls are auto-synced from existing workspaces, but any workspaces created later must be synced manually.

  • Find more details here.
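
The Gong options described above could be assembled along these lines. This is a sketch under assumptions: the parameter names sync_files_on_connection, file_sync_config, and include_speaker_labels come from these notes, but the helper name and the exact nesting of sync_files_on_connection relative to file_sync_config are guesses, so check the connector documentation before relying on it.

```python
def gong_connection_options(sync_all_on_connection: bool = True,
                            include_speaker_labels: bool = False) -> dict:
    """Build a hypothetical options payload for connecting a Gong account.

    Assumption: sync_files_on_connection sits next to file_sync_config
    at the top level of the payload.
    """
    return {
        # False turns off the default "sync all workspaces and calls" behavior.
        "sync_files_on_connection": sync_all_on_connection,
        "file_sync_config": {
            # True adds speaker names and emails to transcripts when available.
            "include_speaker_labels": include_speaker_labels,
        },
    }


# Connect without auto-syncing, then pick items manually via
# /integrations/items/list and /integrations/files/sync.
options = gong_connection_options(sync_all_on_connection=False,
                                  include_speaker_labels=True)
print(options)
```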

External URL for Gmail and Outlook

  • The external_url field is now returned for both Gmail and Outlook email files under user_files_v2.

Return Raw Slack Messages

  • We now return the individual Slack messages under the additional_presigned_urls->messages_json field when you set the include_additional_files parameter to true for user_files_v2.

  • The pre-signed file will contain the raw Slack response for all the messages in that file. The JSON will have one entry per conversation, with the conversation timestamp as the key.
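
Once the pre-signed file has been downloaded, its entries can be walked in chronological order. The sketch below assumes the file is a single JSON object whose keys are Slack-style timestamp strings (for example "1726000000.000100"); the sample values are made up, and the real values are whatever Slack returned, so they are treated as opaque here.

```python
import json
from decimal import Decimal


def conversations_in_order(messages_json_text: str) -> list:
    """Return (timestamp, raw_payload) pairs sorted by conversation timestamp."""
    data = json.loads(messages_json_text)
    # Decimal keeps the full microsecond precision of the timestamp keys.
    return sorted(data.items(), key=lambda item: Decimal(item[0]))


# Made-up sample standing in for the downloaded messages_json content.
sample = json.dumps({
    "1726000100.000200": {"text": "second"},
    "1726000000.000100": {"text": "first"},
})

for timestamp, payload in conversations_in_order(sample):
    print(timestamp, payload)
```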

Improved Search for Carbon Connect (3.0.12)

  • The search functionality in CCv3 has been enhanced to enable searching through all items in the directory or selected folder, rather than just what is displayed in the front-end.

Improved Notion Parsing

  • We’ve improved our Notion parser to support parsing for the following blocks:

    • Toggle lists

    • In-line tables, text, code blocks, and lists

    • Numbered and bullet lists

    • Synced blocks

    • Multi-column blocks

    • Text with links

Syncing Intercom Conversations

  • In addition to articles and tickets, Carbon now syncs Intercom Conversations.

  • You can specify CONVERSATION under file_sync_config to enable syncing conversations:

"file_sync_config": { "auto_synced_source_types": ["CONVERSATION"], "sync_attachments": true }

  • The following conversation information is available as tags for filtering:

{ "conversation_status": "open", "conversation_priority": "not_priority", "conversation_submitter": "example.user@projectmap.com", "conversation_assigned_team": "Support", "conversation_assigned_admin": "swapnil+int2@carbon.ai" }
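
To show how these tags might be used, here is a small client-side filter in Python. The tag names come from the example above; the file records, their "name" and "tags" layout, and the "closed" status value are hypothetical, and Carbon's own server-side filter syntax is not shown.

```python
def matches_tags(file_tags: dict, wanted: dict) -> bool:
    """True when every wanted tag is present with exactly the wanted value."""
    return all(file_tags.get(key) == value for key, value in wanted.items())


# Hypothetical records, shaped only for this illustration.
files = [
    {"name": "conv-1", "tags": {"conversation_status": "open",
                                "conversation_assigned_team": "Support"}},
    {"name": "conv-2", "tags": {"conversation_status": "closed",
                                "conversation_assigned_team": "Support"}},
]

wanted = {"conversation_status": "open", "conversation_assigned_team": "Support"}
open_support = [f["name"] for f in files if matches_tags(f["tags"], wanted)]
print(open_support)
```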

Notion Database Properties

  • Notion database properties are now returned per page within the database.

    • All Notion database properties are supported except for relation.

    • Properties are parsed per page in a database. They are parsed in a key-value format (property_name: property_value) and are added to the beginning of the parsed page (parsed_text_url) as a newline-separated list.

    • The file returned by presigned_url also now contains the JSON representation of the Notion page. The page’s properties and child blocks can be found in the object.
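
Because the properties are prepended as "property_name: property_value" lines, they can be split back out of the parsed text. The sketch below assumes the property block ends at the first line that does not contain ": " (for example a blank line); that boundary is not specified in these notes, so a page whose first body line contains ": " would be misread. The sample text is invented.

```python
def split_properties(parsed_text: str) -> tuple:
    """Split leading 'name: value' lines from the rest of the parsed page."""
    lines = parsed_text.split("\n")
    properties = {}
    body_start = len(lines)  # used if every line turns out to be a property
    for index, line in enumerate(lines):
        name, separator, value = line.partition(": ")
        if not separator or not name:
            body_start = index  # first non-property line starts the body
            break
        properties[name] = value
    body = "\n".join(lines[body_start:]).lstrip("\n")
    return properties, body


# Invented sample standing in for the content behind parsed_text_url.
sample = "Status: Done\nOwner: Ada\n\nMeeting notes go here."
props, body = split_properties(sample)
print(props)
print(body)
```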

Sync Files Without Processing

  • We now allow new file records to be created in Carbon (and displayed via /user_files_v2) without processing and saving the actual file. The remote file content will not be downloaded, and no chunks or embeddings will be generated. Only some metadata such as name, external ID, and external URL (depending on the source being synced from) will be stored.

  • This feature can be enabled by setting the flag skip_file_processing to true under file_sync_config for a given data source, and the sync_status of files in this state will be READY_TO_SYNC.

  • It’s important to note that this flag overrides both the skip_embedding_generation and generate_chunks_only flags.
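
Taken together with the earlier note that generate_chunks_only overrides skip_embedding_generation, the three flags form a precedence order. The helper below is a hypothetical sketch of that order for reasoning about a config; it is not Carbon's implementation.

```python
from typing import Optional

# Highest precedence first, as described in these notes.
PRECEDENCE = (
    "skip_file_processing",
    "generate_chunks_only",
    "skip_embedding_generation",
)


def governing_flag(file_sync_config: dict) -> Optional[str]:
    """Return the highest-precedence flag set to true, or None if none is set."""
    for flag in PRECEDENCE:
        if file_sync_config.get(flag) is True:
            return flag
    return None


config = {"skip_file_processing": True, "generate_chunks_only": True}
print(governing_flag(config))
```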

apiURL prop for CCv3 (3.0.14)

  • For customers that self-host Carbon, we added the prop apiURL to CCv3 which defaults to https://api.carbon.ai but can be set to another URL value. This URL value then acts as the base path for all of the requests made through Carbon Connect.

Qdrant Destination Connector

  • You can now “bring your own” Qdrant index to use with Carbon.

  • Carbon can automatically synchronize embeddings generated from customer data sources with any Qdrant index.

  • To enable this, we’ll need your Qdrant API key, a URL, and a mapping of embedding generators (e.g., OPENAI) to collection names:

{ "api_key": "API_KEY", "url": "URL", "collection_names": { "EMBEDDING_GENERATOR_1": "COLLECTION_NAME_1", "EMBEDDING_GENERATOR_2": "COLLECTION_NAME_2" } }
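
A small Python helper can assemble and sanity-check this object before you share it. The field names match the JSON above; the helper name and its validation rules are assumptions made for illustration, and the values are placeholders.

```python
import json


def qdrant_destination(api_key: str, url: str, collection_names: dict) -> dict:
    """Build the Qdrant destination config, rejecting obviously empty values."""
    if not api_key or not url:
        raise ValueError("api_key and url are both required")
    if not collection_names:
        raise ValueError("map at least one embedding generator to a collection")
    return {
        "api_key": api_key,
        "url": url,
        # Keys are embedding generators (e.g. OPENAI), values are collection names.
        "collection_names": dict(collection_names),
    }


config = qdrant_destination("API_KEY", "URL", {"OPENAI": "COLLECTION_NAME_1"})
print(json.dumps(config, indent=2))
```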

Azure Blob Storage Connector

  • We launched our Azure Blob Storage connector that enables syncing files and folders from blobs.

  • The Carbon Connect enabledIntegrations value for Azure Blob Storage is AZURE_BLOB_STORAGE, and CCv3 support will launch tomorrow.

  • Find more details on our Azure Blob Storage connector here.

Business OneDrive Support for Microsoft File Picker

  • The file picker button will now appear on the successful connection page for Business OneDrive accounts.

  • In order to open the file picker, the tenant name of the business account is required. Carbon will try to find it through Microsoft’s API by default. If it can’t be found, the file picker button won’t appear, and the successful connection page will instruct the user to close the tab.

Carbon Self-Hosting on Google Cloud Platform

  • Starting today, customers have the option to host a Carbon instance within their own GCP instance, with full access to all features of our managed solution, including data connectors, hybrid search, and more.

  • As a reminder, we’re already live on AWS and launching on Azure next month!

  • Book a demo if you’re interested in learning more: https://cal.com/carbon-ai/30min

Unified API for CRMs

  • We are introducing a unified API to access standardized data directly from CRM systems, starting with Salesforce.

  • To start, you can now sync data from the following CRM objects:

    • Accounts

    • Leads

    • Contacts

    • Opportunities

  • You can find more details in our documentation here.

Google Sheets Update

  • The file returned in presigned_url for Google Sheets has been changed from txt to xlsx. The txt file is still available in parsed_text_url.


COPYRIGHT @ 2024 JCDT DBA CARBON