Overview

The Website connection allows Qontext to crawl public web pages and ingest their text content into your Context Vault.

Supported content:

| Crawl type | Description |
| --- | --- |
| Single Page | Crawls only the specific URL provided. Best for individual articles or pages. |
| Deep Crawl | Crawls the URL and follows all internal links. Best for entire sites or documentation. |

Unsupported:

| Content | Description |
| --- | --- |
| Videos | Video files and embedded video content. |
| Restricted sites | Some websites (e.g. LinkedIn) block automated crawling and cannot be ingested. |

Setting up the connection

To add a website, open your Qontext vault, enter the URL, choose a crawl type, and set a sync schedule. The steps below walk you through the full flow.
1. Open Data Sources in your vault

In the Qontext app, open the vault where you want website data to be ingested, then go to the Data Sources tab and click + Add.
2. Select Website as the data source

Choose Website from the list of available data sources. This starts the website connection flow.
3. Enter the website URL

Give the website a name, which will be displayed in the Data Sources table. Then enter the URL of the website you want to crawl. HTTPS is assumed by default; for HTTP sites, enter the full URL (including http://). URLs used in other vaults are shown for reference. URLs already connected to this vault are displayed and cannot be connected again, but the crawl type of an already ingested website can be edited in the Data Sources tab.
4. Choose crawl type

Select how the website should be crawled. Single Page crawls only the specific URL provided and is best for individual articles or pages. Deep Crawl crawls the URL and follows all internal links, making it best for entire sites or documentation.
5. Set sync frequency

Choose how often Qontext should re-crawl the website (e.g. daily, weekly, monthly). More frequent syncs keep the vault up to date but use more resources.
6. Set ingestion instructions (optional)

You can add ingestion instructions that tell Qontext how to interpret or prioritize the crawled content. This step is optional.
7. Review and create connection

Review your choices (name, URL, crawl type, frequency, instructions), then click Complete to create the connection. Qontext will start the initial crawl shortly after.
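
The choices collected in the steps above can be pictured as one small settings record. The following is only an illustrative Python sketch, not a Qontext API: the `WebsiteConnection` class, the `normalize_url` helper, and the lowercase identifiers for crawl types and frequencies are all hypothetical names chosen for this example. It shows one way the "HTTPS is assumed by default" rule from step 3 could behave.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical identifiers for the options shown in the setup flow.
CRAWL_TYPES = {"single_page", "deep_crawl"}
SYNC_FREQUENCIES = {"daily", "weekly", "monthly"}


def normalize_url(raw: str) -> str:
    """Prefix https:// when no scheme is given; keep an explicit http:// or https://."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw  # HTTPS is assumed by default
    parts = urlsplit(raw)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a crawlable web URL: {raw!r}")
    return raw


@dataclass(frozen=True)
class WebsiteConnection:
    """Illustrative stand-in for the values reviewed in the final step."""
    name: str
    url: str
    crawl_type: str
    sync_frequency: str
    instructions: str = ""  # optional ingestion instructions

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if self.crawl_type not in CRAWL_TYPES:
            raise ValueError(f"unknown crawl type: {self.crawl_type!r}")
        if self.sync_frequency not in SYNC_FREQUENCIES:
            raise ValueError(f"unknown sync frequency: {self.sync_frequency!r}")
        # The dataclass is frozen, so store the normalized URL via object.__setattr__.
        object.__setattr__(self, "url", normalize_url(self.url))


docs = WebsiteConnection("Product docs", "docs.example.com", "deep_crawl", "weekly")
legacy = WebsiteConnection("Legacy site", "http://legacy.example.com", "single_page", "monthly")
```

Here `docs.url` ends up with an `https://` prefix, while `legacy.url` keeps the explicit `http://` scheme that was entered.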

Sync latency

Crawl time depends on the crawl type and the size of the website. Most websites are ingested within minutes.

FAQ

When should I use Single Page and when Deep Crawl?
Use Single Page when you want to ingest a specific article, blog post, or landing page. Use Deep Crawl when you want to ingest an entire site or documentation portal; Qontext then follows internal links starting from the URL you provide.
How many pages does a Deep Crawl cover?
A Deep Crawl crawls up to 100 pages, starting from the provided URL. If you need to limit the scope, consider using Single Page for specific URLs instead.
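
To make the Deep Crawl behaviour concrete, here is a rough Python sketch of a breadth-first crawl that follows only internal links and stops at 100 pages. It is not Qontext's implementation. It assumes that "internal" means "same host as the start URL" and that the start page counts toward the limit, and it takes a hypothetical `get_links` callback instead of making real network requests.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlsplit

MAX_PAGES = 100  # page limit stated for Deep Crawl


def deep_crawl(start_url, get_links, max_pages=MAX_PAGES):
    """Breadth-first crawl over internal links, visiting at most max_pages pages.

    get_links(url) is a hypothetical callback returning the hrefs found on a page.
    """
    host = urlsplit(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links and drop #fragments so a page is queued once.
            target, _fragment = urldefrag(urljoin(url, href))
            # Assumption: a link is "internal" when it stays on the start URL's host.
            if urlsplit(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Tiny in-memory "site" used instead of real HTTP requests.
site = {
    "https://ex.com/": ["/a", "/b", "https://other.com/x"],
    "https://ex.com/a": ["/b", "/c#top"],
}
pages = deep_crawl("https://ex.com/", lambda url: site.get(url, []))
```

In this toy site the external link to `other.com` is skipped. Calling the same function with `max_pages=1` visits only the start URL, which is roughly what Single Page does.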
Can I add multiple websites?
Yes. You can add as many website connections as you need from the Data Sources tab. Each website is a separate data source with its own crawl type and sync schedule.
What happens if a page is no longer accessible?
If a previously crawled page is no longer accessible, it will not be updated during the next sync. Content already ingested remains in your vault. For data removal, contact support@qontext.ai.
How often is the website re-crawled?
Qontext re-crawls the website based on the sync frequency you set (daily, weekly, or monthly). Each sync picks up the newest version of the pages and, for Deep Crawl, discovers new pages.
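
The sync behaviour described in the last two answers can be summarized in a few lines of Python. This is a simplified model, not Qontext's code: the `apply_sync` function and the plain dictionaries standing in for vault contents are assumptions made for illustration only.

```python
from typing import Dict, Optional


def apply_sync(vault: Dict[str, str], crawl_results: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return the vault contents after one sync, leaving the input untouched.

    crawl_results maps each URL to its freshly crawled text,
    or to None when the page was not accessible during this sync.
    """
    updated = dict(vault)
    for url, text in crawl_results.items():
        if text is None:
            continue  # inaccessible page: previously ingested content stays as it is
        updated[url] = text  # refreshed page, or a page newly discovered by Deep Crawl
    return updated


vault = {"https://ex.com/a": "old a", "https://ex.com/b": "old b"}
after = apply_sync(vault, {
    "https://ex.com/a": "new a",  # page changed since the last sync
    "https://ex.com/b": None,     # page no longer accessible
    "https://ex.com/c": "new c",  # page discovered during this sync
})
```

After the sync, page `a` holds the new text, page `b` keeps its old content, and page `c` is added.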