diff --git a/entity-framework/core/providers/sql-server/full-text-search.md b/entity-framework/core/providers/sql-server/full-text-search.md new file mode 100644 index 0000000000..98ffe7b2bc --- /dev/null +++ b/entity-framework/core/providers/sql-server/full-text-search.md @@ -0,0 +1,183 @@ +--- +title: Microsoft SQL Server Database Provider - Full-Text Search - EF Core +description: Using full-text search with the Entity Framework Core Microsoft SQL Server database provider +author: roji +ms.date: 02/05/2026 +uid: core/providers/sql-server/full-text-search +--- +# Full-Text Search in the SQL Server EF Core Provider + +SQL Server provides [full-text search](/sql/relational-databases/search/full-text-search) capabilities that enable sophisticated text search beyond simple `LIKE` patterns. Full-text search supports linguistic matching, inflectional forms, proximity search, and weighted ranking. + +EF Core's SQL Server provider supports both full-text search *predicates* (for filtering) and *table-valued functions* (for filtering with ranking). + +## Setting up full-text search + +Before using full-text search, you must: + +1. **Create a full-text catalog** on your database +2. **Create a full-text index** on the columns you want to search + +This setup is done at the SQL Server level and is outside the scope of EF Core. For more information, see the [SQL Server full-text search documentation](/sql/relational-databases/search/get-started-with-full-text-search). + +## Full-text predicates + +EF Core supports the `FREETEXT()` and `CONTAINS()` predicates, which are used in `Where()` clauses to filter results. 
+ +### FREETEXT() + +`FREETEXT()` performs a less strict matching, searching for words based on their meaning, including inflectional forms (such as verb tenses and noun plurals): + +```csharp +var articles = await context.Articles + .Where(a => EF.Functions.FreeText(a.Contents, "veggies")) + .ToListAsync(); +``` + +This translates to: + +```sql +SELECT [a].[Id], [a].[Title], [a].[Contents] +FROM [Articles] AS [a] +WHERE FREETEXT([a].[Contents], N'veggies') +``` + +You can optionally specify a language term: + +```csharp +var articles = await context.Articles + .Where(a => EF.Functions.FreeText(a.Contents, "veggies", "English")) + .ToListAsync(); +``` + +### CONTAINS() + +`CONTAINS()` performs more precise matching and supports more sophisticated search criteria, including prefix terms, proximity search, and weighted terms: + +```csharp +// Simple search +var articles = await context.Articles + .Where(a => EF.Functions.Contains(a.Contents, "veggies")) + .ToListAsync(); + +// Prefix search (words starting with "vegg") +var articles = await context.Articles + .Where(a => EF.Functions.Contains(a.Contents, "\"vegg*\"")) + .ToListAsync(); + +// Phrase search +var articles = await context.Articles + .Where(a => EF.Functions.Contains(a.Contents, "\"fresh vegetables\"")) + .ToListAsync(); +``` + +This translates to: + +```sql +SELECT [a].[Id], [a].[Title], [a].[Contents] +FROM [Articles] AS [a] +WHERE CONTAINS([a].[Contents], N'veggies') +``` + +For more information on `CONTAINS()` query syntax, see the [SQL Server CONTAINS documentation](/sql/t-sql/queries/contains-transact-sql). + +## Full-text table-valued functions + +> [!NOTE] +> Full-text table-valued functions are being introduced in EF Core 11. + +While the predicates above are useful for filtering, they don't provide ranking information. 
SQL Server's table-valued functions [`FREETEXTTABLE()`](/sql/relational-databases/system-functions/freetexttable-transact-sql) and [`CONTAINSTABLE()`](/sql/relational-databases/system-functions/containstable-transact-sql) return both matching rows and a ranking score that indicates how well each row matches the search query. + +### FreeTextTable() + +`FreeTextTable()` is the table-valued function version of `FreeText()`. It returns `FullTextSearchResult`, which includes both the entity and the ranking value: + +```csharp +var results = await context.Articles + .Join( + context.Articles.FreeTextTable<Article, int>("veggies", topN: 10), + a => a.Id, + ftt => ftt.Key, + (a, ftt) => new { Article = a, ftt.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); + +foreach (var result in results) +{ + Console.WriteLine($"Article {result.Article.Id} with rank {result.Rank}"); +} +``` + +Note that you must provide the generic type parameters: `Article` is the entity type being searched, and `int` is the type of the full-text search key, which is specified when creating the index and returned by `FREETEXTTABLE()`. + +The above automatically searches across all columns registered for full-text searching and returns the top 10 matches. You can also provide a specific column to search: + +```csharp +var results = await context.Articles + .Join( + context.Articles.FreeTextTable<Article, int>(a => a.Contents, "veggies"), + a => a.Id, + ftt => ftt.Key, + (a, ftt) => new { Article = a, ftt.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); +``` + +... 
or multiple columns: + +```csharp +var results = await context.Articles + .FreeTextTable(a => new { a.Title, a.Contents }, "veggies") + .Select(r => new { Article = r.Value, Rank = r.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); +``` + +### ContainsTable() + +`ContainsTable()` is the table-valued function version of `Contains()`, supporting the same sophisticated search syntax while also providing ranking information: + +```csharp +var results = await context.Articles + .Join( + context.Articles.ContainsTable( "veggies OR fruits"), + a => a.Id, + ftt => ftt.Key, + (a, ftt) => new { Article = a, ftt.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); +``` + +### Limiting results + +Both table-valued functions support a `topN` parameter to limit the number of results: + +```csharp +var results = await context.Articles + .FreeTextTable(a => a.Contents, "veggies", topN: 10) + .Select(r => new { Article = r.Value, Rank = r.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); +``` + +### Specifying a language + +Both table-valued functions support specifying a language term for linguistic matching: + +```csharp +var results = await context.Articles + .FreeTextTable(a => a.Contents, "veggies", languageTerm: "English") + .Select(r => new { Article = r.Value, Rank = r.Rank }) + .ToListAsync(); +``` + +## When to use predicates vs table-valued functions + +Feature | Predicates (`FreeText()`, `Contains()`) | Table-valued functions (`FreeTextTable()`, `ContainsTable()`) +--------------------------------- | --------------------------------------- | ------------------------------------------------------------- +Provides ranking | ❌ No | ✅ Yes +Performance for large result sets | Better for filtering | Better for ranking and sorting +Combine with other entities | Via joins | Built-in entity result +Use in `Where()` clause | ✅ Yes | ❌ No (use as a source) + +Use predicates when you simply need to filter results based on full-text search criteria. 
Use table-valued functions when you need ranking information to order results by relevance or display relevance scores to users. diff --git a/entity-framework/core/providers/sql-server/functions.md b/entity-framework/core/providers/sql-server/functions.md index 0bd0543b58..5f5b801224 100644 --- a/entity-framework/core/providers/sql-server/functions.md +++ b/entity-framework/core/providers/sql-server/functions.md @@ -247,5 +247,6 @@ nullable.GetValueOrDefault(defaultValue) | COALESCE(@nullable, @defaultValue) ## See also * [Vector Search Function Mappings](xref:core/providers/sql-server/vector-search) +* [Full-Text Search Function Mappings](xref:core/providers/sql-server/full-text-search) * [Spatial Function Mappings](xref:core/providers/sql-server/spatial#spatial-function-mappings) * [HierarchyId Function Mappings](xref:core/providers/sql-server/hierarchyid#function-mappings) diff --git a/entity-framework/core/providers/sql-server/vector-search.md b/entity-framework/core/providers/sql-server/vector-search.md index 2cfae980d1..5c648aa749 100644 --- a/entity-framework/core/providers/sql-server/vector-search.md +++ b/entity-framework/core/providers/sql-server/vector-search.md @@ -7,12 +7,12 @@ uid: core/providers/sql-server/vector-search --- # Vector search in the SQL Server EF Core Provider -## Vector search - > [!NOTE] > Vector support was introduced in EF Core 10.0, and is only supported with SQL Server 2025 and above. -The SQL Server vector data type allows storing *embeddings*, which are representation of meaning that can be efficiently searched over for similarity, powering AI workloads such as semantic search and retrieval-augmented generation (RAG). +The SQL Server vector data type allows storing *embeddings*, which are representations of meaning that can be efficiently searched over for similarity, powering AI workloads such as semantic search and retrieval-augmented generation (RAG). 
+ +## Setting up vector properties To use the `vector` data type, simply add a .NET property of type `SqlVector` to your entity type, specifying the dimensions as follows: @@ -48,9 +48,9 @@ protected override void OnModelCreating(ModelBuilder modelBuilder) *** -Once your property is added and the corresponding column created in the database, you can start inserting embeddings. Embedding generation is done outside of the database, usually via a service, and the details for doing this are out of scope for this documentation. However, [the .NET Microsoft.Extensions.AI libraries](/dotnet/ai/microsoft-extensions-ai) contains [`IEmbeddingGenerator`](/dotnet/ai/microsoft-extensions-ai#create-embeddings), which is an abstraction over embedding generators that has implementations for the major providers. +Once your property is added and the corresponding column created in the database, you can start inserting embeddings. Embedding generation is done outside of the database, usually via a service, and the details for doing this are out of scope for this documentation. However, [the .NET Microsoft.Extensions.AI library](/dotnet/ai/microsoft-extensions-ai) contains [`IEmbeddingGenerator`](/dotnet/ai/microsoft-extensions-ai#create-embeddings), which is an abstraction over embedding generators that has implementations for the major providers. 
-Once you've chosen your embedding generator and set it up, use it to generate embeddings and insert them as follows +Once you've chosen your embedding generator and set it up, use it to generate embeddings and insert them as follows: ```c# IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator = /* Set up your preferred embedding generator */; @@ -64,17 +64,155 @@ context.Blogs.Add(new Blog await context.SaveChangesAsync(); ``` -Finally, use the [`EF.Functions.VectorDistance()`](/sql/t-sql/functions/vector-distance-transact-sql) function to perform similarity search for a given user query: +Once you have embeddings saved to your database, you're ready to perform vector similarity search over them. + +## Exact search with VECTOR_DISTANCE() + +The [`EF.Functions.VectorDistance()`](/sql/t-sql/functions/vector-distance-transact-sql) function computes the *exact* distance between two vectors. Use it to perform similarity search for a given user query: ```c# var sqlVector = new SqlVector<float>(await embeddingGenerator.GenerateVectorAsync("Some user query to be vectorized")); -var topSimilarBlogs = context.Blogs +var topSimilarBlogs = await context.Blogs .OrderBy(b => EF.Functions.VectorDistance("cosine", b.Embedding, sqlVector)) .Take(3) .ToListAsync(); ``` +This query computes the distance between the query vector and the embedding in every row of the table, then returns the closest matches. While the results are exact, the query can be slow for large datasets because SQL Server must scan all rows and compute a distance for each one. + > [!NOTE] > The built-in support in EF 10 replaces the previous [EFCore.SqlServer.VectorSearch](https://github.com/efcore/EFCore.SqlServer.VectorSearch) extension, which allowed performing vector search before the `vector` data type was introduced. As part of upgrading to EF 10, remove the extension from your projects. 
-> -The [`VECTOR_SEARCH()`](/sql/t-sql/functions/vector-search-transact-sql) function (in preview) for approximate search with DiskANN is currently unsupported. + +## Approximate search with VECTOR_SEARCH() + +> [!WARNING] +> `VECTOR_SEARCH()` and vector indexes are currently experimental features in SQL Server and are subject to change. The APIs in EF Core for these features are also subject to change. + +For large datasets, computing exact distances for every row can be prohibitively slow. SQL Server 2025 introduces support for *approximate* search through a [vector index](/sql/t-sql/statements/create-vector-index-transact-sql), which provides much better performance at the expense of returning items that are approximately similar - rather than exactly similar - to the query. + +### Vector indexes + +To use `VECTOR_SEARCH()`, you must create a vector index on your vector column. Use the `HasVectorIndex()` method in your model configuration: + +```csharp +protected override void OnModelCreating(ModelBuilder modelBuilder) +{ + modelBuilder.Entity<Blog>() + .HasVectorIndex(b => b.Embedding, "cosine"); +} +``` + +This generates the following SQL in the migration: + +```sql +CREATE VECTOR INDEX [IX_Blogs_Embedding] + ON [Blogs] ([Embedding]) + WITH (METRIC = COSINE) +``` + +The following distance metrics are supported for vector indexes: + +Metric | Description +----------- | ----------- +`cosine` | Cosine similarity (angular distance) +`euclidean` | Euclidean distance (L2 norm) +`dot` | Dot product (negative inner product) + +Choose the metric that best matches your embedding model and use case. Cosine similarity is commonly used for text embeddings, while Euclidean distance is often used for image embeddings. 
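To make the three metrics concrete, here is a minimal in-memory sketch of how each one is usually computed. It is independent of EF Core and SQL Server: the helper names and sample vectors are hypothetical, and it assumes the usual definitions in which `cosine` is reported as a distance (1 minus the cosine similarity) and `dot` as the negated dot product, so that a smaller value always means a closer match.

```csharp
using System;
using System.Linq;

// Hypothetical helpers for illustration only; not part of EF Core or SQL Server.
static double Dot(float[] a, float[] b) =>
    a.Zip(b, (x, y) => (double)x * y).Sum();

// Cosine distance: 1 - cosine similarity. 0 means the vectors point the same way.
static double CosineDistance(float[] a, float[] b) =>
    1.0 - Dot(a, b) / (Math.Sqrt(Dot(a, a)) * Math.Sqrt(Dot(b, b)));

// Euclidean (L2) distance: straight-line distance between the two points.
static double EuclideanDistance(float[] a, float[] b) =>
    Math.Sqrt(a.Zip(b, (x, y) => (double)(x - y) * (x - y)).Sum());

// Negated dot product: the more negative the value, the more similar the vectors.
static double NegativeDotProduct(float[] a, float[] b) => -Dot(a, b);

// Made-up sample vectors.
var query = new[] { 1f, 0f };
var sameDirection = new[] { 2f, 0f };
var orthogonal = new[] { 0f, 3f };

Console.WriteLine($"cosine:    {CosineDistance(query, sameDirection)} vs {CosineDistance(query, orthogonal)}");
Console.WriteLine($"euclidean: {EuclideanDistance(query, sameDirection)} vs {EuclideanDistance(query, orthogonal)}");
Console.WriteLine($"dot:       {NegativeDotProduct(query, sameDirection)} vs {NegativeDotProduct(query, orthogonal)}");
```

Note that cosine distance ignores vector length (the two parallel vectors are at distance 0 even though their magnitudes differ), whereas Euclidean distance and the dot product are both sensitive to magnitude.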
+ +### Searching with VECTOR_SEARCH() + +Once you have a vector index, use the `VectorSearch()` extension method on your `DbSet`: + +```csharp +var blogs = await context.Blogs + .VectorSearch(b => b.Embedding, "cosine", embedding, topN: 5) + .ToListAsync(); + +foreach (var (article, score) in blogs) +{ + Console.WriteLine($"Article {article.Id} with score {score}"); +} +``` + +This translates to the following SQL: + +```sql +SELECT [v].[Id], [v].[Embedding], [v].[Name] +FROM VECTOR_SEARCH([Blogs], 'Embedding', @__embedding, 'metric = cosine', @__topN) +``` + +The `topN` parameter specifies the maximum number of results to return. + +`VectorSearch()` returns `VectorSearchResult`, which allows you to access both the entity and the computed distance: + +```csharp +var searchResults = await context.Blogs + .VectorSearch(b => b.Embedding, "cosine", embedding, topN: 5) + .Where(r => r.Distance < 0.05) + .Select(r => new { Blog = r.Value, Distance = r.Distance }) + .ToListAsync(); +``` + +This allows you to filter on the similarity score, present it to users, etc. + +## Hybrid search + +*Hybrid search* combines vector similarity search with traditional [full-text search](xref:core/providers/sql-server/full-text-search) to deliver more relevant results. Vector search excels at finding semantically similar content, while full-text search is better at exact keyword matching. By combining both approaches and using Reciprocal Rank Fusion (RRF) to merge the results, you can build more intelligent search experiences. 
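In its usual formulation, Reciprocal Rank Fusion gives each item a score of 1 / (k + rank) for every result list it appears in, where rank is the item's 1-based position in that list and k is a smoothing constant, and then sums those scores. The following is a minimal in-memory sketch of that arithmetic, independent of EF Core; the `FuseRanks` helper, the default of `k = 60`, and the sample IDs are all hypothetical and only serve as an illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: classic RRF over any number of ranked ID lists.
// Each list is ordered best-first; positions are treated as 1-based ranks.
static Dictionary<int, double> FuseRanks(IEnumerable<IReadOnlyList<int>> rankedLists, double k = 60.0)
{
    var scores = new Dictionary<int, double>();
    foreach (var list in rankedLists)
    {
        for (var i = 0; i < list.Count; i++)
        {
            var id = list[i];
            scores.TryGetValue(id, out var current); // current is 0 if the ID is new
            scores[id] = current + 1.0 / (k + i + 1);
        }
    }

    return scores;
}

// Made-up article IDs as returned by each search, best match first.
var fullTextIds = new List<int> { 3, 1, 7 };
var vectorIds = new List<int> { 1, 9, 3 };

var fused = FuseRanks(new IReadOnlyList<int>[] { fullTextIds, vectorIds });

foreach (var pair in fused.OrderByDescending(p => p.Value))
{
    Console.WriteLine($"Article {pair.Key}: {pair.Value:F4}");
}
```

Items that appear in both lists accumulate two contributions and therefore tend to rise to the top, while an item found by only one search still receives a score from that list.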
+ +The following example shows how to implement hybrid search using EF Core, combining `FreeTextTable()` and `VectorSearch()` in a single query: + +```csharp +string textualQuery = ...; +SqlVector<float> queryEmbedding = ...; + +// RRF smoothing constant; 10 matches the generated SQL +const int k = 10; + +var results = await context.Articles + // Perform full-text search + .FreeTextTable<Article, int>(textualQuery, topN: 20) + // Perform vector (semantic) search, joining the results of both searches together + .LeftJoin( + context.Articles.VectorSearch(b => b.Embedding, queryEmbedding, "cosine", topN: 20), + fts => fts.Key, + vs => vs.Value.Id, + (fts, vs) => new + { + Article = vs.Value, + FullTextRank = fts.Rank, + VectorDistance = (double?)vs.Distance + }) + // Apply Reciprocal Rank Fusion (RRF) to combine the results + .Select(x => new + { + x.Article, + RrfScore = (1.0 / (k + x.FullTextRank)) + (1.0 / (k + x.VectorDistance) ?? 0.0) + }) + .OrderByDescending(x => x.RrfScore) + .Take(10) + .Select(x => x.Article) + .ToListAsync(); +``` + +This query: + +1. Performs a full-text search on `Article`. +2. Performs a vector search on `Article` and joins its results to the full-text search results via a LEFT JOIN. +3. Calculates the RRF score by combining the full-text ranking and the semantic ranking. +4. Orders by RRF score, takes the desired number of results, and projects out the original `Article` entities. + +> [!NOTE] +> Rather than using a LEFT JOIN, a FULL OUTER JOIN would be more suitable for this scenario; this would allow high-ranking results from either search to be included in the final result, even if they do not appear at all on the other side. With the above LEFT JOIN approach, a result with a very high vector similarity score is never included in the final result unless it also appears in the full-text search results. However, EF doesn't currently support FULL OUTER JOIN; upvote [#37633](https://github.com/dotnet/efcore/issues/37633) if this is something you'd like to see supported. 
+ +The query produces the following SQL: + +```sql +SELECT TOP(@p3) [a0].[Id], [a0].[Content], [a0].[Embedding], [a0].[Title] +FROM FREETEXTTABLE([Articles], *, @p, @p1) AS [f] +LEFT JOIN VECTOR_SEARCH( + TABLE = [Articles] AS [a0], + COLUMN = [Embedding], + SIMILAR_TO = @p2, + METRIC = 'cosine', + TOP_N = @p3 +) AS [v] ON [f].[KEY] = [a0].[Id] +ORDER BY 1.0E0 / CAST(10 + [f].[RANK] AS float) + ISNULL(1.0E0 / (10.0E0 + [v].[Distance]), 0.0E0) DESC +``` diff --git a/entity-framework/core/what-is-new/ef-core-11.0/whatsnew.md b/entity-framework/core/what-is-new/ef-core-11.0/whatsnew.md index d3c56b5829..10a60a8bfc 100644 --- a/entity-framework/core/what-is-new/ef-core-11.0/whatsnew.md +++ b/entity-framework/core/what-is-new/ef-core-11.0/whatsnew.md @@ -167,3 +167,66 @@ In PowerShell, use the `-Add` parameter: ```powershell Update-Database -Migration InitialCreate -Add ``` + +## SQL Server + + + +### VECTOR_SEARCH() and vector indexes + +> [!WARNING] +> `VECTOR_SEARCH()` and vector indexes are currently experimental features in SQL Server and are subject to change. The APIs in EF Core for these features are also subject to change. + +In EF Core 10, we introduced translation for `EF.Functions.VectorDistance()`, which is a scalar function that computes the distance between two vectors. This function can be used in LINQ queries for vector similarity search, allowing you to find the most similar embeddings to a given embedding. However, `VectorDistance()` computes an _exact_ distance between the given vectors. + +When querying large datasets, SQL Server 2025 also supports performing _approximate_ search over a [vector index](/sql/t-sql/statements/create-vector-index-transact-sql), which provides much better performance at the expense of returning items that are approximately similar - rather than exactly similar - to the query. 
EF 11 now supports creating vector indexes through migrations: + +```csharp +protected override void OnModelCreating(ModelBuilder modelBuilder) +{ + modelBuilder.Entity<Blog>() + .HasVectorIndex(b => b.Embedding, "cosine"); +} +``` + +Once you have a vector index, you can use the `VectorSearch()` extension method on your `DbSet` to perform an approximate search: + +```csharp +var blogs = await context.Blogs + .VectorSearch(b => b.Embedding, "cosine", embedding, topN: 5) + .ToListAsync(); +``` + +This translates to the SQL Server [`VECTOR_SEARCH()`](/sql/t-sql/functions/vector-search-transact-sql) table-valued function, which performs an approximate search over the vector index. The `topN` parameter specifies the maximum number of results to return. + +`VectorSearch()` returns `VectorSearchResult`, allowing you to access the distance alongside the entity. + +For more information, see the [full documentation on vector search](xref:core/providers/sql-server/vector-search). + + + +### Full-text search table-valued functions + +EF Core has long provided support for SQL Server's full-text search predicates `FREETEXT()` and `CONTAINS()`, via `EF.Functions.FreeText()` and `EF.Functions.Contains()`. These predicates can be used in LINQ `Where()` clauses to filter results based on search criteria. + +However, SQL Server also has table-valued versions of these predicates, [`FREETEXTTABLE()`](/sql/relational-databases/system-functions/freetexttable-transact-sql) and [`CONTAINSTABLE()`](/sql/relational-databases/system-functions/containstable-transact-sql), which return a ranking score along with the results, providing additional flexibility over the predicate versions. 
EF 11 now supports these table-valued functions: + +```csharp +// Using FreeTextTable with a search query +var results = await context.Blogs + .FreeTextTable(b => b.FullName, "John") + .Select(r => new { Blog = r.Value, Rank = r.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); + +// Using ContainsTable with a search query +var results = await context.Blogs + .ContainsTable(b => b.FullName, "John") + .Select(r => new { Blog = r.Value, Rank = r.Rank }) + .OrderByDescending(r => r.Rank) + .ToListAsync(); +``` + +Both methods return `FullTextSearchResult`, giving you access to both the entity and the ranking value from SQL Server's full-text engine. This allows for more sophisticated result ordering and filtering based on relevance scores. + +For more information, see the [full documentation on full-text search](xref:core/providers/sql-server/full-text-search). diff --git a/entity-framework/toc.yml b/entity-framework/toc.yml index 560304cc47..c52a89e4ce 100644 --- a/entity-framework/toc.yml +++ b/entity-framework/toc.yml @@ -412,6 +412,8 @@ href: core/providers/sql-server/indexes.md - name: Value generation href: core/providers/sql-server/value-generation.md + - name: Full-text search + href: core/providers/sql-server/full-text-search.md - name: Vector search href: core/providers/sql-server/vector-search.md - name: Temporal tables