feat: Making feast vector store with open ai search api compatible #6121

Open
patelchaitany wants to merge 5 commits into feast-dev:master from patelchaitany:enh/openai-compatibel-store-api
Conversation

@patelchaitany

@patelchaitany patelchaitany commented Mar 17, 2026

What this PR does / why we need it:

This PR makes the Feast vector store API compatible with the OpenAI search API.

The changes add a new REST API endpoint that is compatible with the OpenAI search API. The endpoint also accepts two extra fields in the metadata: features_to_retrieve (selects which specific features to retrieve) and content_field (names the field that holds the main content or document chunk).

What is still missing:

 1. Only a plain query string is supported; an embedding model is still needed to embed the query string.
 2. If needed, feature services could also be supported.
 3. Support for the filters parameter (requires changes to the vector store retrieval query).

Which issue(s) this PR fixes:

#5615

Misc


@ntkathole ntkathole changed the title feat: making feast vector store with open ai search api compatible feat: Making feast vector store with open ai search api compatible Mar 17, 2026
@patelchaitany patelchaitany force-pushed the enh/openai-compatibel-store-api branch 4 times, most recently from e45f167 to c8392a9 Compare March 23, 2026 11:17
@patelchaitany patelchaitany changed the title feat: Making feast vector store with open ai search api compatible feat: Making feast vector store with open ai search api compatible Mar 23, 2026
@patelchaitany patelchaitany force-pushed the enh/openai-compatibel-store-api branch 4 times, most recently from 974d688 to 639a87e Compare March 24, 2026 10:06
Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>
Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>
@patelchaitany patelchaitany force-pushed the enh/openai-compatibel-store-api branch from 7e8adfb to 3f541ad Compare March 24, 2026 11:29
@patelchaitany patelchaitany marked this pull request as ready for review March 24, 2026 11:29
@patelchaitany patelchaitany requested review from a team as code owners March 24, 2026 11:29
@patelchaitany patelchaitany requested review from HaoXuAI, nquinn408 and redhatHameed and removed request for a team March 24, 2026 11:29

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 3 potential issues.

View 6 additional findings in Devin Review.

Comment on lines +351 to +354
if requested_features is None:
    requested_features = []
if "distance" not in requested_features:
    requested_features.append("distance")

🔴 RemoteOnlineStore mutates caller's requested_features list, causing duplicate 'distance' in response

In RemoteOnlineStore.retrieve_online_documents_v2, lines 351-354 mutate the requested_features parameter in-place by appending "distance". This list is the same object passed by reference from feature_store.py:_retrieve_from_online_store_v2 (line 3066). After the remote store call returns, feature_store.py:3089 builds features_to_request = requested_features + ["distance"], which now produces a list with "distance" appearing twice (since it was already appended). This causes _populate_response_from_feature_data at feature_store.py:3121-3128 to add a duplicate "distance" entry in the response metadata and results, producing a malformed OnlineResponse when using the remote online store.

Suggested change
- if requested_features is None:
-     requested_features = []
- if "distance" not in requested_features:
-     requested_features.append("distance")
+ if requested_features is None:
+     requested_features = []
+ requested_features = list(requested_features)
+ if "distance" not in requested_features:
+     requested_features.append("distance")
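
The aliasing problem described in this finding can be reproduced outside Feast. The following is a minimal sketch, not the PR's code: `retrieve_mutating` and `retrieve_copying` are hypothetical stand-ins for the remote store method before and after the suggested change, and the final concatenation mimics the `requested_features + ["distance"]` step the review attributes to feature_store.py.

```python
from typing import List, Optional


def retrieve_mutating(requested_features: Optional[List[str]]) -> List[str]:
    """Hypothetical stand-in: appends to the caller's list in place."""
    if requested_features is None:
        requested_features = []
    if "distance" not in requested_features:
        requested_features.append("distance")
    return requested_features


def retrieve_copying(requested_features: Optional[List[str]]) -> List[str]:
    """Hypothetical stand-in: copies the list first, as the suggestion does."""
    if requested_features is None:
        requested_features = []
    requested_features = list(requested_features)
    if "distance" not in requested_features:
        requested_features.append("distance")
    return requested_features


# Caller side: the same list object is reused after the store call returns.
caller_features = ["title", "body"]
retrieve_mutating(caller_features)
buggy = caller_features + ["distance"]

caller_features_fixed = ["title", "body"]
retrieve_copying(caller_features_fixed)
fixed = caller_features_fixed + ["distance"]

print(buggy)  # ['title', 'body', 'distance', 'distance']
print(fixed)  # ['title', 'body', 'distance']
```

Copying at the top of the callee keeps the fix local to the store implementation, so callers do not need to know whether a given store mutates its arguments.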
Contributor Author


requested_features is modified after the response is received, because the RemoteOnlineStore response does not include the distance field.

Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>
Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 13 additional findings in Devin Review.

Comment on lines +446 to +463
if f.type == "eq":
    return {"term": {field: fmt_val}}
elif f.type == "ne":
    return {"bool": {"must_not": [{"term": {field: fmt_val}}]}}
elif f.type in ("gt", "gte", "lt", "lte"):
    return {"range": {field: {f.type: fmt_val}}}
elif f.type == "in":
    if not isinstance(f.value, list):
        raise ValueError(
            f"'in' filter requires a list value, got {type(f.value)}"
        )
    return {"terms": {field: fmt_list}}
elif f.type == "nin":
    if not isinstance(f.value, list):
        raise ValueError(
            f"'nin' filter requires a list value, got {type(f.value)}"
        )
    return {"bool": {"must_not": [{"terms": {field: fmt_list}}]}}

🔴 Elasticsearch term query on analyzed text field causes string filters to silently fail

The new _translate_comparison_filter method generates term/terms queries against the value_text field (e.g. {"term": {"category.value_text": "Category-0"}}). However, value_text is mapped as "type": "text" in the ES index (sdk/python/feast/infra/online_stores/elasticsearch_online_store/elasticsearch.py:260). ES text fields are analyzed (lowercased, tokenized on whitespace/punctuation), but term queries match against raw unanalyzed tokens. This means an eq filter for "Category-0" would look for the exact token "Category-0", but the indexed tokens are ["category", "0"] after analysis — so the filter silently returns no matches. This affects eq, ne, in, and nin operators for any string value that is capitalized, hyphenated, or multi-word.
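
The mismatch can be illustrated without a running cluster. The sketch below is only a rough approximation: `approx_standard_analyzer` is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters, which covers the "Category-0" case from the finding but does not reproduce Elasticsearch's real analysis chain.

```python
import re
from typing import List, Set


def approx_standard_analyzer(text: str) -> List[str]:
    """Rough approximation of an analyzed text field: lowercase, then split
    on anything that is not an ASCII letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def term_match(indexed_tokens: Set[str], query_value: str) -> bool:
    """A term query compares the query value as-is, with no analysis."""
    return query_value in indexed_tokens


# Tokens stored for an analyzed text field vs. a keyword field.
text_tokens = set(approx_standard_analyzer("Category-0"))
keyword_tokens = {"Category-0"}  # keyword fields keep the whole value

print(sorted(text_tokens))                        # ['0', 'category']
print(term_match(text_tokens, "Category-0"))      # False
print(term_match(keyword_tokens, "Category-0"))   # True
```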

Fix: add a keyword sub-field to value_text mapping

Change the value_text mapping from {"type": "text"} to {"type": "text", "fields": {"keyword": {"type": "keyword"}}}, and update _translate_comparison_filter to use f"{f.key}.value_text.keyword" for term/terms queries instead of f"{f.key}.value_text". This preserves full-text search on the analyzed field while enabling exact matching via the keyword sub-field.

Prompt for agents
Two changes are needed to fix the Elasticsearch string filter bug:

1. In sdk/python/feast/infra/online_stores/elasticsearch_online_store/elasticsearch.py, in the create_index method (around line 258-260), change the value_text mapping from:
   "value_text": {"type": "text"}
to:
   "value_text": {"type": "text", "fields": {"keyword": {"type": "keyword"}}}

2. In the same file, in the _translate_comparison_filter method (around lines 438-464), when has_value_num is False (i.e., the text branch), change the field from:
   field = f"{f.key}.value_text"
to:
   field = f"{f.key}.value_text.keyword"

This ensures that exact-match filters (eq, ne, in, nin) use the non-analyzed keyword sub-field for reliable matching, while the analyzed text field remains available for full-text search queries.
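
The two changes above might look roughly like the following. This is a simplified sketch under stated assumptions, not the PR's implementation: `ComparisonFilter` and `translate_text_filter` are hypothetical names, only the text branch and the four exact-match operators are covered, and the value formatting helpers (`fmt_val`, `fmt_list`) from the original method are omitted.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Mapping for value_text with a keyword sub-field, as proposed in step 1.
VALUE_TEXT_MAPPING: Dict[str, Any] = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword"}},
}


@dataclass
class ComparisonFilter:
    """Hypothetical stand-in for the filter object the method receives."""
    type: str
    key: str
    value: Any


def translate_text_filter(f: ComparisonFilter) -> Dict[str, Any]:
    """Translate an exact-match filter on a string value into a query dict
    that targets the non-analyzed keyword sub-field (step 2)."""
    field = f"{f.key}.value_text.keyword"
    if f.type == "eq":
        return {"term": {field: f.value}}
    if f.type == "ne":
        return {"bool": {"must_not": [{"term": {field: f.value}}]}}
    if f.type in ("in", "nin"):
        if not isinstance(f.value, list):
            raise ValueError(
                f"'{f.type}' filter requires a list value, got {type(f.value)}"
            )
        terms = {"terms": {field: f.value}}
        return terms if f.type == "in" else {"bool": {"must_not": [terms]}}
    raise ValueError(f"unsupported filter type in this sketch: {f.type}")


eq_query = translate_text_filter(ComparisonFilter("eq", "category", "Category-0"))
print(eq_query)  # {'term': {'category.value_text.keyword': 'Category-0'}}
```

Note that a mapping change of this kind applies to newly created indices; documents already indexed under the old mapping would likely need to be reindexed before the keyword sub-field is populated.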