[python] Support query auth (row filter & column masking) for REST catalog #8136
MgjLLL wants to merge 1 commit into
Adds query-auth support to the Python client so it honors the row-level
filter and column masking rules returned by a REST catalog, matching the
existing JVM client behavior.
When the new option `query-auth.enabled` is set to true, the client
calls `POST /v1/.../databases/{db}/tables/{tb}/auth` before producing a
plan, receives `{ filter, columnMasking }`, and applies them on the
read path:
* `predicate_json_parser` parses Paimon predicate JSON into a
PyArrow compute filter (EQ/NEQ/LT/LTEQ/GT/GTEQ/IS_NULL/IS_NOT_NULL/
IN/NOT_IN/STARTS_WITH/ENDS_WITH/CONTAINS/AND/OR/NOT).
* `AuthFilterReader` / `AuthMaskingReader` / `ColumnProjectReader`
perform row filtering, column masking transforms (NULL, FIELD_REF,
CAST, UPPER, LOWER, CONCAT, CONCAT_WS) and final projection back to
the user's requested columns.
* `TableQueryAuth` / `TableQueryAuthResult` wrap the result and
convert each split to a `QueryAuthSplit`.
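The order of the read-path steps listed above (row filter, then column masking, then projection) can be sketched in plain Python. This is only an illustration over dict rows, not the PR's reader classes: the helper names `auth_filter`, `auth_mask`, `project`, and `read_with_auth` are hypothetical, and evaluating each mask against the unmasked row is an assumption.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def auth_filter(rows: Iterable[Row], keep: Callable[[Row], bool]) -> Iterator[Row]:
    """Row filtering step: drop rows the auth filter rejects."""
    return (row for row in rows if keep(row))


def auth_mask(rows: Iterable[Row], masks: Dict[str, Callable[[Row], Any]]) -> Iterator[Row]:
    """Column masking step: replace each masked column with its transform's result."""
    for row in rows:
        masked = dict(row)
        for column, transform in masks.items():
            # Assumption: transforms see the original (unmasked) row.
            masked[column] = transform(row)
        yield masked


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Final projection back to the columns the user asked for."""
    return ({c: row[c] for c in columns} for row in rows)


def read_with_auth(rows, keep, masks, requested) -> List[Row]:
    # Filter first, so the filter can use columns that are later projected away.
    return list(project(auth_mask(auth_filter(rows, keep), masks), requested))


rows = [
    {"id": 1, "name": "alice", "region": "eu"},
    {"id": 2, "name": "bob", "region": "us"},
]
result = read_with_auth(
    rows,
    keep=lambda r: r["region"] == "eu",           # filter on a column the user did not request
    masks={"name": lambda r: r["name"].upper()},  # an UPPER-style transform
    requested=["id", "name"],
)
print(result)  # [{'id': 1, 'name': 'ALICE'}]
```

Projecting last is what lets the filter reference `region` even though the caller only asked for `id` and `name`.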
Behavior is gated by `CoreOptions.QUERY_AUTH_ENABLED` (default false),
so existing users see no change.
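To make the predicate handling more concrete, here is a minimal sketch of walking a predicate tree. Two assumptions are worth stating loudly: the JSON shape (`kind`, `field`, `literal`, `children`) is hypothetical and may not match Paimon's actual predicate JSON, and the sketch evaluates the tree directly against a dict row instead of building a PyArrow compute expression as the PR does. Only a subset of the listed predicate kinds is covered.

```python
import json
from typing import Any, Dict

# Leaf comparisons. Simplification: a null value never matches a comparison
# (two-valued logic, not SQL three-valued logic).
LEAF_OPS = {
    "EQ": lambda v, lit: v is not None and v == lit,
    "NEQ": lambda v, lit: v is not None and v != lit,
    "LT": lambda v, lit: v is not None and v < lit,
    "GT": lambda v, lit: v is not None and v > lit,
    "IN": lambda v, lits: v is not None and v in lits,
    "STARTS_WITH": lambda v, lit: v is not None and v.startswith(lit),
}


def evaluate(node: Dict[str, Any], row: Dict[str, Any]) -> bool:
    """Recursively evaluate a (hypothetical) predicate tree against one row."""
    kind = node["kind"]
    if kind == "AND":
        return all(evaluate(child, row) for child in node["children"])
    if kind == "OR":
        return any(evaluate(child, row) for child in node["children"])
    if kind == "NOT":
        return not evaluate(node["children"][0], row)
    value = row.get(node["field"])
    if kind == "IS_NULL":
        return value is None
    if kind == "IS_NOT_NULL":
        return value is not None
    return LEAF_OPS[kind](value, node["literal"])


# Hypothetical predicate: region = 'eu' AND NOT (name STARTS_WITH 'tmp_')
predicate = json.loads(
    '{"kind": "AND", "children": ['
    '{"kind": "EQ", "field": "region", "literal": "eu"},'
    '{"kind": "NOT", "children": ['
    '{"kind": "STARTS_WITH", "field": "name", "literal": "tmp_"}]}]}'
)

print(evaluate(predicate, {"name": "alice", "region": "eu"}))  # True
print(evaluate(predicate, {"name": "tmp_x", "region": "eu"}))  # False
```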
I found a few correctness issues in the query-auth paths introduced here:
Purpose
Adds query-auth support to the Python client so it honors the row-level filter and column masking rules returned by a REST catalog, matching the existing JVM client behavior.
When the new option `query-auth.enabled` is set to `true`, before producing a `Plan` the client calls `POST /v1/.../databases/{db}/tables/{tb}/auth` with the projected fields, receives `{ filter, columnMasking }`, and applies them on the read path:
* `RESTApi.auth_table_query` issues the call (new request/response models `AuthTableQueryRequest` / `AuthTableQueryResponse`, new path in `ResourcePaths.auth_table`).
* `TableQueryAuth` / `TableQueryAuthResult` (`catalog/table_query_auth.py`) wrap the result and convert each split to a `QueryAuthSplit`.
* `predicate_json_parser` (`common/predicate_json_parser.py`) parses Paimon predicate JSON into a PyArrow compute filter (EQ/NEQ/LT/LTEQ/GT/GTEQ/IS_NULL/IS_NOT_NULL/IN/NOT_IN/STARTS_WITH/ENDS_WITH/CONTAINS/AND/OR/NOT).
* `AuthFilterReader` / `AuthMaskingReader` / `ColumnProjectReader` (`read/reader/auth_masking_reader.py`) implement row filtering, column masking transforms (`NULL`, `FIELD_REF`, `CAST`, `UPPER`, `LOWER`, `CONCAT`, `CONCAT_WS`) and final projection back to the user's requested columns.
* `read_builder` / `stream_read_builder` / `table_read` / `table_scan` / `file_store_table` / `catalog_environment` / `rest_catalog` are wired to invoke the auth call and pull extra fields required only by the auth filter.
Behavior is gated by the new `CoreOptions.QUERY_AUTH_ENABLED` (`query-auth.enabled`, default `false`), so existing users see no change.
Tests
Three new test files (994+ lines, all passing locally under `pytest`):
* `paimon-python/pypaimon/tests/predicate_json_parser_test.py`: covers each predicate kind, nested AND/OR/NOT, type coercion, null handling, and `extract_referenced_fields`.
* `paimon-python/pypaimon/tests/auth_masking_reader_test.py`: covers each masking transform, missing-field validation, and projection back to the user-requested columns.
* `paimon-python/pypaimon/tests/table_query_auth_test.py`: end-to-end coverage. The REST catalog calls `auth_table_query`, the result is plumbed into the plan, splits become `QueryAuthSplit`, and reads return filtered and masked rows.
Local check:
API and Format
* `query-auth.enabled` (boolean, default `false`).
* `POST /v1/{prefix}/databases/{db}/tables/{tb}/auth`. Request `{ "select": [...] }`, response `{ "filter": [<predicate-json>...], "columnMasking": { <col>: <transform-json>, ... } }`. The contract follows the existing Java client; no server-side change is required for catalogs that already implement query auth.
* The new classes (`AuthTableQueryRequest`, `AuthTableQueryResponse`, `TableQueryAuth`, `TableQueryAuthResult`, `QueryAuthSplit`, `AuthFilterReader`, `AuthMaskingReader`, `ColumnProjectReader`) are additive and live under existing modules.
Documentation
The new option `query-auth.enabled` should be reflected in the Python configuration reference. Happy to add the docs entry in this PR or in a follow-up; please advise.
This closes #8135
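For reviewers who want the request/response contract from "API and Format" in concrete form, a rough sketch follows. The helpers `auth_path`, `build_auth_request`, and `parse_auth_response` are hypothetical and are not the PR's `AuthTableQueryRequest` / `AuthTableQueryResponse` models. URL-quoting the names and treating missing response keys as "no filter, no masking" are assumptions as well.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import quote


def auth_path(prefix: str, database: str, table: str) -> str:
    """Build the auth endpoint path (hypothetical helper)."""
    db = quote(database, safe="")
    tb = quote(table, safe="")
    return f"/v1/{prefix}/databases/{db}/tables/{tb}/auth"


def build_auth_request(select: List[str]) -> str:
    """Request body: the projected fields under the "select" key."""
    return json.dumps({"select": list(select)})


def parse_auth_response(body: str) -> Tuple[List[Any], Dict[str, Any]]:
    """Split a response body into (filter, columnMasking).

    Assumption: absent or null keys mean no filter and no masking.
    """
    payload = json.loads(body)
    return payload.get("filter") or [], payload.get("columnMasking") or {}


print(auth_path("my_prefix", "db1", "orders"))
# /v1/my_prefix/databases/db1/tables/orders/auth
print(build_auth_request(["id", "name"]))
# {"select": ["id", "name"]}
```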