Wire JSON ingestion schema extension modules #1215
Open
jwils
wants to merge
1
commit into
joshuaw/json-ingestion-extension-modules
from
joshuaw/json-ingestion-api-polish
155 changes: 155 additions & 0 deletions
...cgraph-json_ingestion/lib/elastic_graph/json_ingestion/schema_definition/api_extension.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,155 @@ | ||
| # Copyright 2024 - 2026 Block, Inc. | ||
| # | ||
| # Use of this source code is governed by an MIT-style | ||
| # license that can be found in the LICENSE file or at | ||
| # https://opensource.org/licenses/MIT. | ||
| # | ||
| # frozen_string_literal: true | ||
|
|
||
| require "elastic_graph/constants" | ||
| require "elastic_graph/json_ingestion/schema_definition/factory_extension" | ||
| require "elastic_graph/json_ingestion/schema_definition/state_extension" | ||
|
|
||
| module ElasticGraph | ||
| module JSONIngestion | ||
| # Namespace for JSON Schema support in schema definitions. | ||
| # | ||
| # {SchemaDefinition::APIExtension} is the primary entry point and should be used as a schema definition extension module. | ||
| module SchemaDefinition | ||
| # Module designed to be extended onto an {ElasticGraph::SchemaDefinition::API} instance | ||
| # to add JSON Schema ingestion serializer capabilities. | ||
| module APIExtension | ||
| # Wires up the JSON ingestion extensions when this module is extended onto an API instance. | ||
| # | ||
| # @param api [ElasticGraph::SchemaDefinition::API] the API instance to extend | ||
| # @return [void] | ||
| # @api private | ||
| def self.extended(api) | ||
| api.state.extend(StateExtension) | ||
| api.factory.extend(FactoryExtension) | ||
|
|
||
| api.on_built_in_types do |type| | ||
| if type.name == api.state.type_ref("GeoLocation").to_final_form.name | ||
| # @type var geo_location_type: ElasticGraph::SchemaDefinition::SchemaElements::TypeWithSubfields & SchemaElements::TypeWithSubfieldsExtension | ||
| geo_location_type = _ = type | ||
| names = api.state.schema_elements | ||
|
|
||
| # We use `nullable: false` because `GeoLocation` is indexed as a single `geo_point` field, | ||
| # and therefore can't support a `latitude` without a `longitude` or vice-versa. | ||
| latitude = geo_location_type.graphql_fields_by_name.fetch(names.latitude) # : ElasticGraph::SchemaDefinition::SchemaElements::Field & SchemaElements::FieldExtension | ||
| longitude = geo_location_type.graphql_fields_by_name.fetch(names.longitude) # : ElasticGraph::SchemaDefinition::SchemaElements::Field & SchemaElements::FieldExtension | ||
| latitude.json_schema minimum: -90, maximum: 90, nullable: false | ||
| longitude.json_schema minimum: -180, maximum: 180, nullable: false | ||
| end | ||
| end | ||
| end | ||
|
|
||
| # Defines the version number of the current JSON schema. Importantly, every time a change is made that impacts the JSON schema | ||
| # artifact, the version number must be incremented to ensure that each different version of the JSON schema is identified by a unique | ||
| # version number. The publisher will then include this version number in published events to identify the version of the schema it | ||
| # was using. This avoids the need to deploy the publisher and ElasticGraph indexer at the same time to keep them in sync. | ||
| # | ||
| # @note While this is an important part of how ElasticGraph is designed to support schema evolution, it can be annoying to constantly | ||
| # have to increment this while rapidly changing the schema during prototyping. You can disable the requirement to increment this | ||
| # on every JSON schema change with {#enforce_json_schema_version}. | ||
| # | ||
| # @param version [Integer] current version number of the JSON schema artifact | ||
| # @return [void] | ||
| # @see #enforce_json_schema_version | ||
| # | ||
| # @example Set the JSON schema version to 1 | ||
| # ElasticGraph.define_schema do |schema| | ||
| # schema.json_schema_version 1 | ||
| # end | ||
| def json_schema_version(version) | ||
| state = json_ingestion_state | ||
|
|
||
| if !version.is_a?(Integer) || version < 1 | ||
| raise Errors::SchemaError, "`json_schema_version` must be a positive integer. Specified version: #{version}" | ||
| end | ||
|
|
||
| if state.json_schema_version | ||
| raise Errors::SchemaError, "`json_schema_version` can only be set once on a schema. Previously-set version: #{state.json_schema_version}" | ||
| end | ||
|
|
||
| state.json_schema_version = version | ||
| state.json_schema_version_setter_location = caller_locations(1, 1).to_a.first | ||
| nil | ||
| end | ||
|
|
||
| # Configures whether JSON schema artifact dumping enforces the requirement that the JSON schema version is incremented every time | ||
| # dumping the JSON schemas results in a changed artifact. Defaults to `true`. | ||
| # | ||
| # @note Generally speaking, you will want this to be `true` for any ElasticGraph application that is in | ||
| # production as the versioning of JSON schemas is what supports safe schema evolution as it allows | ||
| # ElasticGraph to identify which version of the JSON schema the publishing system was operating on | ||
| # when it published an event. | ||
| # | ||
| # It can be useful to set it to `false` before your application is in production, as you do not want | ||
| # to be forced to bump the version after every single schema change while you are building an initial | ||
| # prototype. | ||
| # | ||
| # @param value [Boolean] whether to require `json_schema_version` to be incremented on changes that impact `json_schemas.yaml` | ||
| # @return [void] | ||
| # @see #json_schema_version | ||
| # | ||
| # @example Disable enforcement during initial prototyping | ||
| # ElasticGraph.define_schema do |schema| | ||
| # # TODO: remove this once we're past the prototyping stage | ||
| # schema.enforce_json_schema_version false | ||
| # end | ||
| def enforce_json_schema_version(value) | ||
| unless value == true || value == false | ||
| raise Errors::SchemaError, "`enforce_json_schema_version` must be a boolean. Specified value: #{value.inspect}" | ||
| end | ||
|
|
||
| json_ingestion_state.enforce_json_schema_version = value | ||
| nil | ||
| end | ||
|
|
||
| # Defines strictness of the JSON schema validation. By default, the JSON schema will require all fields to be provided by the | ||
| # publisher (but they can be nullable) and will ignore extra fields that are not defined in the schema. Use this method to | ||
| # configure this behavior. | ||
| # | ||
| # @param allow_omitted_fields [Boolean] Whether nullable fields can be omitted from indexing events. | ||
| # @param allow_extra_fields [Boolean] Whether extra fields (i.e. fields beyond those defined in the schema) can be included in indexing events. | ||
| # @return [void] | ||
| # | ||
| # @note If you allow both omitted fields and extra fields, ElasticGraph's JSON schema validation will allow (and ignore) misspelled | ||
| # field names in indexing events. For example, if the ElasticGraph schema has a nullable field named `parentId` but the publisher | ||
| # accidentally provides it as `parent_id`, ElasticGraph would happily ignore the `parent_id` field entirely, because `parentId` | ||
| # is allowed to be omitted and `parent_id` would be treated as an extra field. Therefore, we recommend that you set at most one of | ||
| # these to `true`. | ||
| # | ||
| # @example Allow omitted fields and disallow extra fields | ||
| # ElasticGraph.define_schema do |schema| | ||
| # schema.json_schema_strictness allow_omitted_fields: true, allow_extra_fields: false | ||
| # end | ||
| def json_schema_strictness(allow_omitted_fields: false, allow_extra_fields: true) | ||
|
|
||
| state = json_ingestion_state | ||
|
|
||
| unless [true, false].include?(allow_omitted_fields) | ||
| raise Errors::SchemaError, "`allow_omitted_fields` must be true or false" | ||
| end | ||
|
|
||
| unless [true, false].include?(allow_extra_fields) | ||
| raise Errors::SchemaError, "`allow_extra_fields` must be true or false" | ||
| end | ||
|
|
||
| state.allow_omitted_json_schema_fields = allow_omitted_fields | ||
| state.allow_extra_json_schema_fields = allow_extra_fields | ||
| nil | ||
| end | ||
|
|
||
| private | ||
|
|
||
| # Returns the API's `state` narrowed to include this gem's `StateExtension`. Centralizes | ||
| # the Steep cast that's needed because Steep can't see the `extend(StateExtension)` applied | ||
| # at runtime in `extended`. | ||
|
|
||
| def json_ingestion_state | ||
| state # : ElasticGraph::SchemaDefinition::State & StateExtension | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
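The wiring in `api_extension.rb` leans on Ruby's `extended` hook: extending the API object also extends its `state`, so the DSL methods can store their settings there. The following is a minimal, self-contained sketch of that pattern together with the set-once validation in `json_schema_version`. `ToyState`, `ToyAPI`, `ToyStateExtension`, and `ToyAPIExtension` are hypothetical stand-ins invented for illustration, and `ArgumentError` is used in place of `Errors::SchemaError`; none of this is the real ElasticGraph API.

```ruby
# Hypothetical stand-ins for illustration only; these are NOT ElasticGraph classes.
ToyState = Struct.new(:name)

class ToyAPI
  attr_reader :state

  def initialize
    @state = ToyState.new("toy")
  end
end

# Adds version tracking to a state object, loosely mirroring the role of StateExtension.
module ToyStateExtension
  attr_accessor :json_schema_version
end

module ToyAPIExtension
  # Ruby calls this hook with the receiver whenever `obj.extend(ToyAPIExtension)` runs.
  def self.extended(api)
    api.state.extend(ToyStateExtension)
  end

  # Same two checks as the diff: positive integer, and settable only once.
  def json_schema_version(version)
    if !version.is_a?(Integer) || version < 1
      raise ArgumentError, "`json_schema_version` must be a positive integer. Specified version: #{version}"
    end

    if state.json_schema_version
      raise ArgumentError, "`json_schema_version` can only be set once on a schema. Previously-set version: #{state.json_schema_version}"
    end

    state.json_schema_version = version
    nil
  end
end

api = ToyAPI.new
api.extend(ToyAPIExtension)
api.json_schema_version(1)

# Each of these calls is rejected: zero, a non-integer, and a second assignment.
errors = [0, "2", 3].map do |bad|
  api.json_schema_version(bad)
  nil
rescue ArgumentError => e
  e.message
end
```

In this sketch the first two rejected calls fail the positive-integer check, while the third is a valid integer that fails only because a version was already set.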
136 changes: 136 additions & 0 deletions
...ph-json_ingestion/lib/elastic_graph/json_ingestion/schema_definition/factory_extension.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| # Copyright 2024 - 2026 Block, Inc. | ||
| # | ||
| # Use of this source code is governed by an MIT-style | ||
| # license that can be found in the LICENSE file or at | ||
| # https://opensource.org/licenses/MIT. | ||
| # | ||
| # frozen_string_literal: true | ||
|
|
||
| require "elastic_graph/constants" | ||
| require "elastic_graph/json_ingestion/schema_definition/indexing/field_type/enum" | ||
| require "elastic_graph/json_ingestion/schema_definition/indexing/field_type/object" | ||
| require "elastic_graph/json_ingestion/schema_definition/indexing/field_type/scalar" | ||
| require "elastic_graph/json_ingestion/schema_definition/indexing/field_type/union" | ||
| require "elastic_graph/json_ingestion/schema_definition/indexing/index_extension" | ||
| require "elastic_graph/graphql/scalar_coercion_adapters/valid_time_zones" | ||
| require "elastic_graph/json_ingestion/schema_definition/results_extension" | ||
| require "elastic_graph/json_ingestion/schema_definition/schema_artifact_manager_extension" | ||
| require "elastic_graph/json_ingestion/schema_definition/schema_elements/enum_type_extension" | ||
| require "elastic_graph/json_ingestion/schema_definition/schema_elements/field_extension" | ||
| require "elastic_graph/json_ingestion/schema_definition/schema_elements/scalar_type_extension" | ||
| require "elastic_graph/json_ingestion/schema_definition/schema_elements/type_with_subfields_extension" | ||
|
|
||
| module ElasticGraph | ||
| module JSONIngestion | ||
| module SchemaDefinition | ||
| # Extension module applied to `ElasticGraph::SchemaDefinition::Factory` to wire up JSON Schema | ||
| # support on the schema elements, indexing field types, Results, and SchemaArtifactManager it creates. | ||
| # | ||
| # @api private | ||
| module FactoryExtension | ||
| # Default JSON schema options applied to ElasticGraph's built-in scalar types as they | ||
| # are constructed. Keyed by the un-overridden type name, because built-in type | ||
| # registration always uses the canonical type name before `type_name_overrides` are | ||
| # applied to the resulting type reference. | ||
| BUILT_IN_SCALAR_JSON_SCHEMA_OPTIONS_BY_NAME = { | ||
| "Boolean" => {type: "boolean"}, | ||
| "Float" => {type: "number"}, | ||
| "ID" => {type: "string"}, | ||
| "Int" => {type: "integer", minimum: INT_MIN, maximum: INT_MAX}, | ||
| "String" => {type: "string"}, | ||
| "Cursor" => {type: "string"}, | ||
| "Date" => {type: "string", format: "date"}, | ||
| "DateTime" => {type: "string", format: "date-time"}, | ||
| "LocalTime" => {type: "string", pattern: VALID_LOCAL_TIME_JSON_SCHEMA_PATTERN}, | ||
| "TimeZone" => {type: "string", enum: GraphQL::ScalarCoercionAdapters::VALID_TIME_ZONES.to_a.freeze}, | ||
| "Untyped" => {type: ["array", "boolean", "integer", "number", "object", "string"].freeze}, | ||
| "JsonSafeLong" => {type: "integer", minimum: JSON_SAFE_LONG_MIN, maximum: JSON_SAFE_LONG_MAX}, | ||
| "LongString" => {type: "integer", minimum: LONG_STRING_MIN, maximum: LONG_STRING_MAX} | ||
| }.freeze | ||
|
|
||
| # @private | ||
| def new_enum_type(name) | ||
| super(name) do |type| | ||
| extended_type = type.extend(SchemaElements::EnumTypeExtension) # : ::ElasticGraph::SchemaDefinition::SchemaElements::EnumType & SchemaElements::EnumTypeExtension | ||
| yield extended_type if block_given? | ||
| end | ||
| end | ||
|
|
||
| # @private | ||
| def new_enum_indexing_field_type(enum_value_names) | ||
| Indexing::FieldType::Enum.new(super) | ||
| end | ||
|
|
||
| # @private | ||
| def new_field(**kwargs) | ||
| super(**kwargs) do |field| | ||
| extended_field = field.extend(SchemaElements::FieldExtension) # : ::ElasticGraph::SchemaDefinition::SchemaElements::Field & SchemaElements::FieldExtension | ||
| yield extended_field if block_given? | ||
| end | ||
| end | ||
|
|
||
| # @private | ||
| def new_index(name, settings, type) | ||
| super(name, settings, type) do |index| | ||
| extended_index = index.extend(Indexing::IndexExtension) # : ::ElasticGraph::SchemaDefinition::Indexing::Index & Indexing::IndexExtension | ||
| yield extended_index if block_given? | ||
| end | ||
| end | ||
|
|
||
| # @private | ||
| def new_object_indexing_field_type(...) | ||
| Indexing::FieldType::Object.new(super) | ||
| end | ||
|
|
||
| # @private | ||
| def new_scalar_type(name) | ||
| super(name) do |type| | ||
| extended_type = type.extend(SchemaElements::ScalarTypeExtension) # : ::ElasticGraph::SchemaDefinition::SchemaElements::ScalarType & SchemaElements::ScalarTypeExtension | ||
| if state.initially_registered_built_in_types.empty? && (options = BUILT_IN_SCALAR_JSON_SCHEMA_OPTIONS_BY_NAME[name.to_s]) | ||
| extended_type.json_schema(**options) | ||
| end | ||
|
|
||
| yield extended_type if block_given? | ||
| extended_type.finalize_json_schema_configuration! | ||
| end | ||
| end | ||
|
|
||
| # @private | ||
| def new_scalar_indexing_field_type(scalar_type:) | ||
| Indexing::FieldType::Scalar.new(super) | ||
| end | ||
|
|
||
| # @private | ||
| def new_type_with_subfields(schema_kind, name, wrapping_type:, field_factory:) | ||
| super(schema_kind, name, wrapping_type: wrapping_type, field_factory: field_factory) do |type| | ||
| extended_type = type.extend(SchemaElements::TypeWithSubfieldsExtension) # : ::ElasticGraph::SchemaDefinition::SchemaElements::TypeWithSubfields & SchemaElements::TypeWithSubfieldsExtension | ||
| yield extended_type if block_given? | ||
| end | ||
| end | ||
|
|
||
| # @private | ||
| def new_union_indexing_field_type(subtypes_by_name) | ||
| Indexing::FieldType::Union.new(super) | ||
| end | ||
|
|
||
| # Creates a new Results instance with JSON Schema extensions. | ||
| # | ||
| # @return [ElasticGraph::SchemaDefinition::Results] the created results instance | ||
| def new_results | ||
| super.tap do |results| | ||
| results.extend(ResultsExtension) | ||
| end | ||
| end | ||
|
|
||
| # Creates a new SchemaArtifactManager instance with JSON Schema extensions. | ||
| # | ||
| # @return [ElasticGraph::SchemaDefinition::SchemaArtifactManager] the created artifact manager | ||
| def new_schema_artifact_manager(...) | ||
| super.tap do |manager| | ||
| manager.extend(SchemaArtifactManagerExtension) | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end |
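Most of the overrides in `factory_extension.rb` share one shape: call `super` with a block, extend the freshly built object with a JSON Schema module, then yield the extended object to the caller's block. A minimal sketch of that shape follows. `ToyField`, `ToyFactory`, `ToyFieldExtension`, and `ToyFactoryExtension` are hypothetical names made up for this example, and the real factory methods take different arguments.

```ruby
# Hypothetical stand-ins for illustration only; not the real ElasticGraph factory.
class ToyField
  attr_reader :name

  def initialize(name)
    @name = name
    yield self if block_given?
  end
end

class ToyFactory
  def new_field(name, &block)
    ToyField.new(name, &block)
  end
end

# Per-object behavior added to each field the extended factory creates.
module ToyFieldExtension
  def json_schema(**options)
    json_schema_options.merge!(options)
  end

  def json_schema_options
    @json_schema_options ||= {}
  end
end

module ToyFactoryExtension
  def new_field(name)
    super(name) do |field|
      # Extend first, so the caller's block sees the JSON Schema methods.
      extended_field = field.extend(ToyFieldExtension)
      yield extended_field if block_given?
    end
  end
end

factory = ToyFactory.new.extend(ToyFactoryExtension)
latitude = factory.new_field("latitude") do |field|
  field.json_schema minimum: -90, maximum: 90
end

# A factory that was never extended produces plain fields.
plain = ToyFactory.new.new_field("latitude")
```

Because the extension is applied to individual objects with `extend` rather than by reopening the class, fields built by an unextended factory are left untouched, which is what keeps the JSON Schema behavior opt-in in this sketch.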