Skip to content

Latest commit

 

History

History
149 lines (106 loc) · 4.56 KB

File metadata and controls

149 lines (106 loc) · 4.56 KB

JsonSchemaPatcher V8 and Time Complexity Optimizations

Major Optimizations

1. Pre-compute Schema-based Type Maps

The build plan already extracts schema information but doesn't leverage it for type-based fast paths. Add schema-aware type checking:

interface ArrayPlan {
  // ... existing fields
  itemType?: "primitive" | "object" | "mixed"; // Pre-computed from schema
  objectShape?: Set<string>; // Required + frequent properties from schema
  maxDepth?: number; // Pre-computed traversal depth limit
}

2. Eliminate Redundant Path Operations

Current path building (${path}/${key}) creates many temporary strings. Pre-build path segments:

private pathSegmentCache = new Map<string, string[]>()

private getPathSegments(path: string): string[] {
  if (!this.pathSegmentCache.has(path)) {
    this.pathSegmentCache.set(path, path.split('/').filter(Boolean))
  }
  return this.pathSegmentCache.get(path)!
}

3. Schema-guided Object Diffing

Use the build plan's requiredFields and objectShape to optimize object traversal order:

private diffObjectOptimized(obj1: JsonObject, obj2: JsonObject, path: string, patches: Operation[]) {
  const plan = this.getPlanForPath(path)

  // Process required fields first (most likely to differ)
  if (plan?.requiredFields) {
    for (const key of plan.requiredFields) {
      // Process required fields with fast path
    }
  }

  // Then process remaining fields
  const processedKeys = plan?.requiredFields || new Set()
  // ... rest of logic
}

4. Monomorphic Operation Creation

V8 optimizes better with consistent object shapes. Create operation factories:

private readonly operationFactories = {
  add: (path: string, value: JsonValue): Operation => ({ op: 'add', path, value }),
  remove: (path: string, oldValue: JsonValue): Operation => ({ op: 'remove', path, oldValue }),
  replace: (path: string, value: JsonValue, oldValue: JsonValue): Operation =>
    ({ op: 'replace', path, value, oldValue })
}

5. Batch Path Lookups

Instead of individual getPlanForPath calls, batch related lookups:

private batchGetPlans(paths: string[]): Map<string, ArrayPlan | undefined> {
  const results = new Map()
  const uncachedPaths = paths.filter(p => !this.planLookupCache.has(p))

  // Process uncached paths in batch to reduce Map lookups
  for (const path of uncachedPaths) {
    // ... existing logic but batched
  }

  return results
}

Time Complexity Improvements

1. O(1) Type Dispatch

Replace type checking chains with lookup tables:

private readonly typeHandlers = new Map([
  ['string', this.handlePrimitive],
  ['number', this.handlePrimitive],
  ['boolean', this.handlePrimitive],
  ['object', this.handleObject],
  // etc.
])

2. Early Schema Validation

Use build plan metadata to fail fast on incompatible structures:

private validateStructure(obj: JsonValue, plan?: ArrayPlan): boolean {
  if (!plan?.itemType) return true

  // Fast type validation based on pre-computed schema info
  if (plan.itemType === 'primitive' && typeof obj === 'object') return false
  if (plan.maxDepth && this.currentDepth > plan.maxDepth) return false

  return true
}

3. Reduce Set Operations

The current allKeys = new Set([...keys1, ...keys2]) is O(n). For objects with known schemas, pre-compute expected key sets:

// In buildPlan phase, pre-compute this
interface ObjectPlan {
  expectedKeys: Set<string>; // All possible keys from schema
  requiredKeys: Set<string>; // Required keys only
}

Missing Logical Optimizations

1. Schema-aware Equality

The build plan knows which fields are hash fields, but this isn't used optimally in refine(). Should check hash fields first before deep equality.

2. Structural Shortcuts

With a fixed JSON schema, you know the maximum nesting depth and can pre-allocate patch arrays or use iterative instead of recursive approaches for deep objects.

3. Plan Inheritance

Child paths could inherit parent plans more efficiently rather than doing full path lookups each time.

Implementation Priority

  1. High Priority: Schema-based type maps, path operation optimization, schema-guided diffing
  2. Medium Priority: Monomorphic operations, batch lookups, type dispatch
  3. Low Priority: Set operation optimization, schema-aware equality

The biggest wins would come from leveraging the schema information more aggressively for type dispatch and structural validation, plus reducing the string operations in path handling.