From 8093cadcab351bd687ede2163010b6ee7bb72439 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 5 Feb 2024 18:55:54 -0800 Subject: [PATCH 01/73] [draft proposal] Safe Access to Contiguous Storage --- .../nnnn-safe-shared-contiguous-storage.md | 658 ++++++++++++++++++ 1 file changed, 658 insertions(+) create mode 100644 proposals/nnnn-safe-shared-contiguous-storage.md diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md new file mode 100644 index 0000000000..4c669dec62 --- /dev/null +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -0,0 +1,658 @@ +## Safe Access to Contiguous Storage + +* Proposal: [SE-NNNN](NNNN-filename.md) +* Authors: [Guillaume Lessard](https://github.com/glessard), [Andrew Trick](https://github.com/atrick) +* Review Manager: TBD +* Status: **Awaiting implementation** +* Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) +* Bug: rdar://48132971, rdar://96837923 +* Implementation: (pending) +* Upcoming Feature Flag: (pending) +* Review: (pitch pending) + +## Introduction + +We introduce `StorageView`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without giving up on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. + +In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a struct being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, with no compromise to memory safety. 
+ +This proposal is related to two other features being proposed along with it: [non-escapable type constraint]() (`~Escapable`) and [compile-time lifetime dependency annotations](https://github.com/tbkka/swift-evolution/blob/tbkka-lifetime-dependency/proposals/NNNN-lifetime-dependency.md). This proposal also supersedes [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. Additionally, we refer to an upcoming proposal to define a `BitwiseCopyable` layout constraint. + +## Motivation + +Consider for example a program using multiple libraries, including [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply it in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()` and use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. + +## Proposed solution + +`StorageView` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of a span of contiguous memory. A view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed during the lifetime of the view. 
`StorageView`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `StorageView` also performs bounds-checking on every access to preserve spatial safety. Additionally `StorageView` always represents initialized memory, preserving the definite initialization guarantee. + +By relying on borrowing, `StorageView` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `StorageView` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) + +A type can indicate that it can provide a `StorageView` by conforming to the `ContiguousStorage` protocol. For example, for the hypothetical base64 decoding library mentioned above, a possible API could be: + +```swift +extension HypotheticalBase64Decoder { + public func decode(bytes: some ContiguousStorage) -> [UInt8] +} +``` + +## Detailed design + +`StorageView` is a simple representation of a span of initialized memory. 
+ +```swift +public struct StorageView +: ~Escapable, Copyable { + internal var _start: StorageViewIndex + internal var _count: Int +} +``` + +It provides a collection-like interface to the elements stored in that span of memory: + +```swift +extension StorageView { + public typealias Index: StorageViewIndex + public typealias SubSequence: Self + + public var startIndex: Index { _read } + public var endIndex: Index { _read } + public var count: Int { get } + + public func makeIterator() -> copy(self) StorageViewIterator + + public var isEmpty: Bool { get } + + // index-based subscripts + subscript(_ position: Index) -> copy(self) Element { _read } + subscript(_ bounds: Range) -> copy(self) StorageView { _read } + + // integer-offset subscripts + subscript(offset: Int) -> copy(self) Element { _read } + subscript(offsets: Range) -> copy(self) StorageView { _read } +} + +struct StorageViewIndex +: Copyable, Escapable, Strideable { /* ... */ } + +struct StorageViewIterator +: ~Escapable, Copyable { + // Should conform to a `BorrowingIterator` protocol that is not yet defined + // public mutating func borrowingNextElement() -> @copy(self) Element? +} + +extension StorageViewIterator where Element: Escapable, Copyable { + // Cannot conform to `IteratorProtocol` because `Self: ~Escapable` + public mutating func next() -> Element? +} +``` + +Note that `StorageView` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `StorageView`, the slice _must_ have the same lifetime as the view from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. 
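To make the `Collection` point above concrete, here is a minimal sketch in today's Swift (it does not use `StorageView`). The helper name `firstHalf(of:)` is hypothetical and exists only for illustration. It shows that a slice of an `Array` is an independent value that may escape the scope of the collection it came from, which is the behavior a `StorageView` slice must not have.

```swift
// Hypothetical helper, for illustration only.
// An `ArraySlice` shares and retains the array's storage, so it is free to
// outlive the `values` binding it was derived from.
func firstHalf(of values: [Int]) -> ArraySlice<Int> {
    values[..<(values.count / 2)]
}

// The array literal is a temporary, yet the returned slice remains valid.
let slice = firstHalf(of: [10, 20, 30, 40])
print(Array(slice))
```

A slice of a `StorageView`, by contrast, would be rejected by the compiler if it tried to leave the scope of the view it was taken from, under the lifetime dependency rules this proposal relies on.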
+ +A type can declare that it can provide access to contiguous storage by conforming to the `ContiguousStorage` protocol: + +```swift +public protocol ContiguousStorage: ~Escapable { + associatedtype Element: ~Copyable & ~Escapable + + var storageView: borrow(self) StorageView { _read } +} +``` + +The key safety feature is that a `StorageView` cannot escape to a scope where the value it borrowed no longer exists. + +An API that wishes to read from contiguous storage can declare a parameter type of `some ContiguousStorage`. The implementation will internally consist of a brief generic section, followed by business logic implemented in terms of a concrete `StorageView`. Frameworks that support library evolution (resilient frameworks) have an additional concern. Resilient frameworks have an ABI boundary that may differ from the API proper. Resilient frameworks may wish to adopt a pattern such as the following: + +```swift +extension MyResilientType { + // public API + @inlinable public func essentialFunction(_ a: some ContiguousStorage) -> Int { + a.withStorageView({ self.essentialFunction($0) }) + } + + // ABI boundary + @usableFromInline func essentialFunction(_ a: StorageView) -> Int { ... } +} +``` + +Here, the public function obtains the `StorageView` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `StorageView`. Inlining the generic shim in the client is often a critical optimization. 
The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions)) + + + +#### Extensions to Standard Library and Foundation types + +```swift +extension Array: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} +extension ArraySlice: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} +extension ContiguousArray: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} + +extension Foundation.Data: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} + +extension String.UTF8View: ContiguousStorage { + // note: this could borrow a temporary copy of the original `String`'s storage object + var storageView: borrow(self) StorageView { _read } +} +extension Substring.UTF8View: ContiguousStorage { + // note: this could borrow a temporary copy of the original `Substring`'s storage object + var storageView: borrow(self) StorageView { _read } +} +extension Character.UTF8View: ContiguousStorage { + // note: this could borrow a temporary copy of the original `Character`'s storage object + var storageView: borrow(self) StorageView { _read } +} + +extension SIMD: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} +extension KeyValuePairs: ContiguousStorage<(Self.Key, Self.Value)> { + var storageView: borrow(self) StorageView<(Self.Key, Self.Value)> { _read } +} +extension CollectionOfOne: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} + +extension Slice: ContiguousStorage where Base: ContiguousStorage { + var storageView: borrow(self) StorageView { _read } +} + +extension UnsafeBufferPointer: ContiguousStorage { + // note: this applies additional preconditions to `self` for the duration of the borrow + var storageView: borrow(self) StorageView { _read } +} +extension UnsafeMutableBufferPointer: ContiguousStorage { + // note: this 
applies additional preconditions to `self` for the duration of the borrow + var storageView: borrow(self) StorageView { _read } +} +extension UnsafeRawBufferPointer: ContiguousStorage { + // note: this applies additional preconditions to `self` for the duration of the borrow + var storageView: borrow(self) StorageView { _read } +} +extension UnsafeMutableRawBufferPointer: ContiguousStorage { + // note: this applies additional preconditions to `self` for the duration of the borrow + var storageView: borrow(self) StorageView { _read } +} +``` + +#### Using `StorageView` with C functions or other unsafe code: + +`StorageView` has an escape hatch to allow its use with C functions: + +```swift +extension StorageView { + func withUnsafeBufferPointer( + _ body: (_ buffer: UnsafeBufferPointer) -> Result + ) -> Result +} + +extension StorageView where Element: BitwiseCopyable { + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) -> Result + ) -> Result +} +``` + +#### Complete `StorageView` API: + +```swift +public struct StorageView +: Copyable, ~Escapable { + internal var _start: StorageViewIndex + internal var _count: Int +} +``` + +##### Creating a `StorageView`: + +The initialization of a `StorageView` instance is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `StorageView`. + +```swift +extension StorageView { + + /// Unsafely create a `StorageView` over a span of initialized memory. + /// + /// The memory must be owned by the instance `owner`, meaning that + /// as long as `owner` is alive, then the memory will remain valid. + /// + /// - Parameters: + /// - unsafeBufferPointer: a buffer to initialized elements. + /// - owner: a binding whose lifetime must exceed that of + /// the returned `StorageView`. 
+ public init( + unsafeBufferPointer: UnsafeBufferPointer, owner: borrowing Owner + ) -> borrow(owner) Self + + /// Unsafely create a `StorageView` over a span of initialized memory. + /// + /// The memory representing `count` instances starting at + /// `unsafePointer` must be owned by the instance `owner`, meaning that + /// as long as `owner` is alive, then the memory will remain valid. + /// + /// - Parameters: + /// - unsafePointer: a pointer to the first initialized element. + /// - count: the number of initialized elements in the view. + /// - owner: a binding whose lifetime must exceed that of + /// the returned `StorageView`. + public init( + unsafePointer: UnsafePointer, count: Int, owner: borrowing Owner + ) -> borrow(owner) Self + + /// Unsafely create a `StorageView` over a span of initialized memory. + /// + /// The memory in `unsafeBytes` must be owned by the instance + /// `owner`, meaning that as long as `owner` is alive, then the + /// memory will remain valid. + /// + /// `unsafeBytes` must be correctly aligned for accessing + /// an element of type `Element`, and must contain a number of bytes + /// that is an exact multiple of `Element`'s stride. + /// + /// - Parameters: + /// - unsafeBytes: a buffer to initialized elements. + /// - type: the type to use when interpreting the bytes in memory. + /// - owner: a binding whose lifetime must exceed that of + /// the returned `StorageView`. + public init( + unsafeBytes: UnsafeRawBufferPointer, as type: Element.Type, owner: borrowing Owner + ) -> borrow(owner) Self + + /// Unsafely create a `StorageView` over a span of initialized memory. + /// + /// The memory representing `count` instances starting at + /// `unsafeRawPointer` must be owned by the instance `owner`, + /// meaning that as long as `owner` is alive, then the memory + /// will remain valid. + /// + /// `unsafeRawPointer` must be correctly aligned for accessing + /// an element of type `Element`. 
+ /// + /// - Parameters: + /// - unsafeRawPointer: a pointer to the first initialized element. + /// - type: the type to use when interpreting the bytes in memory. + /// - count: the number of initialized elements in the view. + /// - owner: a binding whose lifetime must exceed that of + /// the returned `StorageView`. + public init( + unsafeRawPointer: UnsafeRawPointer, as type: Element.Type, count: Int, owner: borrowing Owner + ) -> borrow(owner) Self +} +``` + +##### `Collection`-like API: + +The following typealiases, properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where their counterpart is declared (in `Sequence`, `Collection`, `BidirectionalCollection` or `RandomAccessCollection`). The only difference from their counterpart should be a lifetime dependency annotation, allowing them to return borrowed nonescapable values or borrowed noncopyable values. + +```swift +extension StorageView { + public typealias Index: StorageViewIndex + public typealias SubSequence: Self + + public var startIndex: Index { _read } + public var endIndex: Index { _read } + public var count: Int { get } + + public func makeIterator() -> copy(self) StorageViewIterator + + // indexing operations + public func index(after i: Index) -> Index + public func index(before i: Index) -> Index + public func index(_ i: Index, offsetBy distance: Int) -> Index + + public func distance(from start: Index, to end: Index) -> Int + + public func formIndex(after i: inout Index) + public func formIndex(before i: inout Index) + public func formIndex(_ i: inout Index, offsetBy distance: Int) + + // subscripts + public subscript( + _ position: Index + ) -> copy(self) Element { _read } + public subscript( + _ bounds: Range + ) -> copy(self) StorageView { _read } + public subscript( + _ bounds: some RangeExpression + ) -> copy(self) StorageView { _read } + public subscript( + x: UnboundedRange + ) -> copy StorageView { 
_read } + + // one-sided slicing operations + public func prefix(upTo: Index) -> copy(self) StorageView + public func prefix(through: Index) -> copy(self) StorageView + public func prefix(_ maxLength: Int) -> copy(self) StorageView + public func dropLast(_ k: Int = 1) -> copy(self) StorageView + public func suffix(from: Index) -> copy(self) StorageView + public func suffix(_ maxLength: Int) -> copy(self) StorageView + public func dropFirst(_ k: Int = 1) -> copy(self) StorageView +} +``` + +##### Additions not in the `Collection` family API: + +```swift +extension StorageView { + /// Traps if `position` is not a valid index for this `StorageView` + public func boundsCheckPrecondition(_ position: Index) + + /// Traps if `bounds` is not a valid range of indices for this `StorageView` + public func boundsCheckPrecondition(_ bounds: Range) + + // Integer-offset subscripts + + /// Accesses the element at the specified offset in the `StorageView`. + /// + /// - Parameter offset: The offset of the element to access. `offset` + /// must be greater or equal to zero, and less than the `count` property. + /// + /// - Complexity: O(1) + public subscript(offset: Int) -> copy(self) Element { _read } + + /// Accesses the contiguous subrange of elements at the specified + /// range of offsets in this `StorageView`. + /// + /// - Parameter offsets: A range of offsets. The bounds of the range + /// must be greater or equal to zero, and less than the `count` property. + /// + /// - Complexity: O(1) + public subscript(offsets: Range) -> copy(self) StorageView { _read } + + /// Accesses the contiguous subrange of elements at the specified + /// range of offsets in this `StorageView`. + /// + /// - Parameter offsets: A range of offsets. The bounds of the range + /// must be greater or equal to zero, and less than the `count` property. 
+ /// + /// - Complexity: O(1) + public subscript( + offsets: some RangeExpression + ) -> copy(self) StorageView { _read } + + // Unchecked subscripts + + /// Accesses the element at the specified `position`. + /// + /// This subscript does not validate `position`; this is an unsafe operation. + /// + /// - Parameter position: The position of the element to access. `position` + /// must be a valid index that is not equal to the `endIndex` property. + /// + /// - Complexity: O(1) + public subscript(unchecked position: Index) -> copy(self) Element { _read } + + /// Accesses a contiguous subrange of the elements represented by this `StorageView` + /// + /// This subscript does not validate `bounds`; this is an unsafe operation. + /// + /// - Parameter bounds: A range of the collection's indices. The bounds of + /// the range must be valid indices of the collection. + /// + /// - Complexity: O(1) + public subscript( + uncheckedBounds bounds: Range + ) -> copy(self) StorageView { _read } + + /// Accesses the contiguous subrange of the elements represented by this `StorageView`, + /// specified by a range expression. + /// + /// This subscript does not validate `bounds`; this is an unsafe operation. + /// + /// - Parameter bounds: A range of the collection's indices. The bounds of + /// the range must be valid indices of the collection. + /// + /// - Complexity: O(1) + public subscript( + uncheckedBounds bounds: some RangeExpression + ) -> copy(self) StorageView + + // Unchecked integer-offset subscripts + + /// Accesses the element at the specified offset in the `StorageView`. + /// + /// This subscript does not validate `offset`; this is an unsafe operation. + /// + /// - Parameter offset: The offset of the element to access. `offset` + /// must be greater or equal to zero, and less than the `count` property. 
+ /// + /// - Complexity: O(1) + public subscript(uncheckedOffset offset: Int) -> copy(self) Element { _read } + + /// Accesses the contiguous subrange of elements at the specified + /// range of offsets in this `StorageView`. + /// + /// This subscript does not validate `offsets`; this is an unsafe operation. + /// + /// - Parameter offsets: A range of offsets. The bounds of the range + /// must be greater or equal to zero, and less than the `count` property. + /// + /// - Complexity: O(1) + public subscript( + uncheckedOffsets offsets: Range + ) -> copy(self) StorageView { _read } +} +``` + +`StorageView` gains additional functions when its `Element` is `BitwiseCopyable`: + +```swift +extension StorageView where Element: BitwiseCopyable { + // We may not need to require T: BitwiseCopyable for the aligned load operations + + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified byte offset. + /// + /// The memory at `offset` bytes from the start of this `StorageView` + /// must be properly aligned for accessing `T` and initialized to `T` + /// or another type that is layout compatible with `T`. + /// + /// - Parameters: + /// - offset: The offset from the start of this `StorageView`, in bytes. + /// `offset` must be nonnegative. The default is zero. + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes at + /// `offset`. The returned instance is memory-managed and unassociated + /// with the value in the memory referenced by this `StorageView`. + public func load( + fromByteOffset: Int = 0, as: T.Type + ) -> T + + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified index. + /// + /// The memory starting at `index` must be properly aligned for accessing `T` + /// and initialized to `T` or another type that is layout compatible with `T`. 
+ /// + /// - Parameters: + /// - index: The index into this `StorageView` + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes starting at + /// `index`. The returned instance is memory-managed and isn't associated + /// with the value in the memory referenced by this `StorageView`. + public func load( + from index: Index, as: T.Type + ) -> T + + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified byte offset. + /// + /// The memory at `offset` bytes from the start of this `StorageView` + /// must be laid out identically to the in-memory representation of `T`. + /// + /// - Parameters: + /// - offset: The offset from the start of this `StorageView`, in bytes. + /// `offset` must be nonnegative. The default is zero. + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes at + /// `offset`. The returned instance isn't associated + /// with the value in the memory referenced by this `StorageView`. + public func loadUnaligned( + fromByteOffset: Int = 0, as: T.Type + ) -> T + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified index. + /// + /// The memory starting at `index` must be laid out identically + /// to the in-memory representation of `T`. + /// + /// - Parameters: + /// - index: The index into this `StorageView` + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes starting at + /// `index`. The returned instance isn't associated + /// with the value in the memory referenced by this `StorageView`. + public func loadUnaligned( + from index: Index, as: T.Type + ) -> T +} +``` + +##### Interoperability with unsafe code: + +We provide two functions for interoperability with C or other legacy pointer-taking functions. 
+ +```swift +extension StorageView { + /// Calls a closure with a pointer to the viewed contiguous storage. + /// + /// The buffer pointer passed as an argument to `body` is valid only + /// during the execution of `withUnsafeBufferPointer(_:)`. + /// Do not store or return the pointer for later use. + /// + /// - Parameter body: A closure with an `UnsafeBufferPointer` parameter + /// that points to the viewed contiguous storage. If `body` has + /// a return value, that value is also used as the return value + /// for the `withUnsafeBufferPointer(_:)` method. The closure's + /// parameter is valid only for the duration of its execution. + /// - Returns: The return value of the `body` closure parameter. + func withUnsafeBufferPointer( + _ body: (_ buffer: UnsafeBufferPointer) -> Result + ) -> Result +} + +extension StorageView where Element: BitwiseCopyable { + /// Calls the given closure with a pointer to the underlying bytes of + /// the viewed contiguous storage. + /// + /// The buffer pointer passed as an argument to `body` is valid only + /// during the execution of `withUnsafeBytes(_:)`. + /// Do not store or return the pointer for later use. + /// + /// - Parameter body: A closure with an `UnsafeRawBufferPointer` + /// parameter that points to the viewed contiguous storage. + /// If `body` has a return value, that value is also + /// used as the return value for the `withUnsafeBytes(_:)` method. + /// The closure's parameter is valid only for the duration of + /// its execution. + /// - Returns: The return value of the `body` closure parameter. + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) -> Result + ) -> Result +} +``` + + +## Source compatibility + +This proposal is additive and source-compatible with existing code. + +## ABI compatibility + +This proposal is additive and ABI-compatible with existing code. 
+ +## Implications on adoption + +The additions described in this proposal require a new version of the standard library and runtime. + +## Alternatives considered + +##### Make `StorageView` a noncopyable type +Making `StorageView` non-copyable was in the early vision of this type. However, we found that would make `StorageView` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. + +##### A protocol in addition to `ContiguousStorage` for unsafe buffers +This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. The lifetime constraint only applies to the binding used to obtain a `StorageView`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `StorageView` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `StorageView` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. + +##### Use a non-escapable index type +Eventually we want a similar usage pattern for a `MutableStorageView` as we are proposing for `StorageView`. If the index of a `MutableStorageView` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. + +##### Naming +The ideas in this proposal previously used the name `BufferView`. 
While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and the term "View" in the API of `String`. A possible alternative name is `StorageSpan`, which would mark it as a relative of C++'s `std::span`. + +## Future directions + +##### Defining `BorrowingIterator` with support in `for` loops +This proposal defines a `StorageViewIterator` that is borrowed and non-escapable. This is not compatible with `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: + +```swift +borrowing view: StorageView = ... +for borrowing element in view { + doSomething(element) +} +``` + +In the meantime, it is possible to loop through a `StorageView`'s elements by direct indexing: + +```swift +func doSomething(_ e: borrowing Element) { ... } +let view: StorageView = ... +// either: +var i = view.startIndex +while i < view.endIndex { + doSomething(view[i]) + view.formIndex(after: &i) +} +// ...or: +for o in 0..<view.count { + doSomething(view[offset: o]) +} +``` + +##### Safe mutations of memory with `MutableStorageView<T>` +Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. A `MutableStorageView` should provide a better, safer alternative. + +##### Delegating initialization of memory with `OutputBuffer` +Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. In the standard library we have `Array.init(unsafeUninitializedCapacity:initializingWith:)` and `String.init(unsafeUninitializedCapacity:initializingUTF8With:)`. 
A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. + +##### Resizable, contiguously-stored, untyped collection in the standard library + +The example in the [motivation](#motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. + +Even if `StorageView` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may still want to add an untyped-element equivalent of `Array` at a later time. + +##### Syntactic Sugar for Automatic Conversions +In the context of a resilient library, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead. As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `StorageView`. If `StorageView` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. 
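For comparison, the forwarding pattern described above can be approximated in today's Swift, without `StorageView`. The following is a minimal sketch under stated assumptions: `Checksummer` and both `checksum` functions are hypothetical names, and the concrete entry point takes an `UnsafeBufferPointer` because no safe equivalent exists yet. It requires Swift 5.7 or later for the `some Sequence<UInt8>` parameter syntax.

```swift
// Hypothetical type, for illustration only.
public struct Checksummer {
    public init() {}

    // Generic shim: inlinable, so it can be specialized in the client.
    // Its only job is to locate contiguous storage and forward it.
    @inlinable
    public func checksum(_ bytes: some Sequence<UInt8>) -> UInt32 {
        // Fast path: use the sequence's own contiguous storage when it has any.
        if let sum = bytes.withContiguousStorageIfAvailable({ self.checksum(buffer: $0) }) {
            return sum
        }
        // Slow path: copy into an array, which always has contiguous storage.
        return Array(bytes).withUnsafeBufferPointer { self.checksum(buffer: $0) }
    }

    // Concrete entry point: this is what would sit at the ABI boundary.
    @usableFromInline
    func checksum(buffer: UnsafeBufferPointer<UInt8>) -> UInt32 {
        // Wrapping sum of all bytes.
        buffer.reduce(0) { $0 &+ UInt32($1) }
    }
}

let checksummer = Checksummer()
// An array takes the fast path.
let fromArray = checksummer.checksum([1, 2, 3] as [UInt8])
// A stride has no contiguous storage, so it takes the slow path.
let fromStride = checksummer.checksum(stride(from: UInt8(1), through: 3, by: 1))
print(fromArray, fromStride)
```

The unsafe pointer type in the concrete function is exactly what a `StorageView` parameter is meant to replace. The slow-path copy would also disappear for any type that conforms to `ContiguousStorage`.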
+ +We could provide an automatic way to use a `ContiguousStorage`-conforming type with a function that takes a `StorageView` of the appropriate element type: + +```swift +func myStrnlen(_ b: StorageView) -> Int { + guard let i = b.firstIndex(of: 0) else { return b.count } + return b.distance(from: b.startIndex, to: i) +} +let data = Data((0..<9).reversed()) // Data conforms to ContiguousStorage +let array = Array(data) // Array also conforms to ContiguousStorage +myStrnlen(data) // 8 +myStrnlen(array) // 8 +``` + +This would probably consist of a new type of custom conversion in the language. A type author would provide a way to convert from their type to an owned `StorageView`, and the compiler would insert that conversion where needed. This would enhance readability and reduce boilerplate. + +##### Interoperability with C++'s `std::span` and with LLVM's `-fbounds-safety` +The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are an opportunity for better, safer interoperation with a type such as `StorageView`. + +## Acknowledgments + +Joe Groff, John McCall, Tim Kientzle, Michael Ilseman, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. 
From 548b2505fb055f8f27c56fa22623b1091fffe55e Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 6 Feb 2024 14:35:36 -0800 Subject: [PATCH 02/73] edit placeholder proposal url --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 4c669dec62..057d25db26 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -1,6 +1,6 @@ ## Safe Access to Contiguous Storage -* Proposal: [SE-NNNN](NNNN-filename.md) +* Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) * Authors: [Guillaume Lessard](https://github.com/glessard), [Andrew Trick](https://github.com/atrick) * Review Manager: TBD * Status: **Awaiting implementation** From 7cfd46d316272e02af2780b7df8c1bc3a7acd139 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 6 Feb 2024 15:49:10 -0800 Subject: [PATCH 03/73] link to pitch thread --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 057d25db26..dca9ff7ece 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -8,7 +8,7 @@ * Bug: rdar://48132971, rdar://96837923 * Implementation: (pending) * Upcoming Feature Flag: (pending) -* Review: (pitch pending) +* Review: ([pitch](https://forums.swift.org/t/69888)) ## Introduction From 9993694b5d89f7d22437546e9d5a186eefd1c82a Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 8 Feb 2024 22:02:18 -0800 Subject: [PATCH 04/73] declare typealiases correctly --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md 
b/proposals/nnnn-safe-shared-contiguous-storage.md index dca9ff7ece..6d87119877 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -291,8 +291,8 @@ The following typealiases, properties, functions and subscripts have direct coun ```swift extension StorageView { - public typealias Index: StorageViewIndex - public typealias SubSequence: Self + public typealias Index = StorageViewIndex + public typealias SubSequence = Self public var startIndex: Index { _read } public var endIndex: Index { _read } From fece2282062a65b0590aa72e9d9d353be2fe4af5 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 8 Feb 2024 22:02:39 -0800 Subject: [PATCH 05/73] =?UTF-8?q?add=20=E2=80=9Cfirst=E2=80=9D=20and=20?= =?UTF-8?q?=E2=80=9Clast=E2=80=9D=20properties?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 6d87119877..cb72392f18 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -324,6 +324,10 @@ extension StorageView { public subscript( x: UnboundedRange ) -> copy StorageView { _read } + + // utility properties + public var first: copy(self) Element? { _read } + public var last: copy(self) Element? 
{ _read } // one-sided slicing operations public func prefix(upTo: Index) -> copy(self) StorageView From a3784fe919ab06246e99b374a88bfdf8ca5d76c0 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 9 Feb 2024 11:15:36 -0800 Subject: [PATCH 06/73] fix inits from raw pointers --- proposals/nnnn-safe-shared-contiguous-storage.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index cb72392f18..b6eb9e25b7 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -243,6 +243,9 @@ extension StorageView { public init( unsafePointer: UnsafePointer, count: Int, owner: borrowing Owner ) -> borrow(owner) Self +} + +extension StorageView where Element: BitwiseCopyable { /// Unsafely create a `StorageView` over a span of initialized memory. /// @@ -259,7 +262,7 @@ extension StorageView { /// - type: the type to use when interpreting the bytes in memory. /// - owner: a binding whose lifetime must exceed that of /// the returned `StorageView`. - public init( + public init( unsafeBytes: UnsafeRawBufferPointer, as type: Element.Type, owner: borrowing Owner ) -> borrow(owner) Self @@ -279,7 +282,7 @@ extension StorageView { /// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of /// the returned `StorageView`. 
- public init( + public init( unsafeRawPointer: UnsafeRawPointer, as type: Element.Type, count: Int, owner: borrowing Owner ) -> borrow(owner) Self } From ceb261dceb0c0b7f3114bfad75c7b6489da2e829 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 13 Feb 2024 16:17:34 -0800 Subject: [PATCH 07/73] Update proposals/nnnn-safe-shared-contiguous-storage.md Co-authored-by: Alex Martini --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index b6eb9e25b7..b00b80fde7 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -1,4 +1,4 @@ -## Safe Access to Contiguous Storage +# Safe Access to Contiguous Storage * Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) * Authors: [Guillaume Lessard](https://github.com/glessard), [Andrew Trick](https://github.com/atrick) From 2480ca21495602c6eefdc75fe61db5c7f70b951a Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 14 Feb 2024 14:40:01 -0800 Subject: [PATCH 08/73] add `view(as: T)` --- proposals/nnnn-safe-shared-contiguous-storage.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index b00b80fde7..f0b6dff130 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -503,6 +503,7 @@ extension StorageView where Element: BitwiseCopyable { public func loadUnaligned( fromByteOffset: Int = 0, as: T.Type ) -> T + /// Returns a new instance of the given type, constructed from the raw memory /// at the specified index. 
/// @@ -518,6 +519,15 @@ extension StorageView where Element: BitwiseCopyable { public func loadUnaligned( from index: Index, as: T.Type ) -> T + + /// View the memory span represented by this view as a different type + /// + /// The memory must be laid out identically to the in-memory representation of `T`. + /// + /// - Parameters: + /// - type: The type you wish to view the memory as + /// - Returns: A new `StorageView` over elements of type `T` + public func view(as: T.Type) -> borrow(self) StorageView } ``` From 441a5c8eddfc1f7376abeecb6f29e8d332d3c217 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 14 Feb 2024 17:56:59 -0800 Subject: [PATCH 09/73] incorporate feedback from pitch discussion --- .../nnnn-safe-shared-contiguous-storage.md | 20 ++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index f0b6dff130..8bc48c151e 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -95,7 +95,7 @@ A type can declare that it can provide access to contiguous storage by conformin public protocol ContiguousStorage: ~Escapable { associatedtype Element: ~Copyable & ~Escapable - var storageView: borrow(self) StorageView { _read } + var storage: borrow(self) StorageView { _read } } ``` @@ -107,15 +107,15 @@ An API that wishes to read from contiguous storage can declare a parameter type extension MyResilientType { // public API @inlinable public func essentialFunction(_ a: some ContiguousStorage) -> Int { - a.withStorageView({ self.essentialFunction($0) }) + self.essentialFunction(a.storage) } // ABI boundary - @usableFromInline func essentialFunction(_ a: StorageView) -> Int { ... } + public func essentialFunction(_ a: StorageView) -> Int { ... 
} } ``` -Here, the public function obtains the `StorageView` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `StorageView`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions)) +Here, the public function obtains the `StorageView` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `StorageView`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions).) @@ -183,7 +183,7 @@ extension UnsafeMutableRawBufferPointer: ContiguousStorage { #### Using `StorageView` with C functions or other unsafe code: -`StorageView` has an escape hatch to allow its use with C functions: +`StorageView` has an unsafe hatch for use with unsafe code. ```swift extension StorageView { @@ -307,12 +307,18 @@ extension StorageView { public func index(after i: Index) -> Index public func index(before i: Index) -> Index public func index(_ i: Index, offsetBy distance: Int) -> Index - - public func distance(from start: Index, to end: Index) -> Int + public func index( + _ i: Index, offsetBy distance: Int, limitedBy limit: Index + ) -> Index? 
public func formIndex(after i: inout Index) public func formIndex(before i: inout Index) public func formIndex(_ i: inout Index, offsetBy distance: Int) + public func formIndex( + _ i: inout Index, offsetBy distance: Int, limitedBy limit: Index + ) -> Bool + + public func distance(from start: Index, to end: Index) -> Int // subscripts public subscript( From b360a50019aeee65a3ea9c1925ed1ba8ea7a98f0 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sat, 17 Feb 2024 14:09:40 -0800 Subject: [PATCH 10/73] enclose index and iterator types in the main type --- .../nnnn-safe-shared-contiguous-storage.md | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 8bc48c151e..7592f19503 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -52,7 +52,12 @@ It provides a collection-like interface to the elements stored in that span of m ```swift extension StorageView { - public typealias Index: StorageViewIndex + public struct Index: Copyable, Escapable, Strideable { /* .... */ } + public struct Iterator: Copyable, ~Escapable { + // Should conform to a `BorrowingIterator` protocol + // that will be defined at a later date. + } + public typealias SubSequence: Self public var startIndex: Index { _read } @@ -72,16 +77,7 @@ extension StorageView { subscript(offsets: Range) -> copy(self) StorageView { _read } } -struct StorageViewIndex -: Copyable, Escapable, Strideable { /* ... */ } - -struct StorageViewIterator -: ~Escapable, Copyable { - // Should conform to a `BorrowingIterator` protocol that is not yet defined - // public mutating func borrowingNextElement() -> @copy(self) Element? 
-} - -extension StorageViewIterator where Element: Escapable, Copyable { +extension StorageView.Iterator where Element: Escapable, Copyable { // Cannot conform to `IteratorProtocol` because `Self: ~Escapable` public mutating func next() -> Element? } From 9aa96ea29b059bd1475622a3dcec3370dad570ca Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 22 Feb 2024 21:53:45 -0800 Subject: [PATCH 11/73] update protocol declaration --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 7592f19503..ea4147fcbd 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -88,7 +88,7 @@ Note that `StorageView` does _not_ conform to `Collection`. This is because `Col A type can declare that it can provide access to contiguous storage by conforming to the `ContiguousStorage` protocol: ```swift -public protocol ContiguousStorage: ~Escapable { +public protocol ContiguousStorage: ~Copyable, ~Escapable { associatedtype Element: ~Copyable & ~Escapable var storage: borrow(self) StorageView { _read } From 1adba6db7930a3a8535b16359782e99d480c2a2f Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 23 Feb 2024 17:49:48 -0800 Subject: [PATCH 12/73] link to additional related pitches --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index ea4147fcbd..201bb74d7d 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -16,7 +16,7 @@ We introduce `StorageView`, an abstraction for container-agnostic access to c In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a 
length. This allows contiguous memory to be shared with a function that doesn't know the layout of a struct being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, with no compromise to memory safety. -This proposal is related to two other features being proposed along with it: [non-escapable type constraint]() (`~Escapable`) and [compile-time lifetime dependency annotations](https://github.com/tbkka/swift-evolution/blob/tbkka-lifetime-dependency/proposals/NNNN-lifetime-dependency.md). This proposal also supersedes [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. Additionally, we refer to an upcoming proposal to define a `BitwiseCopyable` layout constraint. +This proposal is related to two other features being proposed along with it: [non-escapable type constraint]() (`~Escapable`) and [compile-time lifetime dependency annotations](https://github.com/tbkka/swift-evolution/blob/tbkka-lifetime-dependency/proposals/NNNN-lifetime-dependency.md). This proposal also supersedes [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. Additionally, we refer to proposals for [`BitwiseCopyable`](https://forums.swift.org/t/69943) and [Non-copyable Generics](https://forums.swift.org/t/68180). 
## Motivation From 300591db3282663ba07a6cb7d69f72892ffd1338 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 15 Apr 2024 17:48:20 -0700 Subject: [PATCH 13/73] fix a stored property type --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 201bb74d7d..51217531f4 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -43,7 +43,7 @@ extension HypotheticalBase64Decoder { ```swift public struct StorageView : ~Escapable, Copyable { - internal var _start: StorageViewIndex + internal var _start: StorageView.Index internal var _count: Int } ``` From ed5fea2ca7dbc22a397fb52e9faa349f1cf311c5 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 17 Apr 2024 10:32:52 -0700 Subject: [PATCH 14/73] rename type, adopt new syntax --- .../nnnn-safe-shared-contiguous-storage.md | 258 +++++++++--------- 1 file changed, 129 insertions(+), 129 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 51217531f4..767ee29113 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -12,7 +12,7 @@ ## Introduction -We introduce `StorageView`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without giving up on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. +We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without giving up on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. 
In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a struct being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, with no compromise to memory safety. @@ -24,11 +24,11 @@ Consider for example a program using multiple libraries, including [base64](http ## Proposed solution -`StorageView` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of a span of contiguous memory. A view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed during the lifetime of the view. `StorageView`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `StorageView` also performs bounds-checking on every access to preserve spatial safety. Additionally `StorageView` always represents initialized memory, preserving the definite initialization guarantee. +`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of a span of contiguous memory. A view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed during the lifetime of the view. `Span`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. 
Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. -By relying on borrowing, `StorageView` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `StorageView` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) +By relying on borrowing, `Span` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) -A type can indicate that it can provide a `StorageView` by conforming to the `ContiguousStorage` protocol. For example, for the hypothetical base64 decoding library mentioned above, a possible API could be: +A type can indicate that it can provide a `Span` by conforming to the `ContiguousStorage` protocol. For example, for the hypothetical base64 decoding library mentioned above, a possible API could be: ```swift extension HypotheticalBase64Decoder { @@ -38,12 +38,12 @@ extension HypotheticalBase64Decoder { ## Detailed design -`StorageView` is a simple representation of a span of initialized memory. +`Span` is a simple representation of a span of initialized memory. ```swift -public struct StorageView +public struct Span : ~Escapable, Copyable { - internal var _start: StorageView.Index + internal var _start: Span.Index internal var _count: Int } ``` @@ -51,7 +51,7 @@ public struct StorageView It provides a collection-like interface to the elements stored in that span of memory: ```swift -extension StorageView { +extension Span { public struct Index: Copyable, Escapable, Strideable { /* .... 
*/ } public struct Iterator: Copyable, ~Escapable { // Should conform to a `BorrowingIterator` protocol @@ -64,26 +64,26 @@ extension StorageView { public var endIndex: Index { _read } public var count: Int { get } - public func makeIterator() -> copy(self) StorageViewIterator + public func makeIterator() -> dependsOn(self) Span.Iterator public var isEmpty: Bool { get } // index-based subscripts - subscript(_ position: Index) -> copy(self) Element { _read } - subscript(_ bounds: Range) -> copy(self) StorageView { _read } + subscript(_ position: Index) -> dependsOn(self) Element { _read } + subscript(_ bounds: Range) -> dependsOn(self) Span { _read } // integer-offset subscripts - subscript(offset: Int) -> copy(self) Element { _read } - subscript(offsets: Range) -> copy(self) StorageView { _read } + subscript(offset: Int) -> dependsOn(self) Element { _read } + subscript(offsets: Range) -> dependsOn(self) Span { _read } } -extension StorageView.Iterator where Element: Escapable, Copyable { +extension Span.Iterator where Element: Copyable, Escapable { // Cannot conform to `IteratorProtocol` because `Self: ~Escapable` public mutating func next() -> Element? } ``` -Note that `StorageView` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `StorageView`, the slice _must_ have the same lifetime as the view from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. +Note that `Span` does _not_ conform to `Collection`. 
This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, the slice _must_ have the same lifetime as the view from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. A type can declare that it can provide access to contiguous storage by conforming to the `ContiguousStorage` protocol: @@ -91,13 +91,13 @@ A type can declare that it can provide access to contiguous storage by conformin public protocol ContiguousStorage: ~Copyable, ~Escapable { associatedtype Element: ~Copyable & ~Escapable - var storage: borrow(self) StorageView { _read } + var storage: Span { _read } } ``` -The key safety feature is that a `StorageView` cannot escape to a scope where the value it borrowed no longer exists. +The key safety feature is that a `Span` cannot escape to a scope where the value it borrowed no longer exists. -An API that wishes to read from contiguous storage can declare a parameter type of `some ContiguousStorage`. The implementation will internally consist of a brief generic section, followed by business logic implemented in terms of a concrete `StorageView`. Frameworks that support library evolution (resilient frameworks) have an additional concern. Resilient frameworks have an ABI boundary that may differ from the API proper. Resilient frameworks may wish to adopt a pattern such as the following: +An API that wishes to read from contiguous storage can declare a parameter type of `some ContiguousStorage`. The implementation will internally consist of a brief generic section, followed by business logic implemented in terms of a concrete `Span`. 
Frameworks that support library evolution (resilient frameworks) have an additional concern. Resilient frameworks have an ABI boundary that may differ from the API proper. Resilient frameworks may wish to adopt a pattern such as the following: ```swift extension MyResilientType { @@ -107,11 +107,11 @@ extension MyResilientType { } // ABI boundary - public func essentialFunction(_ a: StorageView) -> Int { ... } + public func essentialFunction(_ a: Span) -> Int { ... } } ``` -Here, the public function obtains the `StorageView` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `StorageView`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions).) +Here, the public function obtains the `Span` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `Span`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions).) 
@@ -119,131 +119,131 @@ Here, the public function obtains the `StorageView` from the type that vends it ```swift extension Array: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension ArraySlice: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension ContiguousArray: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension Foundation.Data: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension String.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the original `String`'s storage object - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension Substring.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the original `Substring`'s storage object - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension Character.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the original `Character`'s storage object - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension SIMD: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension KeyValuePairs: ContiguousStorage<(Self.Key, Self.Value)> { - var storageView: borrow(self) StorageView<(Self.Key, Self.Value)> { _read } + var span: Span<(Self.Key, Self.Value)> { _read } } extension CollectionOfOne: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: dependsOn(scope) Span { _read } } extension Slice: ContiguousStorage where Base: ContiguousStorage { - var storageView: borrow(self) StorageView { _read } + var span: Span { _read } } extension UnsafeBufferPointer: ContiguousStorage { // note: this applies additional preconditions to 
`self` for the duration of the borrow - var storageView: borrow(self) StorageView { _read } + var span: dependsOn(scope) Span { _read } } extension UnsafeMutableBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var storageView: borrow(self) StorageView { _read } + var span: dependsOn(scope) Span { _read } } extension UnsafeRawBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var storageView: borrow(self) StorageView { _read } + var span: dependsOn(scope) Span { _read } } extension UnsafeMutableRawBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var storageView: borrow(self) StorageView { _read } + var span: dependsOn(scope) Span { _read } } ``` -#### Using `StorageView` with C functions or other unsafe code: +#### Using `Span` with C functions or other unsafe code: -`StorageView` has an unsafe hatch for use with unsafe code. +`Span` has an unsafe hatch for use with unsafe code. ```swift -extension StorageView { +extension Span { func withUnsafeBufferPointer( _ body: (_ buffer: UnsafeBufferPointer) -> Result ) -> Result } -extension StorageView where Element: BitwiseCopyable { +extension Span where Element: BitwiseCopyable { func withUnsafeBytes( _ body: (_ buffer: UnsafeRawBufferPointer) -> Result ) -> Result } ``` -#### Complete `StorageView` API: +#### Complete `Span` API: ```swift -public struct StorageView +public struct Span : Copyable, ~Escapable { - internal var _start: StorageViewIndex + internal var _start: Span.Index internal var _count: Int } ``` -##### Creating a `StorageView`: +##### Creating a `Span`: -The initialization of a `StorageView` instance is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. 
Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `StorageView`. +The initialization of a `Span` instance is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `Span`. ```swift -extension StorageView { +extension Span { - /// Unsafely create a `StorageView` over a span of initialized memory. + /// Unsafely create a `Span` over initialized memory. /// - /// The memory must be owned by the instance `owner`, meaning that + /// The span of memory must be owned by the instance `owner`, meaning that /// as long as `owner` is alive, then the memory will remain valid. /// /// - Parameters: - /// - unsafeBufferPointer: a buffer to initialized elements. + /// - buffer: an `UnsafeBufferPointer` to initialized elements. /// - owner: a binding whose lifetime must exceed that of - /// the returned `StorageView`. + /// the returned `Span`. public init( - unsafeBufferPointer: UnsafeBufferPointer, owner: borrowing Owner - ) -> borrow(owner) Self + unsafeBufferPointer buffer: UnsafeBufferPointer, owner: borrowing Owner + ) -> dependsOn(owner) Self - /// Unsafely create a `StorageView` over a span of initialized memory. + /// Unsafely create a `Span` over initialized memory. /// - /// The memory representing `count` instances starting at - /// `unsafePointer` must be owned by the instance `owner`, meaning that + /// The span of memory representing `count` instances starting at + /// `pointer` must be owned by the instance `owner`, meaning that /// as long as `owner` is alive, then the memory will remain valid. /// /// - Parameters: - /// - unsafePointer: a pointer to the first initialized element. + /// - pointer: a pointer to the first initialized element. 
/// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of - /// the returned `StorageView`. + /// the returned `Span`. public init( unsafePointer: UnsafePointer, count: Int, owner: borrowing Owner - ) -> borrow(owner) Self + ) -> dependsOn(owner) Self } -extension StorageView where Element: BitwiseCopyable { +extension Span where Element: BitwiseCopyable { - /// Unsafely create a `StorageView` over a span of initialized memory. + /// Unsafely create a `Span` over a span of initialized memory. /// /// The memory in `unsafeBytes` must be owned by the instance /// `owner`, meaning that as long as `owner` is alive, then the @@ -257,12 +257,12 @@ extension StorageView where Element: BitwiseCopyable { /// - unsafeBytes: a buffer to initialized elements. /// - type: the type to use when interpreting the bytes in memory. /// - owner: a binding whose lifetime must exceed that of - /// the returned `StorageView`. + /// the returned `Span`. public init( unsafeBytes: UnsafeRawBufferPointer, as type: Element.Type, owner: borrowing Owner - ) -> borrow(owner) Self + ) -> dependsOn(owner) Self - /// Unsafely create a `StorageView` over a span of initialized memory. + /// Unsafely create a `Span` over a span of initialized memory. /// /// The memory representing `count` instances starting at /// `unsafeRawPointer` must be owned by the instance `owner`, @@ -277,10 +277,10 @@ extension StorageView where Element: BitwiseCopyable { /// - type: the type to use when interpreting the bytes in memory. /// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of - /// the returned `StorageView`. + /// the returned `Span`. 
public init( unsafeRawPointer: UnsafeRawPointer, as type: Element.Type, count: Int, owner: borrowing Owner - ) -> borrow(owner) Self + ) -> dependsOn(owner) Self } ``` @@ -289,15 +289,15 @@ extension StorageView where Element: BitwiseCopyable { The following typealiases, properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where their counterpart is declared (in `Sequence`, `Collection`, `BidirectionalCollection` or `RandomAccessCollection`). The only difference from their counterpart should be a lifetime dependency annotation, allowing them to return borrowed nonescapable values or borrowed noncopyable values. ```swift -extension StorageView { - public typealias Index = StorageViewIndex +extension Span { + public typealias Index = Span.Index public typealias SubSequence = Self public var startIndex: Index { _read } public var endIndex: Index { _read } public var count: Int { get } - public func makeIterator() -> copy(self) StorageViewIterator + public func makeIterator() -> dependsOn(self) Span.Iterator // indexing operations public func index(after i: Index) -> Index @@ -319,63 +319,63 @@ extension StorageView { // subscripts public subscript( _ position: Index - ) -> copy(self) Element { _read } + ) -> dependsOn(self) Element { _read } public subscript( _ bounds: Range - ) -> copy(self) StorageView { _read } + ) -> dependsOn(self) Span { _read } public subscript( _ bounds: some RangeExpression - ) -> copy(self) StorageView { _read } + ) -> dependsOn(self) Span { _read } public subscript( x: UnboundedRange - ) -> copy StorageView { _read } + ) -> copy Span { _read } // utility properties - public var first: copy(self) Element? { _read } - public var last: copy(self) Element? { _read } + public var first: Element? { _read } + public var last: Element? 
{ _read } // one-sided slicing operations - public func prefix(upTo: Index) -> copy(self) StorageView - public func prefix(through: Index) -> copy(self) StorageView - public func prefix(_ maxLength: Int) -> copy(self) StorageView - public func dropLast(_ k: Int = 1) -> copy(self) StorageView - public func suffix(from: Index) -> copy(self) StorageView - public func suffix(_ maxLength: Int) -> copy(self) StorageView - public func dropFirst(_ k: Int = 1) -> copy(self) StorageView + public func prefix(upTo: Index) -> dependsOn(self) Span + public func prefix(through: Index) -> dependsOn(self) Span + public func prefix(_ maxLength: Int) -> dependsOn(self) Span + public func dropLast(_ k: Int = 1) -> dependsOn(self) Span + public func suffix(from: Index) -> dependsOn(self) Span + public func suffix(_ maxLength: Int) -> dependsOn(self) Span + public func dropFirst(_ k: Int = 1) -> dependsOn(self) Span } ``` ##### Additions not in the `Collection` family API: ```swift -extension StorageView { - /// Traps if `position` is not a valid index for this `StorageView` +extension Span { + /// Traps if `position` is not a valid index for this `Span` public func boundsCheckPrecondition(_ position: Index) - /// Traps if `bounds` is not a valid range of indices for this `StorageView` + /// Traps if `bounds` is not a valid range of indices for this `Span` public func boundsCheckPrecondition(_ bounds: Range) // Integer-offset subscripts - /// Accesses the element at the specified offset in the `StorageView`. + /// Accesses the element at the specified offset in the `Span`. /// /// - Parameter offset: The offset of the element to access. `offset` /// must be greater than or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(offset: Int) -> copy(self) Element { _read } + public subscript(offset: Int) -> dependsOn(self) Element { _read } /// Accesses the contiguous subrange of elements at the specified - /// range of offsets in this `StorageView`. 
+ /// range of offsets in this `Span`. /// /// - Parameter offsets: A range of offsets. The bounds of the range /// must be greater or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(offsets: Range) -> copy(self) StorageView { _read } + public subscript(offsets: Range) -> dependsOn(self) Span { _read } /// Accesses the contiguous subrange of elements at the specified - /// range of offsets in this `StorageView`. + /// range of offsets in this `Span`. /// /// - Parameter offsets: A range of offsets. The bounds of the range /// must be greater or equal to zero, and less than the `count` property. @@ -383,7 +383,7 @@ extension StorageView { /// - Complexity: O(1) public subscript( offsets: some RangeExpression - ) -> copy(self) StorageView { _read } + ) -> dependsOn(self) Span { _read } // Unchecked subscripts @@ -395,9 +395,9 @@ extension StorageView { /// must be a valid index that is not equal to the `endIndex` property. /// /// - Complexity: O(1) - public subscript(unchecked position: Index) -> copy(self) Element { _read } + public subscript(unchecked position: Index) -> dependsOn(self) Element { _read } - /// Accesses a contiguous subrange of the elements represented by this `StorageView` + /// Accesses a contiguous subrange of the elements represented by this `Span` /// /// This subscript does not validate `bounds`; this is an unsafe operation. /// @@ -407,9 +407,9 @@ extension StorageView { /// - Complexity: O(1) public subscript( uncheckedBounds bounds: Range - ) -> copy(self) StorageView { _read } + ) -> dependsOn(self) Span { _read } - /// Accesses the contiguous subrange of the elements represented by this `StorageView`, + /// Accesses the contiguous subrange of the elements represented by this `Span`, /// specified by a range expression. /// /// This subscript does not validate `bounds`; this is an unsafe operation. 
@@ -420,11 +420,11 @@ extension StorageView { /// - Complexity: O(1) public subscript( uncheckedBounds bounds: some RangeExpression - ) -> copy(self) StorageView + ) -> dependsOn(self) Span // Unchecked integer-offset subscripts - /// Accesses the element at the specified offset in the `StorageView`. + /// Accesses the element at the specified offset in the `Span`. /// /// This subscript does not validate `offset`; this is an unsafe operation. /// @@ -432,10 +432,10 @@ extension StorageView { /// must be greater or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(uncheckedOffset offset: Int) -> copy(self) Element { _read } + public subscript(uncheckedOffset offset: Int) -> dependsOn(self) Element { _read } /// Accesses the contiguous subrange of elements at the specified - /// range of offsets in this `StorageView`. + /// range of offsets in this `Span`. /// /// This subscript does not validate `offsets`; this is an unsafe operation. /// @@ -445,30 +445,30 @@ extension StorageView { /// - Complexity: O(1) public subscript( uncheckedOffsets offsets: Range - ) -> copy(self) StorageView { _read } + ) -> dependsOn(self) Span { _read } } ``` -`StorageView` gains additional functions when its `Element` is `BitwiseCopyable`: +`Span` gains additional functions when its `Element` is `BitwiseCopyable`: ```swift -extension StorageView where Element: BitwiseCopyable { +extension Span where Element: BitwiseCopyable { // We may not need to require T: BitwiseCopyable for the aligned load operations /// Returns a new instance of the given type, constructed from the raw memory /// at the specified byte offset. /// - /// The memory at `offset` bytes from the start of this `StorageView` + /// The memory at `offset` bytes from the start of this `Span` /// must be properly aligned for accessing `T` and initialized to `T` /// or another type that is layout compatible with `T`. 
/// /// - Parameters: - /// - offset: The offset from the start of this `StorageView`, in bytes. + /// - offset: The offset from the start of this `Span`, in bytes. /// `offset` must be nonnegative. The default is zero. /// - type: The type of the instance to create. /// - Returns: A new instance of type `T`, read from the raw bytes at /// `offset`. The returned instance is memory-managed and unassociated - /// with the value in the memory referenced by this `StorageView`. + /// with the value in the memory referenced by this `Span`. public func load( fromByteOffset: Int = 0, as: T.Type ) -> T @@ -480,11 +480,11 @@ extension StorageView where Element: BitwiseCopyable { /// and initialized to `T` or another type that is layout compatible with `T`. /// /// - Parameters: - /// - index: The index into this `StorageView` + /// - index: The index into this `Span` /// - type: The type of the instance to create. /// - Returns: A new instance of type `T`, read from the raw bytes starting at /// `index`. The returned instance is memory-managed and isn't associated - /// with the value in the memory referenced by this `StorageView`. + /// with the value in the memory referenced by this `Span`. public func load( from index: Index, as: T.Type ) -> T @@ -492,16 +492,16 @@ extension StorageView where Element: BitwiseCopyable { /// Returns a new instance of the given type, constructed from the raw memory /// at the specified byte offset. /// - /// The memory at `offset` bytes from the start of this `StorageView` + /// The memory at `offset` bytes from the start of this `Span` /// must be laid out identically to the in-memory representation of `T`. /// /// - Parameters: - /// - offset: The offset from the start of this `StorageView`, in bytes. + /// - offset: The offset from the start of this `Span`, in bytes. /// `offset` must be nonnegative. The default is zero. /// - type: The type of the instance to create. 
/// - Returns: A new instance of type `T`, read from the raw bytes at /// `offset`. The returned instance isn't associated - /// with the value in the memory referenced by this `StorageView`. + /// with the value in the memory referenced by this `Span`. public func loadUnaligned( fromByteOffset: Int = 0, as: T.Type ) -> T @@ -513,11 +513,11 @@ extension StorageView where Element: BitwiseCopyable { /// to the in-memory representation of `T`. /// /// - Parameters: - /// - index: The index into this `StorageView` + /// - index: The index into this `Span` /// - type: The type of the instance to create. /// - Returns: A new instance of type `T`, read from the raw bytes starting at /// `index`. The returned instance isn't associated - /// with the value in the memory referenced by this `StorageView`. + /// with the value in the memory referenced by this `Span`. public func loadUnaligned( from index: Index, as: T.Type ) -> T @@ -528,8 +528,8 @@ extension StorageView where Element: BitwiseCopyable { /// /// - Parameters: /// - type: The type you wish to view the memory as - /// - Returns: A new `StorageView` over elements of type `T` - public func view(as: T.Type) -> borrow(self) StorageView + /// - Returns: A new `Span` over elements of type `T` + public func view(as: T.Type) -> dependsOn(self) Span } ``` @@ -538,7 +538,7 @@ extension StorageView where Element: BitwiseCopyable { We provide two functions for interoperability with C or other legacy pointer-taking functions. ```swift -extension StorageView { +extension Span { /// Calls a closure with a pointer to the viewed contiguous storage. /// /// The buffer pointer passed as an argument to `body` is valid only @@ -556,7 +556,7 @@ extension StorageView { ) -> Result } -extension StorageView where Element: BitwiseCopyable { +extension Span where Element: BitwiseCopyable { /// Calls the given closure with a pointer to the underlying bytes of /// the viewed contiguous storage. 
/// @@ -592,14 +592,14 @@ The additions described in this proposal require a new version of the standard l ## Alternatives considered -##### Make `StorageView` a noncopyable type -Making `StorageView` non-copyable was in the early vision of this type. However, we found that would make `StorageView` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. +##### Make `Span` a noncopyable type +Making `Span` non-copyable was in the early vision of this type. However, we found that would make `Span` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. ##### A protocol in addition to `ContiguousStorage` for unsafe buffers -This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. The lifetime constraint only applies to the binding used to obtain a `StorageView`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `StorageView` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `StorageView` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. +This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. 
The lifetime constraint only applies to the binding used to obtain a `Span`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `Span` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `Span` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. ##### Use a non-escapable index type -Eventually we want a similar usage pattern for a `MutableStorageView` as we are proposing for `StorageView`. If the index of a `MutableStorageView` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. +Eventually we want a similar usage pattern for a `MutableSpan` as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. ##### Naming The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and the term "View" in the API of `String`. A possible alternative name is `StorageSpan`, which would mark it as a relative of C++'s `std::span`. 
@@ -610,17 +610,17 @@ The ideas in this proposal previously used the name `BufferView`. While the use This proposal defines a `StorageViewIterator` that is borrowed and non-escapable. This is not compatible with `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: ```swift -borrowing view: StorageView = ... +borrowing view: Span = ... for borrowing element in view { doSomething(element) } ``` -In the meantime, it is possible to loop through a `StorageView`'s elements by direct indexing: +In the meantime, it is possible to loop through a `Span`'s elements by direct indexing: ```swift func doSomething(_ e: borrowing Element) { ... } -let view: StorageView = ... +let view: Span = ... // either: var i = view.startIndex while i < view.endIndex { @@ -639,8 +639,8 @@ Non-copyable and non-escapable containers would benefit from a `Collection`-like ##### Sharing piecewise-contiguous memory Some types store their internal representation in a piecewise-contiguous manner, such as trees and ropes. Some operations naturally return information in a piecewise-contiguous manner, such as network operations. These could supply results by iterating through a list of contiguous chunks of memory. -##### Delegating mutations of memory with `MutableStorageView` -Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. A `MutableStorageView` should provide a better, safer alternative. +##### Delegating mutations of memory with `MutableSpan` +Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. A `MutableSpan` should provide a better, safer alternative. 
##### Delegating initialization of memory with `OutputBuffer` Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. In the standard library we have `Array.init(unsafeUninitializedCapacity:initializingWith:)` and `String.init(unsafeUninitializedCapacity:initializingUTF8With:)`. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. @@ -649,15 +649,15 @@ Some data structures can delegate initialization of their initial memory represe The example in the [motivation](#motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. -Even if `StorageView` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may still want to add an untyped-element equivalent of `Array` at a later time. +Even if `Span` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may still want to add an untyped-element equivalent of `Array` at a later time. ##### Syntactic Sugar for Automatic Conversions -In the context of a resilient library, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead. 
As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `StorageView`. If `StorageView` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. +In the context of a resilient library, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead. As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `Span`. If `Span` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. -We could provide an automatic way to use a `ContiguousStorage`-conforming type with a function that takes a `StorageView` of the appropriate element type: +We could provide an automatic way to use a `ContiguousStorage`-conforming type with a function that takes a `Span` of the appropriate element type: ```swift -func myStrnlen(_ b: StorageView) -> Int { +func myStrnlen(_ b: Span) -> Int { guard let i = b.firstIndex(of: 0) else { return b.count } return b.distance(from: b.startIndex, to: i) } @@ -667,10 +667,10 @@ myStrnlen(data) // 8 myStrnlen(array) // 8 ``` -This would probably consist of a new type of custom conversion in the language. A type author would provide a way to convert from their type to an owned `StorageView`, and the compiler would insert that conversion where needed. This would enhance readability and reduce boilerplate. +This would probably consist of a new type of custom conversion in the language. A type author would provide a way to convert from their type to an owned `Span`, and the compiler would insert that conversion where needed. This would enhance readability and reduce boilerplate. 
##### Interoperability with C++'s `std::span` and with LLVM's `-fbounds-safety` -The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are an opportunity for better, safer interoperation with a type such as `StorageView`. +The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are an opportunity for better, safer interoperation with a type such as `Span`. ## Acknowledgments From 7b2bb1fc54f14f347608c57a7232f7d3eb6330ba Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 19 Apr 2024 15:41:16 -0700 Subject: [PATCH 15/73] various updates --- .../nnnn-safe-shared-contiguous-storage.md | 356 ++++++++++++------ 1 file changed, 241 insertions(+), 115 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 767ee29113..ce53440442 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -14,17 +14,17 @@ We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without giving up on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. -In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a struct being used by the caller. 
A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, with no compromise to memory safety. +In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, with no compromise to memory safety. -This proposal is related to two other features being proposed along with it: [non-escapable type constraint]() (`~Escapable`) and [compile-time lifetime dependency annotations](https://github.com/tbkka/swift-evolution/blob/tbkka-lifetime-dependency/proposals/NNNN-lifetime-dependency.md). This proposal also supersedes [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. Additionally, we refer to proposals for [`BitwiseCopyable`](https://forums.swift.org/t/69943) and [Non-copyable Generics](https://forums.swift.org/t/68180). +This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/apple/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/apple/swift-evolution/pull/2305). This proposal also supersedes the rejected proposal [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). 
The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. Additionally, we refer to the proposals for [`BitwiseCopyable`](https://github.com/apple/swift-evolution/blob/main/proposals/0426-bitwise-copyable.md) and [Non-copyable Generics](https://github.com/apple/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md). ## Motivation -Consider for example a program using multiple libraries, including [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply it in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()` and use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. +Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. 
This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()` and use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. ## Proposed solution -`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of a span of contiguous memory. A view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed during the lifetime of the view. `Span`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. +`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. A view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed during the lifetime of the view. `Span`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. By relying on borrowing, `Span` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. 
Note that `Span` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) @@ -36,6 +36,8 @@ extension HypotheticalBase64Decoder { } ``` +We will also provide `RawSpan`, which offers operations over contiguous bytes, for use in decoders and the like. The advantage of `RawSpan` is that it is a concrete type, without a generic parameter. + ## Detailed design `Span` is a simple representation of a span of initialized memory. @@ -52,29 +54,28 @@ It provides a collection-like interface to the elements stored in that span of m ```swift extension Span { - public struct Index: Copyable, Escapable, Strideable { /* .... */ } + public struct Index: Copyable, Escapable, Strideable { /* .... */ } public struct Iterator: Copyable, ~Escapable { // Should conform to a `BorrowingIterator` protocol // that will be defined at a later date. } - + public typealias SubSequence = Self - public var startIndex: Index { _read } - public var endIndex: Index { _read } + public var startIndex: Index { get } + public var endIndex: Index { get } public var count: Int { get } + public var isEmpty: Bool { get } public func makeIterator() -> dependsOn(self) Span.Iterator - public var isEmpty: Bool { get } - // index-based subscripts - subscript(_ position: Index) -> dependsOn(self) Element { _read } - subscript(_ bounds: Range) -> dependsOn(self) Span { _read } + subscript(_ position: Index) -> dependsOn(self) Element { get } + subscript(_ bounds: Range) -> dependsOn(self) Span { get } // integer-offset subscripts - subscript(offset: Int) -> dependsOn(self) Element { _read } - subscript(offsets: Range) -> dependsOn(self) Span { _read } + subscript(offset: Int) -> dependsOn(self) Element { get } + subscript(offsets: Range) -> dependsOn(self) Span { get } } extension Span.Iterator where Element: Copyable, Escapable { @@ -91,7 +92,7 @@ A type can declare that it 
can provide access to contiguous storage by conformin public protocol ContiguousStorage: ~Copyable, ~Escapable { associatedtype Element: ~Copyable & ~Escapable - var storage: Span { _read } + var storage: Span { get } } ``` @@ -113,67 +114,67 @@ extension MyResilientType { Here, the public function obtains the `Span` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `Span`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions).) - +In addition to `Span`, we propose the addition of `RawSpan`. `RawSpan` is similar to `Span`, but represents initialized bytes. Its API supports slicing, along with the operations `load(as:)` and `loadUnaligned(as:)`. `RawSpan` is a specialized type supporting parsing and decoding applications in particular, where heavily-used code paths require concrete types as much as possible. 
#### Extensions to Standard Library and Foundation types ```swift extension Array: ContiguousStorage { - var span: Span { _read } + var storage: Span { get } } extension ArraySlice: ContiguousStorage { - var span: Span { _read } + var storage: Span { get } } extension ContiguousArray: ContiguousStorage { - var span: Span { _read } + var storage: Span { get } } extension Foundation.Data: ContiguousStorage { - var span: Span { _read } + var storage: Span { get } } extension String.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the original `String`'s storage object - var span: Span { _read } + var storage: Span { get } } extension Substring.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the original `Substring`'s storage object - var span: Span { _read } + var storage: Span { get } } extension Character.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the original `Character`'s storage object - var span: Span { _read } + var storage: Span { get } } extension SIMD: ContiguousStorage { - var span: Span { _read } + var storage: Span { get } } extension KeyValuePairs: ContiguousStorage<(Self.Key, Self.Value)> { - var span: Span<(Self.Key, Self.Value)> { _read } + var storage: Span<(Self.Key, Self.Value)> { get } } extension CollectionOfOne: ContiguousStorage { - var span: dependsOn(scope) Span { _read } + var storage: Span { get } } extension Slice: ContiguousStorage where Base: ContiguousStorage { - var span: Span { _read } + var storage: Span { get } } extension UnsafeBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var span: dependsOn(scope) Span { _read } + var storage: Span { @_unsafeNonescapableResult get } } extension UnsafeMutableBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var span: dependsOn(scope) Span { _read } + var 
storage: Span { @_unsafeNonescapableResult get } } extension UnsafeRawBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var span: dependsOn(scope) Span { _read } + var storage: Span { @_unsafeNonescapableResult get } } extension UnsafeMutableRawBufferPointer: ContiguousStorage { // note: this applies additional preconditions to `self` for the duration of the borrow - var span: dependsOn(scope) Span { _read } + var storage: Span { @_unsafeNonescapableResult get } } ``` @@ -214,22 +215,22 @@ extension Span { /// Unsafely create a `Span` over initialized memory. /// - /// The span of memory must be owned by the instance `owner`, meaning that - /// as long as `owner` is alive, then the memory will remain valid. + /// The memory in `buffer` must be owned by the instance `owner`, + /// meaning that as long as `owner` is alive the memory will remain valid. /// /// - Parameters: /// - buffer: an `UnsafeBufferPointer` to initialized elements. /// - owner: a binding whose lifetime must exceed that of /// the returned `Span`. - public init( + public init?( unsafeBufferPointer buffer: UnsafeBufferPointer, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) -> dependsOn(owner) Self? /// Unsafely create a `Span` over initialized memory. /// - /// The span of memory representing `count` instances starting at - /// `pointer` must be owned by the instance `owner`, meaning that - /// as long as `owner` is alive, then the memory will remain valid. + /// The memory representing `count` instances starting at + /// `pointer` must be owned by the instance `owner`, + /// meaning that as long as `owner` is alive the memory will remain valid. /// /// - Parameters: /// - pointer: a pointer to the first initialized element. @@ -237,17 +238,16 @@ extension Span { /// - owner: a binding whose lifetime must exceed that of /// the returned `Span`. 
public init( - unsafePointer: UnsafePointer, count: Int, owner: borrowing Owner + unsafePointer pointer: UnsafePointer, count: Int, owner: borrowing Owner ) -> dependsOn(owner) Self } extension Span where Element: BitwiseCopyable { - /// Unsafely create a `Span` over a span of initialized memory. + /// Unsafely create a `Span` over initialized memory. /// - /// The memory in `unsafeBytes` must be owned by the instance - /// `owner`, meaning that as long as `owner` is alive, then the - /// memory will remain valid. + /// The memory in `unsafeBytes` must be owned by the instance `owner` + /// meaning that as long as `owner` is alive the memory will remain valid. /// /// `unsafeBytes` must be correctly aligned for accessing /// an element of type `Element`, and must contain a number of bytes @@ -259,15 +259,16 @@ extension Span where Element: BitwiseCopyable { /// - owner: a binding whose lifetime must exceed that of /// the returned `Span`. public init( - unsafeBytes: UnsafeRawBufferPointer, as type: Element.Type, owner: borrowing Owner + unsafeBytes buffer: UnsafeRawBufferPointer, + as type: Element.Type, + owner: borrowing Owner ) -> dependsOn(owner) Self - + /// Unsafely create a `Span` over a span of initialized memory. /// /// The memory representing `count` instances starting at /// `unsafeRawPointer` must be owned by the instance `owner`, - /// meaning that as long as `owner` is alive, then the memory - /// will remain valid. + /// meaning that as long as `owner` is alive the memory will remain valid. /// /// `unsafeRawPointer` must be correctly aligned for accessing /// an element of type `Element`. @@ -279,7 +280,10 @@ extension Span where Element: BitwiseCopyable { /// - owner: a binding whose lifetime must exceed that of /// the returned `Span`. 
public init( - unsafeRawPointer: UnsafeRawPointer, as type: Element.Type, count: Int, owner: borrowing Owner + unsafeRawPointer pointer: UnsafeRawPointer, + as type: Element.Type, + count: Int, + owner: borrowing Owner ) -> dependsOn(owner) Self } ``` @@ -293,11 +297,14 @@ extension Span { public typealias Index = Span.Index public typealias SubSequence = Self - public var startIndex: Index { _read } - public var endIndex: Index { _read } + public func makeIterator() -> dependsOn(self) Span.Iterator + + public var startIndex: Index { get } + public var endIndex: Index { get } public var count: Int { get } + public var isEmpty: Bool { get } - public func makeIterator() -> dependsOn(self) Span.Iterator + public var indices: Range { get } // indexing operations public func index(after i: Index) -> Index @@ -310,7 +317,7 @@ extension Span { public func formIndex(after i: inout Index) public func formIndex(before i: inout Index) public func formIndex(_ i: inout Index, offsetBy distance: Int) - public func formIndex( + public func formIndex( _ i: inout Index, offsetBy distance: Int, limitedBy limit: Index ) -> Bool @@ -319,20 +326,20 @@ extension Span { // subscripts public subscript( _ position: Index - ) -> dependsOn(self) Element { _read } + ) -> dependsOn(self) Element { get } public subscript( _ bounds: Range - ) -> dependsOn(self) Span { _read } + ) -> dependsOn(self) Span { get } public subscript( _ bounds: some RangeExpression - ) -> dependsOn(self) Span { _read } + ) -> dependsOn(self) Span { get } public subscript( x: UnboundedRange - ) -> copy Span { _read } - + ) -> dependsOn(self) Span { get } + // utility properties - public var first Element? { _read } - public var last Element? { _read } + public var first: Element? { get } + public var last: Element?
{ get } // one-sided slicing operations public func prefix(upTo: Index) -> dependsOn(self) Span @@ -350,9 +357,15 @@ extension Span { ```swift extension Span { /// Traps if `position` is not a valid index for this `Span` + /// + /// - Parameters: + /// - position: an Index to validate public boundsCheckPrecondition(_ position: Index) /// Traps if `bounds` is not a valid range of indices for this `Span` + /// + /// - Parameters: + /// - position: a range of indices to validate public boundsCheckPrecondition(_ bounds: Range) // Integer-offset subscripts @@ -363,7 +376,7 @@ extension Span { /// must be greater or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(offset: Int) -> dependsOn(self) Element { _read } + public subscript(offset: Int) -> dependsOn(self) Element { get } /// Accesses the contiguous subrange of elements at the specified /// range of offsets in this `Span`. @@ -372,7 +385,7 @@ extension Span { /// must be greater or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(offsets: Range) -> dependsOn(self) Span { _read } + public subscript(offsets: Range) -> dependsOn(self) Span { get } /// Accesses the contiguous subrange of elements at the specified /// range of offsets in this `Span`. @@ -383,7 +396,7 @@ extension Span { /// - Complexity: O(1) public subscript( offsets: some RangeExpression - ) -> dependsOn(self) Span { _read } + ) -> dependsOn(self) Span { get } // Unchecked subscripts @@ -395,7 +408,7 @@ extension Span { /// must be a valid index that is not equal to the `endIndex` property. 
/// /// - Complexity: O(1) - public subscript(unchecked position: Index) -> dependsOn(self) Element { _read } + public subscript(unchecked position: Index) -> dependsOn(self) Element { get } /// Accesses a contiguous subrange of the elements represented by this `Span` /// @@ -407,7 +420,7 @@ extension Span { /// - Complexity: O(1) public subscript( uncheckedBounds bounds: Range - ) -> dependsOn(self) Span { _read } + ) -> dependsOn(self) Span { get } /// Accesses the contiguous subrange of the elements represented by this `Span`, /// specified by a range expression. @@ -432,7 +445,7 @@ extension Span { /// must be greater or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(uncheckedOffset offset: Int) -> dependsOn(self) Element { _read } + public subscript(uncheckedOffset offset: Int) -> dependsOn(self) Element { get } /// Accesses the contiguous subrange of elements at the specified /// range of offsets in this `Span`. @@ -445,15 +458,167 @@ extension Span { /// - Complexity: O(1) public subscript( uncheckedOffsets offsets: Range - ) -> dependsOn(self) Span { _read } + ) -> dependsOn(self) Span { get } } ``` -`Span` gains additional functions when its `Element` is `BitwiseCopyable`: +##### Interoperability with unsafe code: + +We provide two functions for interoperability with C or other legacy pointer-taking functions. ```swift +extension Span { + /// Calls a closure with a pointer to the viewed contiguous storage. + /// + /// The buffer pointer passed as an argument to `body` is valid only + /// during the execution of `withUnsafeBufferPointer(_:)`. + /// Do not store or return the pointer for later use. + /// + /// - Parameter body: A closure with an `UnsafeBufferPointer` parameter + /// that points to the viewed contiguous storage. If `body` has + /// a return value, that value is also used as the return value + /// for the `withUnsafeBufferPointer(_:)` method. 
The closure's + /// parameter is valid only for the duration of its execution. + /// - Returns: The return value of the `body` closure parameter. + func withUnsafeBufferPointer( + _ body: (_ buffer: UnsafeBufferPointer) -> Result + ) -> Result +} + extension Span where Element: BitwiseCopyable { - // We may not need to require T: BitwiseCopyable for the aligned load operations + /// Calls the given closure with a pointer to the underlying bytes of + /// the viewed contiguous storage. + /// + /// The buffer pointer passed as an argument to `body` is valid only + /// during the execution of `withUnsafeBytes(_:)`. + /// Do not store or return the pointer for later use. + /// + /// - Parameter body: A closure with an `UnsafeRawBufferPointer` + /// parameter that points to the viewed contiguous storage. + /// If `body` has a return value, that value is also + /// used as the return value for the `withUnsafeBytes(_:)` method. + /// The closure's parameter is valid only for the duration of + /// its execution. + /// - Returns: The return value of the `body` closure parameter. + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) -> Result + ) -> Result +} +``` + +#### Complete `RawSpan` API: + +```swift +public struct RawSpan: Copyable, ~Escapable { + internal var _start: RawSpan.Index + internal var _count: Int +} +``` + +##### Initializing a `RawSpan`: + +```swift +extension RawSpan { + /// Unsafely create a `RawSpan` over initialized memory. + /// + /// The memory in `buffer` must be owned by the instance `owner`, + /// meaning that as long as `owner` is alive the memory will remain valid. + /// + /// - Parameters: + /// - buffer: an `UnsafeRawBufferPointer` to initialized memory. + /// - owner: a binding whose lifetime must exceed that of + /// the returned `RawSpan`. + public init?( + unsafeBufferPointer buffer: UnsafeBufferPointer, + owner: borrowing Owner + ) -> dependsOn(owner) Self? + + /// Unsafely create a `RawSpan` over initialized memory. 
+ /// + /// The memory over `count` bytes starting at + /// `pointer` must be owned by the instance `owner`, + /// meaning that as long as `owner` is alive the memory will remain valid. + /// + /// - Parameters: + /// - pointer: a pointer to the first initialized element. + /// - count: the number of initialized elements in the view. + /// - owner: a binding whose lifetime must exceed that of + /// the returned `Span`. + public init( + unsafeRawPointer pointer: UnsafeRawPointer, + count: Int, + owner: borrowing Owner + ) -> dependsOn(owner) Self + + /// Create a `RawSpan` over the memory represented by a `Span` + /// + /// - Parameters: + /// - span: An existing `Span`, which will define both this + /// `RawSpan`'s lifetime and the memory it represents. + @inlinable @inline(__always) + public init( + _ span: borrowing Span + ) -> dependsOn(span) Self +} +``` + +##### Indexing Operations: + +`RawSpan` has `Collection`-like indexing operations: + +```swift +extension RawSpan { + public typealias Index = Span.Index + public typealias SubSequence = Self + + public func makeIterator() -> dependsOn(self) Span.Iterator + + public var startIndex: Index { get } + public var endIndex: Index { get } + public var count: Int { get } + public var isEmpty: Bool { get } + + public var indices: Range { get } + + // indexing operations + public func index(after i: Index) -> Index + public func index(before i: Index) -> Index + public func index(_ i: Index, offsetBy distance: Int) -> Index + public func index( + _ i: Index, offsetBy distance: Int, limitedBy limit: Index + ) -> Index? 
+ + public func formIndex(after i: inout Index) + public func formIndex(before i: inout Index) + public func formIndex(_ i: inout Index, offsetBy distance: Int) + public func formIndex( + _ i: inout Index, offsetBy distance: Int, limitedBy limit: Index + ) -> Bool + + public func distance(from start: Index, to end: Index) -> Int +} +``` + +```swift +extension RawSpan { + /// Traps if `position` is not a valid index for this `Span` + /// + /// - Parameters: + /// - position: an Index to validate + public boundsCheckPrecondition(_ position: Index) + + /// Traps if `bounds` is not a valid range of indices for this `Span` + /// + /// - Parameters: + /// - position: a range of indices to validate + public boundsCheckPrecondition(_ bounds: Range) +} +``` + +`RawSpan` has the following functions for loading arbitrary types from the memory it represents: + +```swift +extension RawSpan { /// Returns a new instance of the given type, constructed from the raw memory /// at the specified byte offset. @@ -521,7 +686,7 @@ extension Span where Element: BitwiseCopyable { public func loadUnaligned( from index: Index, as: T.Type ) -> T - + /// View the memory span represented by this view as a different type /// /// The memory must be laid out identically to the in-memory representation of `T`. @@ -533,50 +698,7 @@ extension Span where Element: BitwiseCopyable { } ``` -##### Interoperability with unsafe code: - -We provide two functions for interoperability with C or other legacy pointer-taking functions. - -```swift -extension Span { - /// Calls a closure with a pointer to the viewed contiguous storage. - /// - /// The buffer pointer passed as an argument to `body` is valid only - /// during the execution of `withUnsafeBufferPointer(_:)`. - /// Do not store or return the pointer for later use. - /// - /// - Parameter body: A closure with an `UnsafeBufferPointer` parameter - /// that points to the viewed contiguous storage. 
If `body` has - /// a return value, that value is also used as the return value - /// for the `withUnsafeBufferPointer(_:)` method. The closure's - /// parameter is valid only for the duration of its execution. - /// - Returns: The return value of the `body` closure parameter. - func withUnsafeBufferPointer( - _ body: (_ buffer: UnsafeBufferPointer) -> Result - ) -> Result -} - -extension Span where Element: BitwiseCopyable { - /// Calls the given closure with a pointer to the underlying bytes of - /// the viewed contiguous storage. - /// - /// The buffer pointer passed as an argument to `body` is valid only - /// during the execution of `withUnsafeBytes(_:)`. - /// Do not store or return the pointer for later use. - /// - /// - Parameter body: A closure with an `UnsafeRawBufferPointer` - /// parameter that points to the viewed contiguous storage. - /// If `body` has a return value, that value is also - /// used as the return value for the `withUnsafeBytes(_:)` method. - /// The closure's parameter is valid only for the duration of - /// its execution. - /// - Returns: The return value of the `body` closure parameter. - func withUnsafeBytes( - _ body: (_ buffer: UnsafeRawBufferPointer) -> Result - ) -> Result -} -``` - +##### ## Source compatibility @@ -604,10 +726,14 @@ Eventually we want a similar usage pattern for a `MutableSpan` as we are proposi ##### Naming The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and the term "View" in the API of `String`. A possible alternative name is `StorageSpan`, which mark it as a relative of C++'s `std::span`. 
+##### Adding `load` and `loadUnaligned` on `Span` instead of making `RawSpan` + +TKTKTK + ## Future directions ##### Defining `BorrowingIterator` with support in `for` loops -This proposal defines a `StorageViewIterator` that is borrowed and non-escapable. This is not compatible with `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: +This proposal defines a `Span.Iterator` that is borrowed and non-escapable. This is not compatible with `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: ```swift borrowing view: Span = ... @@ -642,7 +768,7 @@ Some types store their internal representation in a piecewise-contiguous manner, ##### Delegating mutations of memory with `MutableSpan` Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. A `MutableSpan` should provide a better, safer alternative. -##### Delegating initialization of memory with `OutputBuffer` +##### Delegating initialization of memory with `OutputSpan` Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. In the standard library we have `Array.init(unsafeUninitializedCapacity:initializingWith:)` and `String.init(unsafeUninitializedCapacity:initializingUTF8With:)`. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. 
##### Resizable, contiguously-stored, untyped collection in the standard library From f8760434d0b172cd32806a11aa7884f4b2bf91f3 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 21 Apr 2024 14:02:49 -0700 Subject: [PATCH 16/73] add more RawSpan API, doc-comment fixes --- .../nnnn-safe-shared-contiguous-storage.md | 158 ++++++++++++------ 1 file changed, 107 insertions(+), 51 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index ce53440442..8577e1ec16 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -43,8 +43,7 @@ We will also provide a `RawSpan` in order to provide operations over contiguous `Span` is a simple representation of a span of initialized memory. ```swift -public struct Span -: ~Escapable, Copyable { +public struct Span: Copyable, ~Escapable { internal var _start: Span.Index internal var _count: Int } @@ -54,21 +53,16 @@ It provides a collection-like interface to the elements stored in that span of m ```swift extension Span { - public struct Index: Copyable, Escapable, Strideable { /* .... */ } - public struct Iterator: Copyable, ~Escapable { - // Should conform to a `BorrowingIterator` protocol - // that will be defined at a later date. - } - public typealias SubSequence: Self + public struct Index: Copyable, Escapable, Strideable { /* ... 
*/ } public var startIndex: Index { get } public var endIndex: Index { get } + public var indices: Range { get } + public var count: Int { get } public var isEmpty: Bool { get } - public func makeIterator() -> dependsOn(self) Span.Iterator - // index-based subscripts subscript(_ position: Index) -> dependsOn(self) Element { get } subscript(_ bounds: Range) -> dependsOn(self) Span { get } @@ -77,15 +71,18 @@ extension Span { subscript(offset: Int) -> dependsOn(self) Element { get } subscript(offsets: Range) -> dependsOn(self) Span { get } } - -extension Span.Iterator where Element: Copyable, Escapable { - // Cannot conform to `IteratorProtocol` because `Self: ~Escapable` - public mutating func next() -> Element? -} ``` Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, the slice _must_ have the same lifetime as the view from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. +As a side-effect of not conforming to `Collection` or `Sequence`, `Span` is not directly supported by `for` loops at this time. It is, however, easy to use in a `for` loop via indexing: + +```swift +for i in mySpan.indices { + calculation(mySpan[i]) +} +``` + A type can declare that it can provide access to contiguous storage by conforming to the `ContiguousStorage` protocol: ```swift @@ -98,7 +95,7 @@ public protocol ContiguousStorage: ~Copyable, ~Escapable { The key safety feature is that a `Span` cannot escape to a scope where the value it borrowed no longer exists. 
-An API that wishes to read from contiguous storage can declare a parameter type of `some ContiguousStorage`. The implementation will internally consist of a brief generic section, followed by business logic implemented in terms of a concrete `Span`. Frameworks that support library evolution (resilient frameworks) have an additional concern. Resilient frameworks have an ABI boundary that may differ from the API proper. Resilient frameworks may wish to adopt a pattern such as the following: +A function that wishes to read from contiguous storage can declare a parameter type of `some ContiguousStorage`. The implementation will internally consist of a brief generic section, followed by business logic implemented in terms of a concrete `Span`. Frameworks that support library evolution (resilient frameworks) have an additional concern. Resilient frameworks have an ABI boundary that may differ from the API proper. Resilient frameworks may wish to adopt a pattern such as the following: ```swift extension MyResilientType { @@ -120,9 +117,11 @@ In addition to `Span`, we propose the addition of `RawSpan`. 
`RawSpan` is sim ```swift extension Array: ContiguousStorage { + // note: this could borrow a temporary copy of the `Array`'s storage var storage: Span { get } } extension ArraySlice: ContiguousStorage { + // note: this could borrow a temporary copy of the `ArraySlice`'s storage var storage: Span { get } } extension ContiguousArray: ContiguousStorage { @@ -134,15 +133,15 @@ extension Foundation.Data: ContiguousStorage { } extension String.UTF8View: ContiguousStorage { - // note: this could borrow a temporary copy of the original `String`'s storage object + // note: this could borrow a temporary copy of the `String`'s storage var storage: Span { get } } extension Substring.UTF8View: ContiguousStorage { - // note: this could borrow a temporary copy of the original `Substring`'s storage object + // note: this could borrow a temporary copy of the `Substring`'s storage var storage: Span { get } } extension Character.UTF8View: ContiguousStorage { - // note: this could borrow a temporary copy of the original `Character`'s storage object + // note: this could borrow a temporary copy of the `Character`'s storage var storage: Span { get } } @@ -161,19 +160,19 @@ extension Slice: ContiguousStorage where Base: ContiguousStorage { } extension UnsafeBufferPointer: ContiguousStorage { - // note: this applies additional preconditions to `self` for the duration of the borrow + // note: additional preconditions apply until the end of the scope var storage: Span { @_unsafeNonescapableResult get } } extension UnsafeMutableBufferPointer: ContiguousStorage { - // note: this applies additional preconditions to `self` for the duration of the borrow + // note: additional preconditions apply until the end of the scope var storage: Span { @_unsafeNonescapableResult get } } extension UnsafeRawBufferPointer: ContiguousStorage { - // note: this applies additional preconditions to `self` for the duration of the borrow + // note: additional preconditions apply until the end of the scope var 
storage: Span { @_unsafeNonescapableResult get } } extension UnsafeMutableRawBufferPointer: ContiguousStorage { - // note: this applies additional preconditions to `self` for the duration of the borrow + // note: additional preconditions apply until the end of the scope var storage: Span { @_unsafeNonescapableResult get } } ``` @@ -199,8 +198,7 @@ extension Span where Element: BitwiseCopyable { #### Complete `Span` API: ```swift -public struct Span -: Copyable, ~Escapable { +public struct Span: Copyable, ~Escapable { internal var _start: Span.Index internal var _count: Int } @@ -223,7 +221,8 @@ extension Span { /// - owner: a binding whose lifetime must exceed that of /// the returned `Span`. public init?( - unsafeBufferPointer buffer: UnsafeBufferPointer, owner: borrowing Owner + unsafeBufferPointer buffer: UnsafeBufferPointer, + owner: borrowing Owner ) -> dependsOn(owner) Self? /// Unsafely create a `Span` over initialized memory. @@ -238,7 +237,9 @@ extension Span { /// - owner: a binding whose lifetime must exceed that of /// the returned `Span`. public init( - unsafePointer pointer: UnsafePointer, count: Int, owner: borrowing Owner + unsafePointer pointer: UnsafePointer, + count: Int, + owner: borrowing Owner ) -> dependsOn(owner) Self } @@ -356,18 +357,6 @@ extension Span { ```swift extension Span { - /// Traps if `position` is not a valid index for this `Span` - /// - /// - Parameters: - /// - position: an Index to validate - public boundsCheckPrecondition(_ position: Index) - - /// Traps if `bounds` is not a valid range of indices for this `Span` - /// - /// - Parameters: - /// - position: a range of indices to validate - public boundsCheckPrecondition(_ bounds: Range) - // Integer-offset subscripts /// Accesses the element at the specified offset in the `Span`. 
@@ -422,8 +411,8 @@ extension Span { uncheckedBounds bounds: Range ) -> dependsOn(self) Span { get } - /// Accesses the contiguous subrange of the elements represented by this `Span`, - /// specified by a range expression. + /// Accesses the contiguous subrange of the elements represented by + /// this `Span`, specified by a range expression. /// /// This subscript does not validate `bounds`; this is an unsafe operation. /// @@ -445,7 +434,9 @@ extension Span { /// must be greater or equal to zero, and less than the `count` property. /// /// - Complexity: O(1) - public subscript(uncheckedOffset offset: Int) -> dependsOn(self) Element { get } + public subscript( + uncheckedOffset offset: Int + ) -> dependsOn(self) Element { get } /// Accesses the contiguous subrange of elements at the specified /// range of offsets in this `Span`. @@ -462,6 +453,38 @@ extension Span { } ``` +##### Index validation utilities: + +Every time `Span` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked". 
The validation is performed with these functions: + +```swift +extension Span { + /// Traps if `position` is not a valid index for this `Span` + /// + /// - Parameters: + /// - position: an Index to validate + public boundsCheckPrecondition(_ position: Index) + + /// Traps if `bounds` is not a valid range of indices for this `Span` + /// + /// - Parameters: + /// - bounds: a range of indices to validate + public boundsCheckPrecondition(_ bounds: Range) + + /// Traps if `offset` is not a valid offset into this `Span` + /// + /// - Parameters: + /// - offset: an offset to validate + public boundsCheckPrecondition(offset: Int) + + /// Traps if `offsets` is not a valid range of offsets into this `Span` + /// + /// - Parameters: + /// - offsets: a range of offsets to validate + public boundsCheckPrecondition(offsets: Range) +} +``` + ##### Interoperability with unsafe code: We provide two functions for interoperability with C or other legacy pointer-taking functions. @@ -564,15 +587,13 @@ extension RawSpan { ##### Indexing Operations: -`RawSpan` has `Collection`-like indexing operations: +`RawSpan` has these `Collection`-like indexing operations: ```swift extension RawSpan { public typealias Index = Span.Index public typealias SubSequence = Self - public func makeIterator() -> dependsOn(self) Span.Iterator - public var startIndex: Index { get } public var endIndex: Index { get } public var count: Int { get } @@ -599,19 +620,56 @@ extension RawSpan { } ``` +##### Index validation utilities: + +Every time `RawSpan` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked".
The validation is performed with these functions: + ```swift extension RawSpan { - /// Traps if `position` is not a valid index for this `Span` + /// Traps if `position` is not a valid index for this `RawSpan` /// /// - Parameters: /// - position: an Index to validate public boundsCheckPrecondition(_ position: Index) - /// Traps if `bounds` is not a valid range of indices for this `Span` + /// Traps if `bounds` is not a valid range of indices for this `RawSpan` /// /// - Parameters: - /// - position: a range of indices to validate + /// - bounds: a range of indices to validate public boundsCheckPrecondition(_ bounds: Range) + + /// Traps if `offset` is not a valid offset into this `RawSpan` + /// + /// - Parameters: + /// - offset: an offset to validate + public boundsCheckPrecondition(offset: Int) + + /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` + /// + /// - Parameters: + /// - offsets: a range of offsets to validate + public boundsCheckPrecondition(offsets: Range) +} +``` + +##### Slicing of `RawSpan` instances: + +`RawSpan` has `Collection`-like slicing operations. 
Like `Span`, it also has unchecked slicing operations and can be sliced using integer offsets: + +```swift +extension RawSpan { + public subscript(bounds: Range) -> dependsOn(self) Self { get } + public subscript(unchecked bounds: Range) -> dependsOn(self) Self { get } + + public subscript(bounds: some RangeExpression) -> dependsOn(self) Self { get } + public subscript(unchecked bounds: some RangeExpression) -> dependsOn(self) Self { get } + public subscript(x: UnboundedRange) -> dependsOn(self) Self { get } + + public subscript(offsets: Range) -> dependsOn(self) Self { get } + public subscript(uncheckedOffsets offsets: Range) -> dependsOn(self) Self { get } + + public subscript(offsets: some RangeExpression) -> dependsOn(self) Self { get } + public subscript(uncheckedOffsets offsets: some RangeExpression) -> dependsOn(self) Self { get } } ``` @@ -698,8 +756,6 @@ extension RawSpan { } ``` -##### - ## Source compatibility This proposal is additive and source-compatible with existing code. @@ -726,7 +782,7 @@ Eventually we want a similar usage pattern for a `MutableSpan` as we are proposi ##### Naming The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and the term "View" in the API of `String`. A possible alternative name is `StorageSpan`, which mark it as a relative of C++'s `std::span`. 
-##### Adding `load` and `loadUnaligned` on `Span` instead of making `RawSpan` +##### Adding `load` and `loadUnaligned` to `Span`on `Span` instead of adding `RawSpan TKTKTK From 844e661bf5c4cad0be09d213b22d5dd2e3fa74a8 Mon Sep 17 00:00:00 2001 From: Michael Ilseman Date: Mon, 22 Apr 2024 13:58:37 -0600 Subject: [PATCH 17/73] Added more prose, added TODOs for further clarification --- .../nnnn-safe-shared-contiguous-storage.md | 109 ++++++++++++++++-- 1 file changed, 97 insertions(+), 12 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 8577e1ec16..35dd7a581a 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -1,7 +1,7 @@ # Safe Access to Contiguous Storage * Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) -* Authors: [Guillaume Lessard](https://github.com/glessard), [Andrew Trick](https://github.com/atrick) +* Authors: [Guillaume Lessard](https://github.com/glessard), [Andrew Trick](https://github.com/atrick), [Michael Ilseman](https://github.com/milseman) * Review Manager: TBD * Status: **Awaiting implementation** * Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) @@ -20,7 +20,16 @@ This proposal is related to two other features being proposed along with it: [No ## Motivation -Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. 
This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()` and use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. +Swift needs safe and performant types for local processing over values in contiguous memory. Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()` and use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. + +The `UnsafeBufferPointer` passed to a `withUnsafeXXX` closure-style API, while performant, is unsafe in multiple ways: + +1. The pointer itself is unsafe and unmanaged +2. `subscript` is only bounds-checked in debug builds of client code +3. It might escape the duration of the closure + +Even if the body of the `withUnsafeXXX` call does not escape the pointer, other functions called inside the closure have to be written in terms of unsafe pointers. This requires programmer vigilance across a project and pollutes code that otherwise could be written in terms of safe constructs. 
+ ## Proposed solution @@ -28,7 +37,13 @@ Consider for example a program using multiple libraries, including one for [base By relying on borrowing, `Span` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) -A type can indicate that it can provide a `Span` by conforming to the `ContiguousStorage` protocol. For example, for the hypothetical base64 decoding library mentioned above, a possible API could be: +`Span` is the currency type for local processing over values in contiguous memory. It is the replacement for any API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that does not need to escape the value. + +### `ContiguousStorage` + +A type can indicate that it can provide a `Span` by conforming to the `ContiguousStorage` protocol. `ContiguousStorage` forms a bridge between multi-type or generically-typed interfaces and a performant concrete implementation. + +For example, for the hypothetical base64 decoding library mentioned above, a possible API could be: ```swift extension HypotheticalBase64Decoder { @@ -36,7 +51,16 @@ extension HypotheticalBase64Decoder { } ``` -We will also provide a `RawSpan` in order to provide operations over contiguous bytes, for use in decoders and the like. The advantage of `RawSpan` is to be a concrete type, without a generic parameter. +**TODO**: But, we don't want to encourage this use. We want to encourage one concrete function taking a `Span`. Advanced libraries might add an inlinable/alwaysEmitIntoClient generic-dispatch interface in addition to this. + +### `RawSpan` + +`RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously typed, such as in decoders.
Furthermore, it is a fully concrete type, without a generic parameter, which achieves better performance in debug builds of client code as well as a more straightforward understanding of performance for library code. + +All `Span`s have a backing `RawSpan`. + +**TODO**: Do we have a (parent) protocol for just raw span? Do we have API to get the raw span from a span? + ## Detailed design @@ -83,6 +107,8 @@ for i in mySpan.indices { } ``` +### `ContiguousStorage` + A type can declare that it can provide access to contiguous storage by conforming to the `ContiguousStorage` protocol: ```swift @@ -111,8 +137,6 @@ extension MyResilientType { Here, the public function obtains the `Span` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `Span`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions).) -In addition to `Span`, we propose the addition of `RawSpan`. `RawSpan` is similar to `Span`, but represents initialized bytes. Its API supports slicing, along with the operations `load(as:)` and `loadUnaligned(as:)`. `RawSpan` is a specialized type supporting parsing and decoding applications in particular, where heavily-used code paths require concrete types as much as possible. - #### Extensions to Standard Library and Foundation types ```swift @@ -177,6 +201,14 @@ extension UnsafeMutableRawBufferPointer: ContiguousStorage { } ``` +**TODO**: What is the `@_unsafeNonescapableResult` annotation? Would `Slice>` need it? + +**TODO**: Do we do a `Sequence.withSpanIfAvailable` API? + +**TODO**: What all can we deprecate with this proposal? + +**TODO**: Do these needs lifetime annotations on them? + #### Using `Span` with C functions or other unsafe code: `Span` has an unsafe hatch for use with unsafe code.
@@ -529,6 +561,14 @@ extension Span where Element: BitwiseCopyable { } ``` +**TODO**: `public var rawSpan: RawSpan` API, as well as a conformance to a raw span protocol if there is one. + +### RawSpan + +In addition to `Span`, we propose the addition of `RawSpan` which can represent heterogeneously typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents initialized untyped bytes. Its API supports slicing, along with the operations `load(as:)` and `loadUnaligned(as:)`. + +`RawSpan` is a specialized type supporting parsing and decoding applications in particular, as well as applications where heavily-used code paths require concrete types as much as possible. + #### Complete `RawSpan` API: ```swift @@ -620,9 +660,11 @@ extension RawSpan { } ``` +**TODO**: What does `typealias Index = Span.Index` mean? + ##### Index validation utiliities: -Every time `Span` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: +Every time `RawSpan` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: ```swift extension RawSpan { @@ -744,7 +786,13 @@ extension RawSpan { public func loadUnaligned( from index: Index, as: T.Type ) -> T +``` +**TODO**: What about unchecked variants? Those would/could be the bottom API called by data parsers which have already checked the bounds earlier (e.g. for error-throwing purposes). + +A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogeneously as instances of `T`. + +```swift /// View the memory span represented by this view as a different type /// /// The memory must be laid out identically to the in-memory representation of `T`.
@@ -809,24 +857,61 @@ while i < view.endIndex { doSomething(view[i]) view.index(after: &i) } + // ...or: -for o in 0..` -Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. A `MutableSpan` should provide a better, safer alternative. +##### Safe mutations of memory with `MutableSpan` + +Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. + +The `UnsafeMutableBufferPointer` passed to a `withUnsafeMutableXXX` closure-style API is unsafe in multiple ways: + +1. The pointer itself is unsafe and unmanaged +2. `subscript` is only bounds-checked in debug builds of client code +3. It might escape the duration of the closure +4. Exclusivity of writes is not enforced +5. Initialization of any particular memory address is not ensured + +I.e., it is unsafe in all the ways `UnsafeBufferPointer`-passing closure APIs are unsafe in addition to being unsafe in exclusivity and in initialization. + +Loading an uninitialized non-`BitwiseCopyable` value leads to undefined behavior. Loading an uninitialized `BitwiseCopyable` value does not immediately lead to undefined behavior, but it produces a garbage value which may lead to misbehavior of the program. + +A `MutableSpan` should provide a better, safer alternative to mutable memory in the same way that `Span` provides a better, safer read-only type. `MutableSpan` would also automatically enforce exclusivity of writes. + +However, it alone does not track initialization state of each address, and that will continue to be the responsibility of the developer. + ##### Delegating initialization of memory with `OutputSpan` + Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. 
In the standard library we have `Array.init(unsafeUninitializedCapacity:initializingWith:)` and `String.init(unsafeUninitializedCapacity:initializingUTF8With:)`. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. +`OutputSpan` would need run-time bookkeeping (e.g. a bitvector with a bit per-address) to track initialization state to safely support random access and random-order initialization. + +Alternatively, a divide-and-conquer style initialization order might be solvable via an API layer without run-time bookkeeping, but with more complex ergonomics. + + + ##### Resizable, contiguously-stored, untyped collection in the standard library The example in the [motivation](#motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. @@ -856,4 +941,4 @@ The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class templa ## Acknowledgments -Joe Groff, John McCall, Tim Kientzle, Michael Ilseman, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. +Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions.
From 15de4ab280ceec34886d968cf89071622f442d89 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 22 Apr 2024 13:34:20 -0700 Subject: [PATCH 18/73] Update proposals/nnnn-safe-shared-contiguous-storage.md --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 35dd7a581a..5e4d0c9885 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -871,7 +871,7 @@ while let elt = iter.next() { ``` -**TODO**: Karoy mentioned that be might not want to even take the name `Iterator` until more of the borrowed iterator design if figured out +**TODO**: Karoy mentioned that be might not want to even take the name `Iterator` until more of the borrowed iterator design is figured out ##### Collection-like protocols for non-copyable and non-escapable types From be180f15a8c15f0fdbbcc3157c16c8b97b804877 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 22 Apr 2024 17:06:28 -0700 Subject: [PATCH 19/73] remove some trailing whitespace from code blocks --- proposals/nnnn-safe-shared-contiguous-storage.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 5e4d0c9885..4e9a91182f 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -508,7 +508,7 @@ extension Span { /// - Parameters: /// - offset: an offset to validate public boundsCheckPrecondition(offset: Int) - + /// Traps if `offsets` is not a valid range of offsets into this `Span` /// /// - Parameters: @@ -679,13 +679,13 @@ extension RawSpan { /// - Parameters: /// - bounds: a range of indices to validate public boundsCheckPrecondition(_ bounds: Range) - + /// Traps if `offset` is not a valid offset 
into this `RawSpan` /// /// - Parameters: /// - offset: an offset to validate public boundsCheckPrecondition(offset: Int) - + /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` /// /// - Parameters: @@ -706,7 +706,7 @@ extension RawSpan { public subscript(bounds: some RangeExpression) -> dependsOn(self) Self { get } public subscript(unchecked bounds: some RangeExpression) -> dependsOn(self) Self { get } public subscript(x: UnboundedRange) -> dependsOn(self) Self { get } - + public subscript(offsets: Range) -> dependsOn(self) Self { get } public subscript(uncheckedOffsets offsets: Range) -> dependsOn(self) Self { get } From 26a7637633a860c0e281bb32ebaf4485b3756193 Mon Sep 17 00:00:00 2001 From: Michael Ilseman Date: Mon, 6 May 2024 11:40:39 -0600 Subject: [PATCH 20/73] Update --- .../nnnn-safe-shared-contiguous-storage.md | 337 +++++++++++++++++- 1 file changed, 328 insertions(+), 9 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 4e9a91182f..dc69d19fe4 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -143,6 +143,7 @@ Here, the public function obtains the `Span` from the type that vends it in inli extension Array: ContiguousStorage { // note: this could borrow a temporary copy of the `Array`'s storage var storage: Span { get } + // TODO: probably _read, so it can yield a ContigArray if needed } extension ArraySlice: ContiguousStorage { // note: this could borrow a temporary copy of the `ArraySlice`'s storage @@ -159,14 +160,17 @@ extension Foundation.Data: ContiguousStorage { extension String.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the `String`'s storage var storage: Span { get } + // TODO: probably _read, so it can yield a ContigArray if needed } extension Substring.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the 
`Substring`'s storage var storage: Span { get } + // TODO: probably _read, so it can yield a ContigArray if needed } extension Character.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the `Character`'s storage var storage: Span { get } + // TODO: probably _read, so it can yield a ContigArray if needed } extension SIMD: ContiguousStorage { @@ -201,14 +205,6 @@ extension UnsafeMutableRawBufferPointer: ContiguousStorage { } ``` -**TODO**: What is the `@_unsafeNonescapableResult` annotation? Would `Slice>` need it? - -**TODO**: Do we do a `Sequence.withSpanIfAvailable` API? - -**TODO**: What all can we deprecate with this proposal? - -**TODO**: Do these needs lifetime annotations on them? - #### Using `Span` with C functions or other unsafe code: `Span` has an unsafe hatch for use with unsafe code. @@ -227,7 +223,166 @@ extension Span where Element: BitwiseCopyable { } ``` -#### Complete `Span` API: +### Index and slicing design considerations + +There are 3 potentially-desirable features of `Span`'s `Index` design: + +1. `Span` is its own slice type +2. Indices from a slice can be used on the base collection +3. Additional reuse-after-free checking + +Each of these introduces practical tradeoffs in the design. + +#### `Span` is its own slice type + +Collections which own their storage have the convention of separate slice types, such as `Array` and `String`. This has the advantage of clearly delineating storage ownership in the programming model and the disadvantage of introducing a second type through which to interact. + +`UnsafeBufferPointer` may or may not (unsafely) own its storage, and hence has a separate slice type. Its `baseAddress` has a `deallocate` method for situations in which it does (unsafely) own its storage, and that method should only be called on the `baseAddress` of the original allocation, not the start of a slice.
However, many uses of `UnsafeBufferPointer` are unowned use cases, where having a separate slice type is [cumbersome](https://github.com/apple/swift/blob/bcd08c0c9a74974b4757b4b8a2d1796659b1d940/stdlib/public/core/StringComparison.swift#L175). + +`Span` does not own its storage and there is no concern about leaking larger allocations. Thus, it would benefit from being its own slice type, even if doing so increases the size of the type from 2 to 3 words (depending on other design tradeoffs discussed below). We propose making `Span` be its own slice type. + +#### Indices from a slice can be used on the base collection + +There is very strong stdlib precedent that indices from the base collection can be used in a slice and vice-versa. + +```swift +let myCollection = [0,1,2,3,4,5,6] +let idx = myCollection.index(myCollection.startIndex, offsetBy: 4) +myCollection[idx] // 4 +let slice = myCollection[idx...] // [4, 5, 6] +slice[idx] // 4 +myCollection[slice.indices] // [4, 5, 6] +``` + +Code can be written to take advantage of this fact. For example, a simplistic parser can be written as mutating methods on a slice. The slice's indices can be saved for reference into the original collection or another slice. + +```swift +extension Slice where Base == UnsafeRawBufferPointer { + mutating func parse(numBytes: Int) -> Self { + let end = index(startIndex, offsetBy: numBytes) + defer { self = self[end...] } + return self[.. 
Int { + parse(numBytes: MemoryLayout.stride).loadUnaligned(as: Int.self) + } + + mutating func parseHeader() -> Self { + // Comments show what happens when ran with `myCollection` + + let copy = self + parseInt() // 0 + parseInt() // 1 + parse(numBytes: 8) // [2, 0, 0, 0, 0, 0, 0, 0] + parseInt() // 3 + parse(numBytes: 7) // [4, 0, 0, 0, 0, 0, 0] + + // self: [0, 5, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0] + parseInt() // 1280 (0x00_00_05_00 little endian) + // self: [0, 6, 0, 0, 0, 0, 0, 0, 0] + + return copy[..( + _ c: C +) -> C.Element where C.Index == Int { + c[0] +} + +getFirst(myCollection) // 0 +getFirst(slice) // Fatal error: Index out of bounds +``` + +Preserving index interchange across views and the base is a nice-to-have for `Span`, and we propose keeping it. However, we are evaluating the tradeoffs it requires. + + +#### Additional reuse-after-free checking + +`Span` bounds-checks its indices, which is important for safety. If the index is based around a pointer (instead of an offset), then bounds checks will also ensure that indices are not used with the wrong span in most situations. However, it is possible for a memory address to be reused after being freed, and using a stale index into this reused memory may introduce safety problems. + +```swift +var idx: Span.Index + +let array1: Array = ... +let span1 = array1.span +idx = span1.startIndex.advanced(by: ...) +... +// array1 is freed + +let array2: Array = ... +let span2 = array2.span +// array2 happens to be allocated within the same memory of array1 +// but with a different base address whose offset is not an even +// multiple of `MemoryLayout.stride`. + +span2[idx] // unaligned load, what happens? +``` + +If `T` is `BitwiseCopyable`, then the unaligned load is not undefined behavior, but the value that is loaded is garbage. Whether the program is well-behaved going forwards depends on whether it is resilient to getting garbage values. 
+ +If `T` is not `BitwiseCopyable`, then the unaligned load may introduce undefined behavior. No matter how well-written the rest of the program is, it has a critical safety and security flaw. + +When the reused allocation happens to be stride-aligned, there is no undefined behavior from unaligned loads, nor are there "garbage" values in the strictest sense, but it is still reflective of a programming bug. The program may be interacting with an unexpected value. + +Bounds checks protect against critical programmer errors. It would be nice, pending engineering tradeoffs, to also protect against some reuse after free errors and invalid index reuse, especially those that may lead to undefined behavior. + +Future improvements to microarchitecture may make reuse after free checks cheaper; however, we need something for the foreseeable future. Any validation we can do reduces the need to switch to other mitigation strategies or make other tradeoffs. + +#### Design approaches for indices + +##### Index is an offset (`Int` or a wrapper around `Int`) + +When `Index` is an offset, there is no undefined behavior from unaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. + +However, there is no protection against invalidly using an index derived from a different span, provided the offset is in-bounds. + +If `Span` is 2 words (base address and count), then indices cannot be interchanged between slices and the base span. `Span` would need to additionally store a base offset, bringing it up to 3 words in size. + +**TODO**: What's the perf impact of having a base offset? Bounds checking would need `(baseOffset..<(count &- baseOffset)).contains(i)`. + +##### Index is a pointer (wrapper around `UnsafeRawPointer`) + +When Index holds a pointer, `Span` only needs to be 2 words in size, as valid index interchange across slices falls out naturally.
Additionally, invalid reuse of an index across spans will typically be caught during bounds checking. + +However, in a reuse-after-free situation, unaligned loads (i.e. undefined behavior) are possible. If stride is not a multiple of 2, then alignment checking can be expensive. Alternatively, we could choose not to detect these bugs. + +##### Index is a fat pointer (pointer and allocation ID) + +We can create a per-allocation ID (e.g. a cryptographic `UInt64`) for both `Span` and `Span.Index` to store. This makes `Span` 3 words in size and `Span.Index` 2 words in size. This provides the most protection possible against all forms of invalid index use, including reuse-after-free. + +However, making `Span.Index` be 2 words in size is unfortunate. `Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. + +##### Bitpacking an allocation ID hash + +As an alternative to the above, we could create a smaller hash value of an allocation ID and use that for checking. + +If index is an offset, it could use e.g. 48 bits for the offset and 16 bits for the hash value. If index is a pointer, it could use **TODO** bits for the hash value. + +**TODO**: perf impact of this approach + +We recommend going with **TODO** + + +### Complete `Span` API: ```swift public struct Span: Copyable, ~Escapable { @@ -804,6 +959,164 @@ A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homoge } ``` +### Byte parsing helpers + +The below (severable) API makes `RawSpan` well-suited for use in binary parsers and decoders. + + +#### Out of bounds errors + +The stdlib's lowest level (safe) interfaces, direct indexing, trap on error ([Logic failures](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#logic-failures)).
Some operations, such as the key-based subscript on `Dictionary`, are expected to fail often and have no useful information to communicate other than to return `nil` ([Simple domain errors](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#simple-domain-errors)). + +Data parsing is generally expected to succeed, but when it doesn't we want an error that we can propagate upwards with enough information that we can try to recover ([Recoverable errors](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#recoverable-errors)). For example, if our data is provided in chunks of contiguous memory, we might be able to recover by buffering more bytes and trying again. + + +```swift +/// An error indicating that out-of-bounds access was attempted +@frozen +public struct OutOfBoundsError: Error { + /// The number of elements expected + public var expected: Int + + /// The number of elements found + public var has: Int + + @inlinable + public init(expected: Int, has: Int) +} +``` + +#### Index-advancing operations + +The following parsing primitives + +(most general/powerful, but they require developer to manage indices) + +```swift +extension RawSpan { + /// Parse an instance of `T`, advancing `position`. + @inlinable + public func parse( + _ position: inout Index, as t: T.Type = T.self + ) throws(OutOfBoundsError) -> T + + /// Parse `numBytes` of data, advancing `position`. + @inlinable + public func parse( + _ position: inout Index, numBytes: some FixedWidthInteger + ) throws (OutOfBoundsError) -> Self +} +``` + +However, they do require that a developer manage indices. + +#### Cursor-mutating operations + +`Cursor` provides a more convenient interface to the index-advancing primitives by encapsulating the current position as well as subrange within the input in which to operate. + +When parsing data, there are often multiple subranges of the data that we are parsing within.
For example, when parsing an entire file, we might treat each line as a separate record, and we might individually parse different fields in each line. Knowing whether we are at the start or end of the file requires checking the file's original bounds, and knowing whether we are at the start of a line requires either knowing the line's bounds or peeking behind the record's current parse range for a newline character. + +`Cursor` stores and manages a parsing subrange, which relieves the developer of managing one layer of slicing. + +*Alternative*: If `Cursor` does not store the subrange, it would be 3 words in size rather than 5 words. The developer would have to pre-slice and manage the slice, and future API on cursor could not peek outside of the subrange's bounds (e.g. checking for start-of-line). + + +```swift +extension RawSpan { + @frozen + public struct Cursor: Copyable, ~Escapable { + public let base: RawSpan + + /// The range within which we parse + public let parseRange: Range + + /// The current parsing position + public var position: RawSpan.Index + + @inlinable + public init(_ base: RawSpan, in range: Range) + + @inlinable + public init(_ base: RawSpan) + + /// Parse an instance of `T` and advance + @inlinable + public mutating func parse( + _ t: T.Type = T.self + ) throws(OutOfBoundsError) -> T + + /// Parse `numBytes` and advance + @inlinable + public mutating func parse( + numBytes: some FixedWidthInteger + ) throws (OutOfBoundsError) -> RawSpan + + /// The bytes that we've parsed so far + @inlinable + public var parsedBytes: RawSpan { get } + + /// The number of bytes left to parse + @inlinable + public var remainingBytes: Int { get } + } + + @inlinable + public func makeCursor() -> Cursor + + @inlinable + public func makeCursor(in range: Range) -> Cursor +} +``` + +#### Example: Parsing PNG + +The below parses [PNG Chunks](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks).
+ +```swift +struct PNGChunk: ~Escapable { + let contents: RawSpan + + public init( + _ contents: RawSpan, _ owner: borrowing Owner + ) throws(PNGValidationError) -> dependsOn(owner) Self { + self.contents = contents + try self._validate() + } + + var length: UInt32 { + contents.loadUnaligned(as: UInt32.self).bigEndian + } + var type: UInt32 { + contents.loadUnaligned( + fromUncheckedByteOffset: 4, as: UInt32.self).bigEndian + } + var data: RawSpan { + contents[uncheckedOffsets: 8..<(contents.count-4)] + } + var crc: UInt32 { + contents.loadUnaligned( + fromUncheckedByteOffset: contents.count-4, as: UInt32.self + ).bigEndian + } +} + +func parsePNGChunk( + _ span: RawSpan, + _ owner: borrowing Owner +) throws -> dependsOn(owner) PNGChunk { + var cursor = span.makeCursor() + + let length = try cursor.parse(UInt32.self).bigEndian + _ = try cursor.parse(UInt32.self) // type + _ = try cursor.parse(numBytes: length) // data + _ = try cursor.parse(UInt32.self) // crc + + return try PNGChunk(cursor.parsedBytes, owner) +} +``` + + + ## Source compatibility This proposal is additive and source-compatible with existing code. @@ -877,6 +1190,9 @@ while let elt = iter.next() { Non-copyable and non-escapable containers would benefit from a `Collection`-like protocol family to represent a set basic, common operations. This may be `Collection` if we find a way to make it work; it may be something else. +Alongside this work, it may make sense to add a `Span` alternative to `withContiguousStorageIfAvailable()`, a `RawSpan` alternative to `withUnsafeBytes`, etc., and seek to deprecate any closure-based API around unsafe pointers. + + ##### Sharing piecewise-contiguous memory Some types store their internal representation in a piecewise-contiguous manner, such as trees and ropes. Some operations naturally return information in a piecewise-contiguous manner, such as network operations. These could supply results by iterating through a list of contiguous chunks of memory.
@@ -939,6 +1255,9 @@ This would probably consist of a new type of custom conversion in the language. ##### Interopability with C++'s `std::span` and with llvm's `-fbounds-safety` The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are an opportunity for better, safer interoperation with a type such as `Span`. + + + ## Acknowledgments Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. From 50a38df1fb6016337bf9356240b87f9fb43d4fb5 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 24 May 2024 16:25:29 -0700 Subject: [PATCH 21/73] Update --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index dc69d19fe4..ffd8d77155 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -786,7 +786,7 @@ extension RawSpan { ```swift extension RawSpan { - public typealias Index = Span.Index + public typealias Index = RawSpan.Index public typealias SubSequence = Self public var startIndex: Index { get } @@ -815,8 +815,6 @@ extension RawSpan { } ``` -**TODO**: What does `typealias Index = Span.Index` mean? - ##### Index validation utiliities: Every time `RawSpan` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked". 
The validation is performed with these functions: From 6add19bce8032e244ed8ee97f9442dbb1a9c80ac Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 20 Jun 2024 02:54:52 -0700 Subject: [PATCH 22/73] lots of updates --- .../nnnn-safe-shared-contiguous-storage.md | 951 +++++++++--------- 1 file changed, 461 insertions(+), 490 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index ffd8d77155..964196f05f 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -6,21 +6,20 @@ * Status: **Awaiting implementation** * Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) * Bug: rdar://48132971, rdar://96837923 -* Implementation: (pending) -* Upcoming Feature Flag: (pending) -* Review: ([pitch](https://forums.swift.org/t/69888)) +* Implementation: Prototyped in https://github.com/apple/swift-collections (on branch "future") +* Review: ([pitch 1](https://forums.swift.org/t/69888))([pitch 2]()) ## Introduction -We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without giving up on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. +We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without compromising on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. -In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller.
A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, with no compromise to memory safety. +In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, without compromising Swift's memory safety. -This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/apple/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/apple/swift-evolution/pull/2305). This proposal also supersedes the rejected proposal [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. Additionally, we refer to the proposals for [`BitwiseCopyable`](https://github.com/apple/swift-evolution/blob/main/proposals/0426-bitwise-copyable.md) and [Non-copyable Generics](https://github.com/apple/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md). +This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/apple/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/apple/swift-evolution/pull/2305). This proposal also supersedes the rejected proposal [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). 
The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. ## Motivation -Swift needs safe and performant types for local processing over values in contiguous memory. Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()` and use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. +Swift needs safe and performant types for local processing over values in contiguous memory. Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. 
This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()`, forcing them to use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. The `UnsafeBufferPointer` passed to a `withUnsafeXXX` closure-style API, while performant, is unsafe in multiple ways: @@ -28,8 +27,18 @@ The `UnsafeBufferPointer` passed to a `withUnsafeXXX` closure-style API, while p 2. `subscript` is only bounds-checked in debug builds of client code 3. It might escape the duration of the closure -Even if the body of the `withUnsafeXXX` call does not escape the pointer, other functions called inside the closure have to be written in terms of unsafe pointers. This requires programmer vigilance across a project and pollutes code that otherwise could be written in terms of safe constructs. +Even if the body of the `withUnsafeXXX` call does not escape the pointer, other functions called inside the closure have to be written in terms of unsafe pointers. This requires programmer vigilance across a project and potentially spreads the use of unsafe types, even when it could have been written in terms of safe constructs. +We want to take advantage of the features of non-escapable types to replace some closure-taking API with simple properties, resulting in more composable code: + +```swift +let array = Array("Hello\0".utf8) +array.withUnsafeBufferPointer { + // use `$0` here for direct memory access +} +let span: Span = array.storage +// use `span` in the same scope as `array` for direct memory access +``` ## Proposed solution @@ -51,16 +60,21 @@ extension HypotheticalBase64Decoder { } ``` -**TODO**: But, we don't want to encourage this use. We want to encourage one concrete function taking a `Span`. Advanced libraries might add an inlinable/alwaysEmitIntoClient generic-dispatch interface in addition to this. 
+Even better, an interface can be defined in terms of the concrete type `Span`: -### `RawSpan` +```swift +extension HypotheticalBase64Decoder { + public func decode(bytes: Span) -> [UInt8] +} +``` -`RawSpan` allows sharing the contiguous internal representation for values which may be heterogenously-typed, such as in decoders. Furthermore, it is a fully concrete type, without a generic parameter, which achieves better performance in debug builds of client code as well as a more straight-forwards unstanding of performance for library code. +Advanced libraries might add an inlinable generic-dispatch interface in addition to a concrete interface defined in terms of `Span`. -All `Span`s have a backing `RawSpan`. +### `RawSpan` -**TODO**: Do we have a (parent) protocol for just raw span? Do we have API to get the raw span from a span? +`RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straightforward understanding of performance for library code. +`Span` can always be converted to `RawSpan`, using a conditionally-available property or a constructor. ## Detailed design @@ -68,36 +82,34 @@ All `Span`s have a backing `RawSpan`. ```swift public struct Span: Copyable, ~Escapable { - internal var _start: Span.Index + internal var _start: UnsafePointer internal var _count: Int } ``` -It provides a collection-like interface to the elements stored in that span of memory: +It provides a buffer-like interface to the elements stored in that span of memory: ```swift extension Span { - public typealias SubSequence: Self - - public struct Index: Copyable, Escapable, Strideable { /* ...
*/ } - public var startIndex: Index { get } - public var endIndex: Index { get } - public var indices: Range { get } - public var count: Int { get } public var isEmpty: Bool { get } + public var indices: Range { get } + + subscript(_ position: Int) -> Element { get } +} +``` + +Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, a sub-span representing a subrange of its elements _must_ have the same lifetime as the view from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. - // index-based subscripts - subscript(_ position: Index) -> dependsOn(self) Element { get } - subscript(_ bounds: Range) -> dependsOn(self) Span { get } +`Span`s representing subsets of consecutive elements can be extracted out of a larger `Span` with an API similar to the recently added `extracting()` functions of `UnsafeBufferPointer`: - // integer-offset subscripts - subscript(offset: Int) -> dependsOn(self) Element { get } - subscript(offsets: Range) -> dependsOn(self) Span { get } +```swift +extension Span { + public func extracting(_ bounds: Range) -> Self } ``` -Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, the slice _must_ have the same lifetime as the view from which it originates. 
Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. +The first element of a given span is _always_ at position zero, and its last element is at position `count-1`. As a side-effect of not conforming to `Collection` or `Sequence`, `Span` is not directly supported by `for` loops at this time. It is, however, easy to use in a `for` loop via indexing: @@ -143,7 +155,6 @@ Here, the public function obtains the `Span` from the type that vends it in inli extension Array: ContiguousStorage { // note: this could borrow a temporary copy of the `Array`'s storage var storage: Span { get } - // TODO: probably _read, so it can yield a ContigArray if needed } extension ArraySlice: ContiguousStorage { // note: this could borrow a temporary copy of the `ArraySlice`'s storage @@ -160,17 +171,14 @@ extension Foundation.Data: ContiguousStorage { extension String.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the `String`'s storage var storage: Span { get } - // TODO: probably _read, so it can yield a ContigArray if needed } extension Substring.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the `Substring`'s storage var storage: Span { get } - // TODO: probably _read, so it can yield a ContigArray if needed } extension Character.UTF8View: ContiguousStorage { // note: this could borrow a temporary copy of the `Character`'s storage var storage: Span { get } - // TODO: probably _read, so it can yield a ContigArray if needed } extension SIMD: ContiguousStorage { @@ -186,22 +194,26 @@ extension CollectionOfOne: ContiguousStorage { extension Slice: ContiguousStorage where Base: ContiguousStorage { var storage: Span { get } } +``` + +In addition to the safe types above gaining the `storage` property, the `UnsafeBufferPointer` family of types will also gain access to a `storage` property.
This enables interoperability with `Span`-taking API. While a `Span` binding created from an `UnsafeBufferPointer` exists, the memory that underlies it must not be deinitialized or deallocated. +```swift extension UnsafeBufferPointer: ContiguousStorage { // note: additional preconditions apply until the end of the scope - var storage: Span { @_unsafeNonescapableResult get } + var storage: Span { get } } extension UnsafeMutableBufferPointer: ContiguousStorage { // note: additional preconditions apply until the end of the scope - var storage: Span { @_unsafeNonescapableResult get } + var storage: Span { get } } extension UnsafeRawBufferPointer: ContiguousStorage { // note: additional preconditions apply until the end of the scope - var storage: Span { @_unsafeNonescapableResult get } + var storage: Span { get } } extension UnsafeMutableRawBufferPointer: ContiguousStorage { // note: additional preconditions apply until the end of the scope - var storage: Span { @_unsafeNonescapableResult get } + var storage: Span { get } } ``` @@ -223,163 +235,6 @@ extension Span where Element: BitwiseCopyable { } ``` -### Index and slicing design considerations - -There are 3 potentially-desirable features of `Span`'s `Index` design: - -1. `Span` is its own slice type -2. Indices from a slice can be used on the base collection -3. Additional reuse-after-free checking - -Each of these introduces practical tradeoffs in the design. - -#### `Span` is its own slice type - -Collections which own their storage have the convention of separate slice types, such as `Array` and `String`. This has the advantage of clearly delineating storage ownership in the programming model and the disadvantage of introducing a second type through which to interact. - -`UnsafeBufferPointer` may or may not (unsafely) own its storage, and hence has a separate slice type.
It's `baseAddress` has a `deallocate` method for situations in which it does (unsafely) own its storage, and that method should only be called on the `baseAddress` of the original allocation, not the start of a slice. However, many uses of `UnsafeBufferPointer` are unowned use cases, where having a separate slice type is [cumbersome](https://github.com/apple/swift/blob/bcd08c0c9a74974b4757b4b8a2d1796659b1d940/stdlib/public/core/StringComparison.swift#L175). - -`Span` does not own its storage and there is no concern about leaking larger allocations. Thus, it would benefit from being its own slice type, even if doing so increases the size of the type from 2 to 3 words (depending on other design tradeoffs discussed below). We propose making `Span` be its own slice type. - -#### Indices from a slice can be used on the base collection - -There is very strong stdlib precedent that indices from the base collection can be used in a slice and vice-versa. - -```swift -let myCollection = [0,1,2,3,4,5,6] -let idx = myCollection.index(myCollection.startIndex, offsetBy: 4) -myCollection[idx] // 4 -let slice = myCollection[idx...] // [4, 5, 6] -slice[idx] // 4 -myCollection[slice.indices] // [4, 5, 6] -``` - -Code can be written to take advantage of this fact. For example, a simplistic parser can be written as mutating methods on a slice. The slice's indices can be saved for reference into the original collection or another slice. - -```swift -extension Slice where Base == UnsafeRawBufferPointer { - mutating func parse(numBytes: Int) -> Self { - let end = index(startIndex, offsetBy: numBytes) - defer { self = self[end...] } - return self[.. 
Int { - parse(numBytes: MemoryLayout.stride).loadUnaligned(as: Int.self) - } - - mutating func parseHeader() -> Self { - // Comments show what happens when ran with `myCollection` - - let copy = self - parseInt() // 0 - parseInt() // 1 - parse(numBytes: 8) // [2, 0, 0, 0, 0, 0, 0, 0] - parseInt() // 3 - parse(numBytes: 7) // [4, 0, 0, 0, 0, 0, 0] - - // self: [0, 5, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0] - parseInt() // 1280 (0x00_00_05_00 little endian) - // self: [0, 6, 0, 0, 0, 0, 0, 0, 0] - - return copy[..( - _ c: C -) -> C.Element where C.Index == Int { - c[0] -} - -getFirst(myCollection) // 0 -getFirst(slice) // Fatal error: Index out of bounds -``` - -Preserving index interchange across views and the base is a nice-to-have for `Span`, and we propose keeping it. However, we are evaluating the tradeoffs it requires. - - -#### Additional reuse-after-free checking - -`Span` bounds-checks its indices, which is important for safety. If the index is based around a pointer (instead of an offset), then bounds checks will also ensure that indices are not used with the wrong span in most situations. However, it is possible for a memory address to be reused after being freed, and using a stale index into this reused memory may introduce safety problems. - -```swift -var idx: Span.Index - -let array1: Array = ... -let span1 = array1.span -idx = span1.startIndex.advanced(by: ...) -... -// array1 is freed - -let array2: Array = ... -let span2 = array2.span -// array2 happens to be allocated within the same memory of array1 -// but with a different base address whose offset is not an even -// multiple of `MemoryLayout.stride`. - -span2[idx] // unaligned load, what happens? -``` - -If `T` is `BitwiseCopyable`, then the unaligned load is not undefined behavior, but the value that is loaded is garbage. Whether the program is well-behaved going forwards depends on whether it is resilient to getting garbage values. 
- -If `T` is not `BitwiseCopyable`, then the unaligned load may introduce undefined behavior. No matter how well-written the rest of the program is, it has a critical safety and security flaw. - -When the reused allocation happens to be stride-aligned, there is no undefined behavior from undefined loads, nor are there "garbage" values in the strictest sense, but it is still reflective of a programming bug. The program may be interacting with an unexpected value. - -Bounds checks protect against critical programmer errors. It would be nice, pending engineering tradeoffs, to also protect against some reuse after free errors and invalid index reuse, especially those that may lead to undefined behavior. - -Future improvements to microarchitecture may make reuse after free checks cheaper, however we need something for the forseeable future. Any validation we can do reduces the need to switch to other mitigation strategies or make other tradeoffs. - -#### Design approaches for indices - -##### Index is an offset (`Int` or a wrapper around `Int`) - -When `Index` is an offset, there is no undefined behavior from unaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. - -However, there is no protection against invalidly using an index derived from a different span, provided the offset is in-bounds. - -If `Span` is 2 words (base address and count), then indices cannot be interchanged between slices and the base span. `Span` would need to additionally store a base offset, bringing it up to 3 words in size. - -**TODO**: What's the perf impact of having a base offset? Bounds checking would need `(baseOffset..<(count &- baseOffset)).contains(i)`. - -##### Index is a pointer (wrapper around `UnsafeRawPointer`) - -When Index holds a pointer, `Span` only needs to be 2 words in size, as valid index interchange across slices falls out naturally. 
Additionally, invalid reuse of an index across spans will typically be caught during bounds checking. - -However, in a reuse-after-free situation, unaligned loads (i.e. undefined behavior) are possible. If stride is not a multiple of 2, then alignement checking can be expensive. Alternatively, we could choose not to detect these bugs. - -##### Index is a fat pointer (pointer and allocation ID) - -We can create a per-allocation ID (e.g. a cryptographic `UInt64`) for both `Span` and `Span.Index` to store. This makes `Span` 3 words in size and `Span.Index` 2 words in size. This provides the most protection possible against all forms of invalid index use, including reuse-after-free. - -However, making `Span.Index` be 2 words in size is unfortunate. `Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. - -##### Bitpacking an allocation ID hash - -As an alternative to the above, we could create a smaller hash value of an allocation ID and use that for checking. - -If index is an offset, it could use e.g. 48 bits for the offset and 16 bits for the hash value. If index is a pointer, it could use **TODO** bits for the hash value. - -**TODO**: perf impact of this approach - -We recommend going with **TODO** ### Complete `Span` API: @@ -406,9 +261,9 @@ extension Span { /// - Parameters: /// - buffer: an `UnsafeBufferPointer` to initialized elements. /// - owner: a binding whose lifetime must exceed that of - /// the returned `Span`. + /// the newly created `Span`. public init?( - unsafeBufferPointer buffer: UnsafeBufferPointer, + unsafeElements buffer: UnsafeBufferPointer, owner: borrowing Owner ) -> dependsOn(owner) Self? @@ -422,9 +277,9 @@ extension Span { /// - pointer: a pointer to the first initialized element. /// - count: the number of initialized elements in the view. 
/// - owner: a binding whose lifetime must exceed that of - /// the returned `Span`. + /// the newly created `Span`. public init( - unsafePointer pointer: UnsafePointer, + unsafeStart pointer: UnsafePointer, count: Int, owner: borrowing Owner ) -> dependsOn(owner) Self @@ -445,10 +300,9 @@ extension Span where Element: BitwiseCopyable { /// - unsafeBytes: a buffer to initialized elements. /// - type: the type to use when interpreting the bytes in memory. /// - owner: a binding whose lifetime must exceed that of - /// the returned `Span`. + /// the newly created `Span`. public init( unsafeBytes buffer: UnsafeRawBufferPointer, - as type: Element.Type, owner: borrowing Owner ) -> dependsOn(owner) Self @@ -466,11 +320,10 @@ extension Span where Element: BitwiseCopyable { /// - type: the type to use when interpreting the bytes in memory. /// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of - /// the returned `Span`. + /// the newly created `Span`. public init( - unsafeRawPointer pointer: UnsafeRawPointer, - as type: Element.Type, - count: Int, + unsafeStart pointer: UnsafeRawPointer, + byteCount: Int, owner: borrowing Owner ) -> dependsOn(owner) Self } @@ -478,103 +331,37 @@ extension Span where Element: BitwiseCopyable { ##### `Collection`-like API: -The following typealiases, properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Sequence`, `Collection`, `BidirectionalCollection` or `RandomAccessCollection`). The only difference with their counterpart should be a lifetime dependency annotation, allowing them to return borrowed nonescapable values or borrowed noncopyable values. +The following typealiases, properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. 
Their semantics shall be as described where their counterpart is declared (in `Collection` or `RandomAccessCollection`). The only difference from their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. ```swift extension Span { - public typealias Index = Span.Index - public typealias SubSequence = Self - - public func makeIterator() -> dependsOn(self) Span.Iterator - - public var startIndex: Index { get } - public var endIndex: Index { get } public var count: Int { get } public var isEmpty: Bool { get } + public var indices: Range { get } - public var indices: Range { get } - - // indexing operations - public func index(after i: Index) -> Index - public func index(before i: Index) -> Index - public func index(_ i: Index, offsetBy distance: Int) -> Index - public func index( - _ i: Index, offsetBy distance: Int, limitedBy limit: Index - ) -> Index? - - public func formIndex(after i: inout Index) - public func formIndex(before i: inout Index) - public func formIndex(_ i: inout Index, offsetBy distance: Int) - public func formIndex( - _ i: inout Index, offsetBy distance: Int, limitedBy limit: Index - ) -> Bool - - public func distance(from start: Index, to end: Index) -> Int - - // subscripts - public subscript( - _ position: Index - ) -> dependsOn(self) Element { get } - public subscript( - _ bounds: Range - ) -> dependsOn(self) Span { get } - public subscript( - _ bounds: some RangeExpression - ) -> dependsOn(self) Span { get } - public subscript( - x: UnboundedRange - ) -> dependsOn(self) Span { get } - - // utility properties - public var first Element? { get } - public var last Element?
{ get } - - // one-sided slicing operations - public func prefix(upTo: Index) -> dependsOn(self) Span - public func prefix(through: Index) -> dependsOn(self) Span - public func prefix(_ maxLength: Int) -> dependsOn(self) Span - public func dropLast(_ k: Int = 1) -> dependsOn(self) Span - public func suffix(from: Index) -> dependsOn(self) Span - public func suffix(_ maxLength: Int) -> dependsOn(self) Span - public func dropFirst(_ k: Int = 1) -> dependsOn(self) Span + public subscript(_ position: Int) -> dependsOn(self) Element { get } } ``` -##### Additions not in the `Collection` family API: +##### Accessing subranges of elements: + +In SE-0437, `UnsafeBufferPointer`'s slicing subscript was replaced by the `extracting(_ bounds:)` functions, due to the copyability assumption baked into the standard library's `Slice` type. `Span` has similar requirements as `UnsafeBufferPointer`, in that it must be possible to obtain an instance representing a subrange of the same memory as another `Span`. We therefore follow in the footsteps of SE-0437, and add a family of `extracting()` functions: ```swift extension Span { - // Integer-offset subscripts - - /// Accesses the element at the specified offset in the `Span`. - /// - /// - Parameter offset: The offset of the element to access. `offset` - /// must be greater or equal to zero, and less than the `count` property. - /// - /// - Complexity: O(1) - public subscript(offset: Int) -> dependsOn(self) Element { get } + func extracting(_ bounds: Range) -> dependsOn(self) Self + func extracting(_ bounds: some RangeExpression) -> dependsOn(self) Self + func extracting(_: UnboundedRange) -> dependsOn(self) Self +} +``` - /// Accesses the contiguous subrange of elements at the specified - /// range of offsets in this `Span`. - /// - /// - Parameter offsets: A range of offsets. The bounds of the range - /// must be greater or equal to zero, and less than the `count` property. 
- /// - /// - Complexity: O(1) - public subscript(offsets: Range) -> dependsOn(self) Span { get } +##### Unchecked access to elements and subranges of elements: - /// Accesses the contiguous subrange of elements at the specified - /// range of offsets in this `Span`. - /// - /// - Parameter offsets: A range of offsets. The bounds of the range - /// must be greater or equal to zero, and less than the `count` property. - /// - /// - Complexity: O(1) - public subscript( - offsets: some RangeExpression - ) -> dependsOn(self) Span { get } +The `subscript` and the `extracting()` functions mentioned above all have always-on bounds checking of their parameters, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking proves costly. - // Unchecked subscripts +```swift +extension Span { + // Unchecked subscripting and extraction /// Accesses the element at the specified `position`. /// @@ -582,9 +369,9 @@ extension Span { /// /// - Parameter position: The position of the element to access. `position` /// must be a valid index that is not equal to the `endIndex` property. - /// - /// - Complexity: O(1) - public subscript(unchecked position: Index) -> dependsOn(self) Element { get } + public subscript( + unchecked position: Index + ) -> dependsOn(self) Element { get } /// Accesses a contiguous subrange of the elements represented by this `Span` /// @@ -592,11 +379,9 @@ extension Span { /// /// - Parameter bounds: A range of the collection's indices. The bounds of /// the range must be valid indices of the collection. - /// - /// - Complexity: O(1) - public subscript( + public func extracting( uncheckedBounds bounds: Range - ) -> dependsOn(self) Span { get } + ) -> dependsOn(self) Self /// Accesses the contiguous subrange of the elements represented by /// this `Span`, specified by a range expression.
@@ -605,70 +390,29 @@ extension Span { /// /// - Parameter bounds: A range of the collection's indices. The bounds of /// the range must be valid indices of the collection. - /// - /// - Complexity: O(1) public subscript( uncheckedBounds bounds: some RangeExpression ) -> dependsOn(self) Span - - // Unchecked integer-offset subscripts - - /// Accesses the element at the specified offset in the `Span`. - /// - /// This subscript does not validate `offset`; this is an unsafe operation. - /// - /// - Parameter offset: The offset of the element to access. `offset` - /// must be greater or equal to zero, and less than the `count` property. - /// - /// - Complexity: O(1) - public subscript( - uncheckedOffset offset: Int - ) -> dependsOn(self) Element { get } - - /// Accesses the contiguous subrange of elements at the specified - /// range of offsets in this `Span`. - /// - /// This subscript does not validate `offsets`; this is an unsafe operation. - /// - /// - Parameter offsets: A range of offsets. The bounds of the range - /// must be greater or equal to zero, and less than the `count` property. - /// - /// - Complexity: O(1) - public subscript( - uncheckedOffsets offsets: Range - ) -> dependsOn(self) Span { get } } ``` ##### Index validation utilities: -Every time `Span` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: +Every time `Span` uses a position parameter, it checks for its validity, unless the parameter is marked with the word "unchecked". 
The validation is performed with these functions: ```swift extension Span { - /// Traps if `position` is not a valid index for this `Span` + /// Traps if `position` is not a valid offset into this `Span` /// /// - Parameters: - /// - position: an Index to validate + /// - position: a position to validate public boundsCheckPrecondition(_ position: Index) - /// Traps if `bounds` is not a valid range of indices for this `Span` + /// Traps if `bounds` is not a valid range of offsets into this `Span` /// /// - Parameters: - /// - position: a range of indices to validate + /// - position: a range of positions to validate public boundsCheckPrecondition(_ bounds: Range) - - /// Traps if `offset` is not a valid offset into this `Span` - /// - /// - Parameters: - /// - offset: an offset to validate - public boundsCheckPrecondition(offset: Int) - - /// Traps if `offsets` is not a valid range of offsets into this `Span` - /// - /// - Parameters: - /// - offsets: a range of offsets to validate - public boundsCheckPrecondition(offsets: Range) } ``` @@ -716,13 +460,9 @@ extension Span where Element: BitwiseCopyable { } ``` -**TODO**: `public var rawSpan: RawSpan` API, as well a conformance to a raw span protocol if there is one. - ### RawSpan -In addition to `Span`, we propose the addition of `RawSpan` which can represent heterogenously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents initialized untyped bytes. Its API supports slicing, along with the operations `load(as:)` and `loadUnaligned(as:)`. - -`RawSpan` is a specialized type supporting parsing and decoding applications in particular, as well as applications where heavily-used code paths require concrete types as much as possible. +In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes.
`RawSpan` is a specialized type that intends to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports extracting sub-spans, along with the operations `load(as:)` and `loadUnaligned(as:)`. #### Complete `RawSpan` API: @@ -745,9 +485,9 @@ extension RawSpan { /// - Parameters: /// - buffer: an `UnsafeRawBufferPointer` to initialized memory. /// - owner: a binding whose lifetime must exceed that of - /// the returned `RawSpan`. + /// the newly created `RawSpan`. public init?( - unsafeBufferPointer buffer: UnsafeBufferPointer, + unsafeBytes buffer: UnsafeBufferPointer, owner: borrowing Owner ) -> dependsOn(owner) Self? @@ -761,7 +501,7 @@ extension RawSpan { /// - pointer: a pointer to the first initialized element. /// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of - /// the returned `Span`. + /// the newly created `RawSpan`. public init( unsafeRawPointer pointer: UnsafeRawPointer, count: Int, @@ -774,44 +514,129 @@ extension RawSpan { /// - span: An existing `Span`, which will define both this /// `RawSpan`'s lifetime and the memory it represents. @inlinable @inline(__always) - public init( + public init( _ span: borrowing Span ) -> dependsOn(span) Self } ``` -##### Indexing Operations: +##### Accessing the memory of a `RawSpan`: -`RawSpan` has these `Collection`-like indexing operations: +The basic operations to access the contents of the memory underlying a `RawSpan` are `load(as:)` and `loadUnaligned(as:)`. ```swift extension RawSpan { - public typealias Index = RawSpan.Index - public typealias SubSequence = Self - - public var startIndex: Index { get } - public var endIndex: Index { get } - public var count: Int { get } - public var isEmpty: Bool { get } + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified offset. 
+ /// + /// The memory at this pointer plus `offset` must be properly aligned for + /// accessing `T` and initialized to `T` or another type that is layout + /// compatible with `T`. + /// + /// - Parameters: + /// - offset: The offset from this pointer, in bytes. `offset` must be + /// nonnegative. The default is zero. + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes at + /// `offset`. The returned instance is memory-managed and unassociated + /// with the value in the memory referenced by this pointer. + public func load( + fromByteOffset offset: Int = 0, as: T.Type + ) -> T - public var indices: Range { get } + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified offset. + /// + /// The memory at this pointer plus `offset` must be properly aligned for + /// accessing `T` and initialized to `T` or another type that is layout + /// compatible with `T`. + /// + /// This function does not validate the bounds of the memory access; + /// this is an unsafe operation. + /// + /// - Parameters: + /// - offset: The offset from this pointer, in bytes. `offset` must be + /// nonnegative. The default is zero. + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes at + /// `offset`. The returned instance is memory-managed and unassociated + /// with the value in the memory referenced by this pointer. + public func load( + fromUncheckedByteOffset offset: Int, as: T.Type + ) -> T - // indexing operations - public func index(after i: Index) -> Index - public func index(before i: Index) -> Index - public func index(_ i: Index, offsetBy distance: Int) -> Index - public func index( - _ i: Index, offsetBy distance: Int, limitedBy limit: Index - ) -> Index? + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified offset. 
+ /// + /// - Parameters: + /// - offset: The offset from this pointer, in bytes. `offset` must be + /// nonnegative. The default is zero. + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes at + /// `offset`. The returned instance isn't associated + /// with the value in the range of memory referenced by this pointer. + public func loadUnaligned( + fromByteOffset offset: Int = 0, as: T.Type + ) -> T - public func formIndex(after i: inout Index) - public func formIndex(before i: inout Index) - public func formIndex(_ i: inout Index, offsetBy distance: Int) - public func formIndex( - _ i: inout Index, offsetBy distance: Int, limitedBy limit: Index - ) -> Bool + /// Returns a new instance of the given type, constructed from the raw memory + /// at the specified offset. + /// + /// This function does not validate the bounds of the memory access; + /// this is an unsafe operation. + /// + /// - Parameters: + /// - offset: The offset from this pointer, in bytes. `offset` must be + /// nonnegative. The default is zero. + /// - type: The type of the instance to create. + /// - Returns: A new instance of type `T`, read from the raw bytes at + /// `offset`. The returned instance isn't associated + /// with the value in the range of memory referenced by this pointer. + public func loadUnaligned( + fromUncheckedByteOffset offset: Int, as: T.Type + ) -> T +} +``` - public func distance(from start: Index, to end: Index) -> Int +A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogeneously as instances of `T`. + +```swift + /// View the memory span represented by this view as a different type + /// + /// The memory must be laid out identically to the in-memory representation of `T`.
+ /// + /// - Parameters: + /// - type: The type you wish to view the memory as + /// - Returns: A new `Span` over elements of type `T` + public func view(as: T.Type) -> dependsOn(self) Span +} +``` + +##### Index-related Operations: + +```swift +extension RawSpan { + /// The number of bytes in the span. + public var count: Int { get } + + /// A Boolean value indicating whether the span is empty. + public var isEmpty: Bool { get } + + /// The indices that are valid for subscripting the span, in ascending + /// order. + public var indices: Range { get } + + /// Traps if `offset` is not a valid offset into this `RawSpan` + /// + /// - Parameters: + /// - position: an offset to validate + public func boundsCheckPrecondition(_ offset: Int) + + /// Traps if `bounds` is not a valid range of offsets into this `RawSpan` + /// + /// - Parameters: + /// - offsets: a range of offsets to validate + public func boundsCheckPrecondition(_ offsets: Range) } ``` @@ -847,113 +672,120 @@ extension RawSpan { } ``` -##### Slicing of `RawSpan` instances: +##### Accessing subranges of elements: -`RawSpan` has `Collection`-like slicing operations. Like `Span`, it also has unchecked slicing operations and can be sliced using integer offsets: +Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. 
It supports a similar set of `extracting()` functions as `Span`: ```swift extension RawSpan { - public subscript(bounds: Range) -> dependsOn(self) Self { get } - public subscript(unchecked bounds: Range) -> dependsOn(self) Self { get } - - public subscript(bounds: some RangeExpression) -> dependsOn(self) Self { get } - public subscript(unchecked bounds: some RangeExpression) -> dependsOn(self) Self { get } - public subscript(x: UnboundedRange) -> dependsOn(self) Self { get } - - public subscript(offsets: Range) -> dependsOn(self) Self { get } - public subscript(uncheckedOffsets offsets: Range) -> dependsOn(self) Self { get } - - public subscript(offsets: some RangeExpression) -> dependsOn(self) Self { get } - public subscript(uncheckedOffsets offsets: some RangeExpression) -> dependsOn(self) Self { get } -} -``` - -`RawSpan` has the following functions for loading arbitrary types from the memory it represents: + /// Constructs a new span over the bytes within the supplied range of + /// positions within this span. + /// + /// The returned span's first byte is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. + /// + /// - Parameter bounds: A valid range of positions. Every position in + /// this range must be within the bounds of this `RawSpan`. + /// + /// - Returns: A `Span` over the bytes within `bounds` + public func extracting(_ bounds: Range) -> Self -```swift -extension RawSpan { + /// Constructs a new span over the bytes within the supplied range of + /// positions within this span. + /// + /// The returned span's first byte is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. + /// + /// This function does not validate `bounds`; this is an unsafe operation. + /// + /// - Parameter bounds: A valid range of positions. 
Every position in + /// this range must be within the bounds of this `RawSpan`. + /// + /// - Returns: A `Span` over the bytes within `bounds` + public func extracting(uncheckedBounds bounds: Range) -> Self - /// Returns a new instance of the given type, constructed from the raw memory - /// at the specified byte offset. + /// Constructs a new span over the bytes within the supplied range of + /// positions within this span. /// - /// The memory at `offset` bytes from the start of this `Span` - /// must be properly aligned for accessing `T` and initialized to `T` - /// or another type that is layout compatible with `T`. + /// The returned span's first byte is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. /// - /// - Parameters: - /// - offset: The offset from the start of this `Span`, in bytes. - /// `offset` must be nonnegative. The default is zero. - /// - type: The type of the instance to create. - /// - Returns: A new instance of type `T`, read from the raw bytes at - /// `offset`. The returned instance is memory-managed and unassociated - /// with the value in the memory referenced by this `Span`. - public func load( - fromByteOffset: Int = 0, as: T.Type - ) -> T + /// - Parameter bounds: A valid range of positions. Every position in + /// this range must be within the bounds of this `RawSpan`. + /// + /// - Returns: A `Span` over the bytes within `bounds` + public func extracting(_ bounds: some RangeExpression) -> Self - /// Returns a new instance of the given type, constructed from the raw memory - /// at the specified index. + /// Constructs a new span over the bytes within the supplied range of + /// positions within this span. /// - /// The memory starting at `index` must be properly aligned for accessing `T` - /// and initialized to `T` or another type that is layout compatible with `T`. 
+ /// The returned span's first byte is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. /// - /// - Parameters: - /// - index: The index into this `Span` - /// - type: The type of the instance to create. - /// - Returns: A new instance of type `T`, read from the raw bytes starting at - /// `index`. The returned instance is memory-managed and isn't associated - /// with the value in the memory referenced by this `Span`. - public func load( - from index: Index, as: T.Type - ) -> T + /// This function does not validate `bounds`; this is an unsafe operation. + /// + /// - Parameter bounds: A valid range of positions. Every position in + /// this range must be within the bounds of this `RawSpan`. + /// + /// - Returns: A `Span` over the bytes within `bounds` + public func extracting( + uncheckedBounds bounds: some RangeExpression + ) -> Self - /// Returns a new instance of the given type, constructed from the raw memory - /// at the specified byte offset. + /// Constructs a new span over all the bytes of this span. /// - /// The memory at `offset` bytes from the start of this `Span` - /// must be laid out identically to the in-memory representation of `T`. + /// The returned span's first byte is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. /// - /// - Parameters: - /// - offset: The offset from the start of this `Span`, in bytes. - /// `offset` must be nonnegative. The default is zero. - /// - type: The type of the instance to create. - /// - Returns: A new instance of type `T`, read from the raw bytes at - /// `offset`. The returned instance isn't associated - /// with the value in the memory referenced by this `Span`. - public func loadUnaligned( - fromByteOffset: Int = 0, as: T.Type - ) -> T + /// - Returns: A `RawSpan` over all the items of this span. 
+ public func extracting(_: UnboundedRange) -> Self - /// Returns a new instance of the given type, constructed from the raw memory - /// at the specified index. + /// Returns a span containing the initial bytes of this span, + /// up to the specified maximum byte count. /// - /// The memory starting at `index` must be laid out identically - /// to the in-memory representation of `T`. + /// If the maximum length exceeds the length of this span, + /// the result contains all the bytes. /// - /// - Parameters: - /// - index: The index into this `Span` - /// - type: The type of the instance to create. - /// - Returns: A new instance of type `T`, read from the raw bytes starting at - /// `index`. The returned instance isn't associated - /// with the value in the memory referenced by this `Span`. - public func loadUnaligned( - from index: Index, as: T.Type - ) -> T -``` + /// - Parameter maxLength: The maximum number of bytes to return. + /// `maxLength` must be greater than or equal to zero. + /// - Returns: A span with at most `maxLength` bytes. + public func extracting(first maxLength: Int) -> Self -**TODO**: What about unchecked variants? Those would/could be the bottom API called by data parsers which have already checked the bounds earlier (e.g. for error-throwing purposes). + /// Returns a span over all but the given number of trailing bytes. + /// + /// If the number of elements to drop exceeds the number of elements in + /// the span, the result is an empty span. + /// + /// - Parameter k: The number of bytes to drop off the end of + /// the span. `k` must be greater than or equal to zero. + /// - Returns: A span leaving off the specified number of bytes at the end. + public func extracting(droppingLast k: Int) -> Self -A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogenously as instances of `T`. + /// Returns a span containing the trailing bytes of the span, + /// up to the given maximum length. 
+ /// + /// If the maximum length exceeds the length of this span, + /// the result contains all the bytes. + /// + /// - Parameter maxLength: The maximum number of bytes to return. + /// `maxLength` must be greater than or equal to zero. + /// - Returns: A span with at most `maxLength` bytes. + public func extracting(last maxLength: Int) -> Self -```swift - /// View the memory span represented by this view as a different type + /// Returns a span over all but the given number of initial bytes. /// - /// The memory must be laid out identically to the in-memory representation of `T`. + /// If the number of bytes to drop exceeds the number of bytes in + /// the span, the result is an empty span. /// - /// - Parameters: - /// - type: The type you wish to view the memory as - /// - Returns: A new `Span` over elements of type `T` - public func view(as: T.Type) -> dependsOn(self) Span + /// - Parameter k: The number of bytes to drop from the beginning of + /// the span. `k` must be greater than or equal to zero. + /// - Returns: A span starting after the specified number of bytes. + public func extracting(droppingFirst k: Int = 1) -> Self } ``` @@ -961,14 +793,12 @@ A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homoge The below (severable) API make `RawSpan` well-suited for use in binary parsers and decoders. - #### Out of bounds errors The stdlib's lowest level (safe) interfaces, direct indexing, trap on error ([Logic failures](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#logic-failures)). Some operations, such as the key-based subscript on `Dictionary`, are expected to fail often and have no useful information to communicate other than to return `nil` ([Simple domain errors](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#simple-domain-errors)).
Data parsing is generally expected to succeed, but when it doesn't we want an error that we can propagate upwards with enough information in that we can try to recover ([Recoverable errors](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#recoverable-errors)). For example, if our data is provided in chunks of contiguous memory, we might be able to recover by buffering more bytes and trying again. - ```swift /// An error indicating that out-of-bounds access was attempted @frozen @@ -986,28 +816,24 @@ public struct OutOfBoundsError: Error { #### Index-advancing operations -The following parsing primitives - -(most general/powerful, but they require developer to manage indices) +The following parsing primitives provide useful error-reporting wrappers for loading arbitrary types from a `RawSpan`. They advance the passed-in read offset on success, relieving developers of most of the ceremony of managing the index. ```swift extension RawSpan { /// Parse an instance of `T`, advancing `position`. @inlinable public func parse( - _ position: inout Index, as t: T.Type = T.self + _ position: inout Int, as t: T.Type = T.self ) throws(OutOfBoundsError) -> T /// Parse `numBytes` of data, advancing `position`. @inlinable public func parse( - _ position: inout Index, numBytes: some FixedWidthInteger + _ position: inout Int, numBytes: some FixedWidthInteger ) throws (OutOfBoundsError) -> Self } ``` -However, they do require that a developer manage indices. - #### Cursor-mutating operations `Cursor` provides a more convenient interface to the index-advancing primitives by encapsulating the current position as well as subrange within the input in which to operate. @@ -1018,7 +844,6 @@ When parsing data, there are often multiple subranges of the data that we are pa *Alternative*: If `Cursor` does not store the subrange, it would be 3 words in size rather than 5 words. 
The developer would have to pre-slice and manage the slice, and future API on cursor could not peek outside of the subrange's bounds (e.g. checking for start-of-line). - ```swift extension RawSpan { @frozen @@ -1113,8 +938,6 @@ func parsePNGChunk( } ``` - - ## Source compatibility This proposal is additive and source-compatible with existing code. @@ -1139,16 +962,21 @@ This document proposes adding the `ContiguousStorage` protocol to the standard l Eventually we want a similar usage pattern for a `MutableSpan` as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. ##### Naming -The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and the term "View" in the API of `String`. A possible alternative name is `StorageSpan`, which mark it as a relative of C++'s `std::span`. -##### Adding `load` and `loadUnaligned` to `Span`on `Span` instead of adding `RawSpan +The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and we tried to allude to that in a previous pitch where we called this type `StorageView`. We also considered the name `StorageSpan`, but that did not add much beyond the name `Span`. 
`Span` clearly identifies itself as a relative of C++'s `std::span`. + +##### Adding `load` and `loadUnaligned` to `Span`on `Span` instead of adding `RawSpan` TKTKTK +##### A more sophisticated approach to indexing + +This is discussed more fully in the [indexing appendix](#Indexing) below. + ## Future directions ##### Defining `BorrowingIterator` with support in `for` loops -This proposal defines a `Span.Iterator` that is borrowed and non-escapable. This is not compatible with `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: +This proposal does not define an `IteratorProtocol` conformance, since it would need to be borrowed and non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: ```swift borrowing view: Span = ... @@ -1163,8 +991,8 @@ In the meantime, it is possible to loop through a `Span`'s elements by direct in func doSomething(_ e: borrowing Element) { ... } let view: Span = ... // either: -var i = view.startIndex -while i < view.endIndex { +var i = 0 +while i < view.count { doSomething(view[i]) view.index(after: &i) } @@ -1173,24 +1001,14 @@ while i < view.endIndex { for i in 0..` should provide a better, safer alternative to mutable memory However, it alone does not track initialization state of each address, and that will continue to be the responsibility of the developer. 
- ##### Delegating initialization of memory with `OutputSpan` Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. In the standard library we have `Array.init(unsafeUninitializedCapacity:initializingWith:)` and `String.init(unsafeUninitializedCapacity:initializingUTF8With:)`. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. @@ -1224,8 +1041,6 @@ Some data structures can delegate initialization of their initial memory represe Alternatively, a divide-and-conquer style initialization order might be solvable via an API layer without run-time bookkeeping, but with more complex ergonomics. - - ##### Resizable, contiguously-stored, untyped collection in the standard library The example in the [motivation](#motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. @@ -1251,11 +1066,167 @@ myStrnlen(array) // 8 This would probably consist of a new type of custom conversion in the language. A type author would provide a way to convert from their type to an owned `Span`, and the compiler would insert that conversion where needed. This would enhance readability and reduce boilerplate. ##### Interoperability with C++'s `std::span` and with llvm's `-fbounds-safety` -The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C.
These are an opportunity for better, safer interoperation with a type such as `Span`. +The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are opportunities for better, safer interoperation with Swift, via a type such as `Span`. + +## Acknowledgments +Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. -## Acknowledgments +## Appendix: Index and slicing design considerations -Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. +There are 3 potentially-desirable features of `Span`'s `Index` design: + +1. `Span` is its own slice type +2. Indices from a slice can be used on the base collection +3. Additional reuse-after-free checking + +Each of these introduces practical tradeoffs in the design. + +#### `Span` is its own slice type + +Collections which own their storage have the convention of separate slice types, such as `Array` and `String`. This has the advantage of clearly delineating storage ownership in the programming model and the disadvantage of introducing a second type through which to interact. + +`UnsafeBufferPointer` may or may not (unsafely) own its storage, and hence has a separate slice type. Its `baseAddress` has a `deallocate` method for situations in which it does (unsafely) own its storage, and that method should only be called on the `baseAddress` of the original allocation, not the start of a slice. However, many uses of `UnsafeBufferPointer` are unowned use cases, where having a separate slice type is [cumbersome](https://github.com/apple/swift/blob/bcd08c0c9a74974b4757b4b8a2d1796659b1d940/stdlib/public/core/StringComparison.swift#L175).
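As an illustration only (this sketch is not part of the proposal, and `sumOfMiddle` is a hypothetical helper), the friction of the separate slice type can be seen with existing standard library API: slicing an `UnsafeBufferPointer` produces a `Slice<UnsafeBufferPointer<Int>>` that keeps the base buffer's indices, and it must be rebased before it can be passed where a plain buffer pointer is expected.

```swift
// Hypothetical helper: sums the elements at offsets 1..<4 of `values`.
// Assumes `values` holds at least four elements.
func sumOfMiddle(_ values: [Int]) -> Int {
  precondition(values.count >= 4)
  return values.withUnsafeBufferPointer { buffer -> Int in
    // Slicing yields `Slice<UnsafeBufferPointer<Int>>`, not a buffer pointer.
    let slice: Slice<UnsafeBufferPointer<Int>> = buffer[1..<4]
    // The slice shares the base buffer's indices: it starts at 1, not 0.
    precondition(slice.startIndex == 1)
    // Code that wants an `UnsafeBufferPointer<Int>` needs an explicit rebase.
    let rebased = UnsafeBufferPointer(rebasing: slice)
    precondition(rebased.startIndex == 0 && rebased.count == 3)
    return rebased.reduce(0, +)
  }
}
```

A type that is its own slice type would avoid the rebasing step, at the cost of deciding how indices relate between a slice and its base.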
+ +`Span` does not own its storage and there is no concern about leaking larger allocations. Thus, it would benefit from being its own slice type, even if doing so increases the size of the type from 2 to 3 words (depending on other design tradeoffs discussed below). We propose making `Span` be its own slice type. + +#### Indices from a slice can be used on the base collection + +There is very strong stdlib precedent that indices from the base collection can be used in a slice and vice-versa. + +```swift +let myCollection = [0,1,2,3,4,5,6] +let idx = myCollection.index(myCollection.startIndex, offsetBy: 4) +myCollection[idx] // 4 +let slice = myCollection[idx...] // [4, 5, 6] +slice[idx] // 4 +myCollection[slice.indices] // [4, 5, 6] +``` + +Code can be written to take advantage of this fact. For example, a simplistic parser can be written as mutating methods on a slice. The slice's indices can be saved for reference into the original collection or another slice. + +```swift +extension Slice where Base == UnsafeRawBufferPointer { + mutating func parse(numBytes: Int) -> Self { + let end = index(startIndex, offsetBy: numBytes) + defer { self = self[end...] } + return self[.. Int { + parse(numBytes: MemoryLayout.stride).loadUnaligned(as: Int.self) + } + + mutating func parseHeader() -> Self { + // Comments show what happens when ran with `myCollection` + + let copy = self + parseInt() // 0 + parseInt() // 1 + parse(numBytes: 8) // [2, 0, 0, 0, 0, 0, 0, 0] + parseInt() // 3 + parse(numBytes: 7) // [4, 0, 0, 0, 0, 0, 0] + + // self: [0, 5, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0] + parseInt() // 1280 (0x00_00_05_00 little endian) + // self: [0, 6, 0, 0, 0, 0, 0, 0, 0] + + return copy[..( + _ c: C +) -> C.Element where C.Index == Int { + c[0] +} + +getFirst(myCollection) // 0 +getFirst(slice) // Fatal error: Index out of bounds +``` + +Preserving index interchange across views and the base is a nice-to-have for `Span`, and we propose keeping it. 
However, we are evaluating the tradeoffs it requires. + +#### Additional reuse-after-free checking + +`Span` bounds-checks its indices, which is important for safety. If the index is based around a pointer (instead of an offset), then bounds checks will also ensure that indices are not used with the wrong span in most situations. However, it is possible for a memory address to be reused after being freed, and using a stale index into this reused memory may introduce safety problems. + +```swift +var idx: Span.Index + +let array1: Array = ... +let span1 = array1.span +idx = span1.startIndex.advanced(by: ...) +... +// array1 is freed + +let array2: Array = ... +let span2 = array2.span +// array2 happens to be allocated within the same memory of array1 +// but with a different base address whose offset is not an even +// multiple of `MemoryLayout.stride`. + +span2[idx] // unaligned load, what happens? +``` + +If `T` is `BitwiseCopyable`, then the unaligned load is not undefined behavior, but the value that is loaded is garbage. Whether the program is well-behaved going forwards depends on whether it is resilient to getting garbage values. + +If `T` is not `BitwiseCopyable`, then the unaligned load may introduce undefined behavior. No matter how well-written the rest of the program is, it has a critical safety and security flaw. + +When the reused allocation happens to be stride-aligned, there is no undefined behavior from unaligned loads, nor are there "garbage" values in the strictest sense, but it is still reflective of a programming bug. The program may be interacting with an unexpected value. + +Bounds checks protect against critical programmer errors. It would be nice, pending engineering tradeoffs, to also protect against some reuse-after-free errors and invalid index reuse, especially those that may lead to undefined behavior.
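To picture what such additional validation might look like, here is a minimal sketch under stated assumptions: addresses are modeled as plain `Int` values so the example stays self-contained, and `isValidIndex` is a hypothetical function, not proposed API. It combines the ordinary bounds check with a check that the index sits on an element boundary, which is the condition violated by the stale index in the scenario above.

```swift
/// Hypothetical validation for a pointer-based index, modeled with
/// integer addresses. Not part of the proposed `Span` API.
func isValidIndex(
  address: Int, base: Int, count: Int, stride: Int
) -> Bool {
  precondition(stride > 0 && count >= 0)
  let distance = address - base
  // Ordinary bounds check: the address must fall inside the span.
  guard distance >= 0, distance < count * stride else { return false }
  // Extra check: the address must sit on an element boundary.
  return distance % stride == 0
}

// A span of 4 elements with a 12-byte stride, based at address 1000.
let aligned = isValidIndex(address: 1024, base: 1000, count: 4, stride: 12)
// A stale index computed from an earlier allocation based at address 1004.
let stale = isValidIndex(address: 1028, base: 1000, count: 4, stride: 12)
```

In this model `aligned` passes both checks, while `stale` is in bounds but rejected by the remainder test. That remainder operation is the extra cost such a check would add on top of bounds checking, and it does nothing for a stale index that happens to be stride-aligned.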
+ +Future improvements to microarchitecture may make reuse after free checks cheaper; however, we need something for the foreseeable future. Any validation we can do reduces the need to switch to other mitigation strategies or make other tradeoffs. + +#### Design approaches for indices + +##### Index is an offset (`Int` or a wrapper around `Int`) + +When `Index` is an offset, there is no undefined behavior from unaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. + +However, there is no protection against invalidly using an index derived from a different span, provided the offset is in-bounds. + +If `Span` is 2 words (base address and count), then indices cannot be interchanged between slices and the base span. `Span` would need to additionally store a base offset, bringing it up to 3 words in size. + +**TODO**: What's the perf impact of having a base offset? Bounds checking would need `(baseOffset..<(count &- baseOffset)).contains(i)`. + +##### Index is a pointer (wrapper around `UnsafeRawPointer`) + +When Index holds a pointer, `Span` only needs to be 2 words in size, as valid index interchange across slices falls out naturally. Additionally, invalid reuse of an index across spans will typically be caught during bounds checking. + +However, in a reuse-after-free situation, unaligned loads (i.e. undefined behavior) are possible. If stride is not a multiple of 2, then alignement checking can be expensive. Alternatively, we could choose not to detect these bugs. + +##### Index is a fat pointer (pointer and allocation ID) + +We can create a per-allocation ID (e.g. a cryptographic `UInt64`) for both `Span` and `Span.Index` to store. This makes `Span` 3 words in size and `Span.Index` 2 words in size. This provides the most protection possible against all forms of invalid index use, including reuse-after-free. + +However, making `Span.Index` be 2 words in size is unfortunate. 
`Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. + +##### Bitpacking an allocation ID hash + +As an alternative to the above, we could create a smaller hash value of an allocation ID and use that for checking. + +If index is an offset, it could use e.g. 48 bits for the offset and 16 bits for the hash value. If index is a pointer, it could use **TODO** bits for the hash value. + +**TODO**: perf impact of this approach + +We recommend going with **TODO** From 5d19eadff5414a897fdcaed66c0c9d9e133ccbb9 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 20 Jun 2024 16:39:32 -0700 Subject: [PATCH 23/73] Apply suggestions from code review Co-authored-by: Michael Ilseman --- .../nnnn-safe-shared-contiguous-storage.md | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 964196f05f..0bea719f8a 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -33,12 +33,15 @@ We want to take advantage of the features of non-escapable types to replace some ```swift let array = Array("Hello\0".utf8) + +// Old array.withUnsafeBufferPointer { // use `$0` here for direct memory access } + +// New let span: Span = array.storage // use `span` in the same scope as `array` for direct memory access -``` ## Proposed solution @@ -196,7 +199,7 @@ extension Slice: ContiguousStorage where Base: ContiguousStorage { } ``` -In addition to the the safe types above gaining the `storage` property, the `UnsafeBufferPointer` family of types will also gain access to a `storage` property. This enables interoperability if `Span`-taking API. 
While a `Span` binding created from an `UnsafeBufferPointer` exists, the memory that underlies it must not be deinitialized or deallocated. +In addition to the safe types above gaining the `storage` property, the `UnsafeBufferPointer` family of types will also gain access to a `storage` property. This enables interoperability of `Span`-taking API. While a `Span` binding created from an `UnsafeBufferPointer` exists, the memory that underlies it must not be deinitialized or deallocated. ```swift extension UnsafeBufferPointer: ContiguousStorage { @@ -248,7 +251,7 @@ public struct Span: Copyable, ~Escapable { ##### Creating a `Span`: -The initialization of a `Span` instance is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `Span`. +The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `Span`. ```swift extension Span { @@ -331,7 +334,7 @@ extension Span where Element: BitwiseCopyable { ##### `Collection`-like API: -The following typealiases, properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Collection` or `RandomAccessCollection`). The only difference with their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. +The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. 
Their semantics shall be as described where their counterpart is declared (in `Collection` or `RandomAccessCollection`). The only difference with their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. ```swift extension Span { @@ -390,7 +393,7 @@ extension Span { /// /// - Parameter bounds: A range of the collection's indices. The bounds of /// the range must be valid indices of the collection. - public subscript( + public func extracting( uncheckedBounds bounds: some RangeExpression ) -> dependsOn(self) Span } @@ -406,7 +409,7 @@ extension Span { /// /// - Parameters: /// - position: an position to validate - public boundsCheckPrecondition(_ position: Index) + public func boundsCheckPrecondition(_ position: Int) /// Traps if `bounds` is not a valid range of offsets into this `Span` /// @@ -468,7 +471,7 @@ In addition to `Span`, we propose the addition of `RawSpan`, to represent het ```swift public struct RawSpan: Copyable, ~Escapable { - internal var _start: RawSpan.Index + internal var _start: UnsafeRawPointer internal var _count: Int } ``` @@ -487,7 +490,7 @@ extension RawSpan { /// - owner: a binding whose lifetime must exceed that of /// the newly created `RawSpan`. public init?( - unsafeBytes buffer: UnsafeBufferPointer, + unsafeBytes buffer: UnsafeRawBufferPointer, owner: borrowing Owner ) -> dependsOn(owner) Self? 
From d46c815bf8a3d53b709965f70b2e615bddc1ab81 Mon Sep 17 00:00:00 2001 From: Michael Ilseman Date: Thu, 20 Jun 2024 14:59:33 -0600 Subject: [PATCH 24/73] Move byte parsing helpers into a future direction --- .../nnnn-safe-shared-contiguous-storage.md | 112 +++++------------- 1 file changed, 31 insertions(+), 81 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 0bea719f8a..9869753980 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -792,60 +792,49 @@ extension RawSpan { } ``` -### Byte parsing helpers -The below (severable) API make `RawSpan` well-suited for use in binary parsers and decoders. +## Source compatibility -#### Out of bounds errors +This proposal is additive and source-compatible with existing code. -The stdlib's lowest level (safe) interfaces, direct indexing, trap on error ([Logic failures](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#logic-failures)). Some operations, such as the key-based subcript on `Dictionary`, are expected to fail often and have no useful information to communicate other than to return `nil` ([Simple domain errors](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#simple-domain-errors)). +## ABI compatibility -Data parsing is generally expected to succeed, but when it doesn't we want an error that we can propagate upwards with enough information in that we can try to recover ([Recoverable errors](https://github.com/apple/swift/blob/main/docs/ErrorHandlingRationale.md#recoverable-errors)). For example, if our data is provided in chunks of contiguous memory, we might be able to recover by buffering more bytes and trying again. +This proposal is additive and ABI-compatible with existing code. 
-```swift -/// An error indicating that out-of-bounds access was attempted -@frozen -public struct OutOfBoundsError: Error { - /// The number of elements expected - public var expected: Int +## Implications on adoption - /// The number of elements found - public var has: Int +The additions described in this proposal require a new version of the standard library and runtime. - @inlinable - public init(expected: Int, has: Int) -} -``` +## Alternatives considered -#### Index-advancing operations +##### Make `Span` a noncopyable type +Making `Span` non-copyable was in the early vision of this type. However, we found that would make `Span` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. -The following parsing primitives provide useful error-reporting wrappers for loading arbitrary types from a `RawSpan`. They advance the passed-in read offset on success, relieving developers of most of the ceremony of managing the index. +##### A protocol in addition to `ContiguousStorage` for unsafe buffers +This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. The lifetime constraint only applies to the binding used to obtain a `Span`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `Span` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `Span` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. 
For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. -```swift -extension RawSpan { - /// Parse an instance of `T`, advancing `position`. - @inlinable - public func parse( - _ position: inout Int, as t: T.Type = T.self - ) throws(OutOfBoundsError) -> T +##### Use a non-escapable index type +Eventually we want a similar usage pattern for a `MutableSpan` as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. - /// Parse `numBytes` of data, advancing `position`. - @inlinable - public func parse( - _ position: inout Int, numBytes: some FixedWidthInteger - ) throws (OutOfBoundsError) -> Self -} -``` +##### Naming -#### Cursor-mutating operations +The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and we tried to allude to that in a previous pitch where we called this type `StorageView`. We also considered the name `StorageSpan`, but that did not add much beyond the name `Span`. `Span` clearly identifies itself as a relative of C++'s `std::span`. -`Cursor` provides a more convenient interface to the index-advancing primitives by encapsulating the current position as well as subrange within the input in which to operate. +##### Adding `load` and `loadUnaligned` to `Span`on `Span` instead of adding `RawSpan` -When parsing data, there are often multiple subranges of the data that we are parsing within. 
For example, when parsing an entire file, we might treat each line as a separate record, and we might individually parse different fields in each line. Knowing whether we are at the start or end of the file requires checking the file's original bounds and knowing whether we are at the start of a line requires either knowing the line's bounds or peeking-behind the record's current parse range for a newline character. +TKTKTK `Cursor` stores and manages a parsing subrange, which alleviates the developer from managing one layer of slicing. -*Alternative*: If `Cursor` does not store the subrange, it would be 3 words in size rather than 5 words. The developer would have to pre-slice and manage the slice, and future API on cursor could not peek outside of the subrange's bounds (e.g. checking for start-of-line). +##### A more sophisticated approach to indexing + +This is discussed more fully in the [indexing appendix](#Indexing) below. + +## Future directions + +### Byte parsing helpers + +A handful of helper API can make `RawSpan` better suited for binary parsers and decoders. 
```swift extension RawSpan { @@ -853,15 +842,9 @@ extension RawSpan { public struct Cursor: Copyable, ~Escapable { public let base: RawSpan - /// The range within which we parse - public let parseRange: Range - /// The current parsing position public var position: RawSpan.Index - @inlinable - public init(_ base: RawSpan, in range: Range) - @inlinable public init(_ base: RawSpan) @@ -881,19 +864,18 @@ extension RawSpan { @inlinable public var parsedBytes: RawSpan { get } - /// The number of bytes left to parse + /// The remaining bytes left to parse @inlinable - public var remainingBytes: Int { get } + public var remainingBytes: RawSpan { get } } @inlinable public func makeCursor() -> Cursor - - @inlinable - public func makeCursor(in range: Range) -> Cursor } ``` +Alternatively, if some future `RawSpan.Iterator` were 3 words in size (start, current position, and end) instead of 2 (current pointer and end), that is it were a "resettable" iterator, it could host this API instead of introducing a new `Cursor` type or concept. + #### Example: Parsing PNG The below parses [PNG Chunks](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks). @@ -941,42 +923,10 @@ func parsePNGChunk( } ``` -## Source compatibility - -This proposal is additive and source-compatible with existing code. -## ABI compatibility -This proposal is additive and ABI-compatible with existing code. -## Implications on adoption - -The additions described in this proposal require a new version of the standard library and runtime. - -## Alternatives considered - -##### Make `Span` a noncopyable type -Making `Span` non-copyable was in the early vision of this type. However, we found that would make `Span` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. 
- -##### A protocol in addition to `ContiguousStorage` for unsafe buffers -This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. The lifetime constraint only applies to the binding used to obtain a `Span`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `Span` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `Span` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. - -##### Use a non-escapable index type -Eventually we want a similar usage pattern for a `MutableSpan` as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. - -##### Naming - -The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and we tried to allude to that in a previous pitch where we called this type `StorageView`. We also considered the name `StorageSpan`, but that did not add much beyond the name `Span`. 
`Span` clearly identifies itself as a relative of C++'s `std::span`. - -##### Adding `load` and `loadUnaligned` to `Span`on `Span` instead of adding `RawSpan` - -TKTKTK - -##### A more sophisticated approach to indexing - -This is discussed more fully in the [indexing appendix](#Indexing) below. - -## Future directions +--- ##### Defining `BorrowingIterator` with support in `for` loops This proposal does not define an `IteratorProtocol` conformance, since it would need to be borrowed and non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: From d87f0417f2c2f1e70ad300ed0b0053b9c1465621 Mon Sep 17 00:00:00 2001 From: Michael Ilseman Date: Thu, 20 Jun 2024 16:33:33 -0600 Subject: [PATCH 25/73] Fill out the index appendix --- .../nnnn-safe-shared-contiguous-storage.md | 36 ++++++++----------- 1 file changed, 15 insertions(+), 21 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 9869753980..a54f8e1c3f 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -1029,6 +1029,8 @@ Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposa ## Appendix: Index and slicing design considerations +Early prototypes of this proposal defined an `Index` type, `Iterator` types, etc. We are proposing `Int`-based API and are deferring defining `Index` and `Iterator` until more of the non-escapable collection story is sorted out. The below is some of our research into different potential designs of an `Index` type. 
+ There are 3 potentially-desirable features of `Span`'s `Index` design: 1. `Span` is its own slice type @@ -1041,9 +1043,10 @@ Each of these introduces practical tradeoffs in the design. Collections which own their storage have the convention of separate slice types, such as `Array` and `String`. This has the advantage of clearly delineating storage ownership in the programming model and the disadvantage of introducing a second type through which to interact. -`UnsafeBufferPointer` may or may not (unsafely) own its storage, and hence has a separate slice type. It's `baseAddress` has a `deallocate` method for situations in which it does (unsafely) own its storage, and that method should only be called on the `baseAddress` of the original allocation, not the start of a slice. However, many uses of `UnsafeBufferPointer` are unowned use cases, where having a separate slice type is [cumbersome](https://github.com/apple/swift/blob/bcd08c0c9a74974b4757b4b8a2d1796659b1d940/stdlib/public/core/StringComparison.swift#L175). +When types do not own their storage, separate slice types can be [cumbersome](https://github.com/apple/swift/blob/bcd08c0c9a74974b4757b4b8a2d1796659b1d940/stdlib/public/core/StringComparison.swift#L175). The reason `UnsafeBufferPointer` has a separate slice type is because it wants to allow indices to be reused across slices and its `Index` is a relative offset from the start (`Int`) rather than an absolute position (such as a pointer). + +`Span` does not own its storage and there is no concern about leaking larger allocations. It would benefit from being its own slice type. -`Span` does not own its storage and there is no concern about leaking larger allocations. Thus, it would benefit from being its own slice type, even if doing so increases the size of the type from 2 to 3 words (depending on other design tradeoffs discussed below). We propose making `Span` be its own slice type. 
#### Indices from a slice can be used on the base collection @@ -1116,11 +1119,10 @@ getFirst(myCollection) // 0 getFirst(slice) // Fatal error: Index out of bounds ``` -Preserving index interchange across views and the base is a nice-to-have for `Span`, and we propose keeping it. However, we are evaluating the tradeoffs it requires. #### Additional reuse-after-free checking -`Span` bounds-checks its indices, which is important for safety. If the index is based around a pointer (instead of an offset), then bounds checks will also ensure that indices are not used with the wrong span in most situations. However, it is possible for a memory address to be reused after being freed, and using a stale index into this reused memory may introduce safety problems. +`Span` bounds-checks its indices, which is important for safety. If the index is based around a pointer (instead of an offset), then bounds checks will also ensure that indices are not used with the wrong span in most situations. However, it is possible for a memory address to be reused after being freed and using a stale index into this reused memory may introduce safety problems. ```swift var idx: Span.Index @@ -1137,12 +1139,12 @@ let span2 = array2.span // but with a different base address whose offset is not an even // multiple of `MemoryLayout.stride`. -span2[idx] // unaligned load, what happens? +span2[idx] // misaligned load, what happens? ``` -If `T` is `BitwiseCopyable`, then the unaligned load is not undefined behavior, but the value that is loaded is garbage. Whether the program is well-behaved going forwards depends on whether it is resilient to getting garbage values. +If `T` is `BitwiseCopyable`, then the misaligned load is not undefined behavior, but the value that is loaded is garbage. Whether the program is well-behaved going forwards depends on whether it is resilient to getting garbage values. -If `T` is not `BitwiseCopyable`, then the unaligned load may introduce undefined behavior. 
No matter how well-written the rest of the program is, it has a critical safety and security flaw. +If `T` is not `BitwiseCopyable`, then the misaligned load may introduce undefined behavior. No matter how well-written the rest of the program is, it has a critical safety and security flaw. When the reused allocation happens to be stride-aligned, there is no undefined behavior from undefined loads, nor are there "garbage" values in the strictest sense, but it is still reflective of a programming bug. The program may be interacting with an unexpected value. @@ -1154,32 +1156,24 @@ Future improvements to microarchitecture may make reuse after free checks cheape ##### Index is an offset (`Int` or a wrapper around `Int`) -When `Index` is an offset, there is no undefined behavior from unaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. +When `Index` is an offset, there is no undefined behavior from misaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. However, there is no protection against invalidly using an index derived from a different span, provided the offset is in-bounds. -If `Span` is 2 words (base address and count), then indices cannot be interchanged between slices and the base span. `Span` would need to additionally store a base offset, bringing it up to 3 words in size. - -**TODO**: What's the perf impact of having a base offset? Bounds checking would need `(baseOffset..<(count &- baseOffset)).contains(i)`. +Since `Span` is 2 words (base address and count), indices cannot be interchanged between slices and the base span. In order to do so, `Span` would need to additionally store a base offset, bringing it up to 3 words in size. ##### Index is a pointer (wrapper around `UnsafeRawPointer`) When Index holds a pointer, `Span` only needs to be 2 words in size, as valid index interchange across slices falls out naturally. 
Additionally, invalid reuse of an index across spans will typically be caught during bounds checking. -However, in a reuse-after-free situation, unaligned loads (i.e. undefined behavior) are possible. If stride is not a multiple of 2, then alignement checking can be expensive. Alternatively, we could choose not to detect these bugs. +However, in a reuse-after-free situation, misaligned loads (i.e. undefined behavior) are possible. If stride is not a multiple of 2, then alignment checking can be expensive. Alternatively, we could choose not to detect these bugs. ##### Index is a fat pointer (pointer and allocation ID) -We can create a per-allocation ID (e.g. a cryptographic `UInt64`) for both `Span` and `Span.Index` to store. This makes `Span` 3 words in size and `Span.Index` 2 words in size. This provides the most protection possible against all forms of invalid index use, including reuse-after-free. - -However, making `Span.Index` be 2 words in size is unfortunate. `Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. - -##### Bitpacking an allocation ID hash +We can create a per-allocation ID (e.g. a cryptographic `UInt64`) for both `Span` and `Span.Index` to store. This would make `Span` 3 words in size and `Span.Index` 2 words in size. This provides the most protection possible against all forms of invalid index use, including reuse-after-free. However, making `Span` be 3 words and `Span.Index` 2 words for this feature is unfortunate. -As an alternative to the above, we could create a smaller hash value of an allocation ID and use that for checking. +We could instead go with 2 word `Span` and 2 word `Span.Index` by storing the span's `baseAddress` in the `Index`'s second word. This will detect invalid reuse of indices across spans in addition to misaligned reuse-after-free errors. 
However, indices could not be interchanged without a way for the slice type to know the original span's base address (e.g. through a separate slice type or making `Span` 3 words in size). -If index is an offset, it could use e.g. 48 bits for the offset and 16 bits for the hash value. If index is a pointer, it could use **TODO** bits for the hash value. +In either approach, making `Span.Index` be 2 words in size is unfortunate. `Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. -**TODO**: perf impact of this approach -We recommend going with **TODO** From a5239b4a2044864d37d656f3321e799d1c300884 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 20 Jun 2024 18:12:22 -0700 Subject: [PATCH 26/73] tweaks and corrections --- .../nnnn-safe-shared-contiguous-storage.md | 210 +++++++++++------- 1 file changed, 129 insertions(+), 81 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index a54f8e1c3f..4b78009ca4 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -42,6 +42,7 @@ array.withUnsafeBufferPointer { // New let span: Span = array.storage // use `span` in the same scope as `array` for direct memory access +``` ## Proposed solution @@ -71,7 +72,7 @@ extension Hypothetical Base64Decoder { } ``` -Advanced libraries might add use an inlinable generic-dispatch interface in addition to a concrete interface defined in terms of `Span` +Advanced libraries might add an inlinable generic-dispatch interface in addition to a concrete interface defined in terms of `Span`, serving as adaptor code for top-level API. 
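As a rough sketch of that layering, the following pairs a concrete, non-generic implementation with a thin inlinable generic adaptor. Because `Span` is only proposed here, the concrete function takes `UnsafeBufferPointer<UInt8>` as a stand-in; `checksum` and `checksumContiguous` are hypothetical names chosen for illustration, not API from the proposal.

```swift
// Concrete, non-generic implementation. Under the proposal this parameter
// would be a `Span` of bytes; `UnsafeBufferPointer` is a stand-in here.
public func checksumContiguous(_ bytes: UnsafeBufferPointer<UInt8>) -> UInt32 {
  var sum: UInt32 = 0
  for byte in bytes { sum &+= UInt32(byte) }  // wrapping add, never traps
  return sum
}

// Thin inlinable adaptor: generic at the API surface, concrete underneath.
@inlinable
public func checksum(_ bytes: some Sequence<UInt8>) -> UInt32 {
  // Fast path: the sequence can expose its contiguous storage directly.
  if let result = bytes.withContiguousStorageIfAvailable({ checksumContiguous($0) }) {
    return result
  }
  // Slow path: copy into an array, then reuse the same concrete code.
  return Array(bytes).withUnsafeBufferPointer { checksumContiguous($0) }
}

// An array takes the fast path; a stride sequence takes the copying path.
let fromArray = checksum([1, 2, 3] as [UInt8])
let fromStride = checksum(stride(from: UInt8(1), through: 3, by: 1))
```

Only the small adaptor is exposed for inlining and specialization; the byte-processing loop is compiled once against a single concrete type.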
### `RawSpan` @@ -96,7 +97,6 @@ It provides a buffer-like interface to the elements stored in that span of memor extension Span { public var count: Int { get } public var isEmpty: Bool { get } - public var indices: Range { get } subscript(_ position: Int) -> Element { get } } @@ -117,7 +117,7 @@ The first element of a given span is _always_ at position zero, and its last it As a side-effect of not conforming to `Collection` or `Sequence`, `Span` is not directly supported by `for` loops at this time. It is, however, easy to use in a `for` loop via indexing: ```swift -for i in mySpan.indices { +for i in 0..: Copyable, ~Escapable { - internal var _start: Span.Index + internal var _start: UnsafePointer internal var _count: Int } ``` @@ -265,10 +265,10 @@ extension Span { /// - buffer: an `UnsafeBufferPointer` to initialized elements. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. - public init?( + public init( unsafeElements buffer: UnsafeBufferPointer, owner: borrowing Owner - ) -> dependsOn(owner) Self? + ) -> dependsOn(owner) Self /// Unsafely create a `Span` over initialized memory. /// @@ -340,7 +340,6 @@ The following properties, functions and subscripts have direct counterparts in t extension Span { public var count: Int { get } public var isEmpty: Bool { get } - public var indices: Range { get } public subscript(_ position: Int) -> dependsOn(self) Element { get } } @@ -352,9 +351,84 @@ In SE-0437, `UnsafeBufferPointer`'s slicing subscript was replaced by the `extra ```swift extension Span { - func extracting(_ bounds: Range) -> dependsOn(self) Self - func extracting(_ bounds: some RangeExpression) -> dependsOn(self) Self - func extracting(_: UnboundedRange) -> dependsOn(self) Self + /// Constructs a new span over the items within the supplied range of + /// positions within this span. 
+ /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. + /// + /// - Parameter bounds: A valid range of positions. Every position in + /// this range must be within the bounds of this `Span`. + /// + /// - Returns: A `Span` over the items within `bounds` + func extracting(_ bounds: Range) -> Self + + /// Constructs a new span over the items within the supplied range of + /// positions within this span. + /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. + /// + /// - Parameter bounds: A valid range of positions. Every position in + /// this range must be within the bounds of this `Span`. + /// + /// - Returns: A `Span` over the items within `bounds` + func extracting(_ bounds: some RangeExpression) -> Self + + /// Constructs a new span over all the items of this span. + /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. + /// + /// - Returns: A `Span` over all the items of this span. + func extracting(_: UnboundedRange) -> Self + + // extracting prefixes and suffixes + + /// Returns a span containing the initial elements of this span, + /// up to the specified maximum length. + /// + /// If the maximum length exceeds the length of this span, + /// the result contains all the elements. + /// + /// - Parameter maxLength: The maximum number of elements to return. + /// `maxLength` must be greater than or equal to zero. + /// - Returns: A span with at most `maxLength` elements. + borrowing public func extracting(first maxLength: Int) -> Self + + /// Returns a span over all but the given number of trailing elements. 
+ /// + /// If the number of elements to drop exceeds the number of elements in + /// the span, the result is an empty span. + /// + /// - Parameter k: The number of elements to drop off the end of + /// the span. `k` must be greater than or equal to zero. + /// - Returns: A span leaving off the specified number of elements at the end. + borrowing public func extracting(droppingLast k: Int) -> Self + + /// Returns a span containing the final elements of the span, + /// up to the given maximum length. + /// + /// If the maximum length exceeds the length of this span, + /// the result contains all the elements. + /// + /// - Parameter maxLength: The maximum number of elements to return. + /// `maxLength` must be greater than or equal to zero. + /// - Returns: A span with at most `maxLength` elements. + borrowing public func extracting(last maxLength: Int) -> Self + + /// Returns a span over all but the given number of initial elements. + /// + /// If the number of elements to drop exceeds the number of elements in + /// the span, the result is an empty span. + /// + /// - Parameter k: The number of elements to drop from the beginning of + /// the span. `k` must be greater than or equal to zero. + /// - Returns: A span starting after the specified number of elements. + borrowing public func extracting(droppingFirst k: Int = 1) -> Self } ``` @@ -370,32 +444,39 @@ extension Span { /// /// This subscript does not validate `position`; this is an unsafe operation. /// - /// - Parameter position: The position of the element to access. `position` - /// must be a valid index that is not equal to the `endIndex` property. - public subscript( - unchecked position: Index - ) -> dependsOn(self) Element { get } + /// - Parameter position: The offset of the element to access. `position` + /// must be greater or equal to zero, and less than `count`. 
+ public subscript(unchecked position: Int) -> Element { get } - /// Accesses a contiguous subrange of the elements represented by this `Span` + /// Constructs a new span over the items within the supplied range of + /// positions within this span. + /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. + /// + /// This function does not validate `bounds`; this is an unsafe operation. /// - /// This subscript does not validate `bounds`; this is an unsafe operation. + /// - Parameter bounds: A valid range of positions. Every position in + /// this range must be within the bounds of this `Span`. /// - /// - Parameter bounds: A range of the collection's indices. The bounds of - /// the range must be valid indices of the collection. - public extracting( - uncheckedBounds bounds: Range - ) -> dependsOn(self) Self + /// - Returns: A `Span` over the items within `bounds` + public func extracting(uncheckedBounds bounds: Range) -> Self - /// Accesses the contiguous subrange of the elements represented by - /// this `Span`, specified by a range expression. + /// Constructs a new span over the items within the supplied range of + /// positions within this span. /// - /// This subscript does not validate `bounds`; this is an unsafe operation. + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not generally share their indices with the + /// span from which they are extracted. /// - /// - Parameter bounds: A range of the collection's indices. The bounds of - /// the range must be valid indices of the collection. - public func extracting( - uncheckedBounds bounds: some RangeExpression - ) -> dependsOn(self) Span + /// This function does not validate `bounds`; this is an unsafe operation. + /// + /// - Parameter bounds: A valid range of positions. 
Every position in + /// this range must be within the bounds of this `Span`. + /// + /// - Returns: A `Span` over the items within `bounds` + public func extracting(uncheckedBounds bounds: some RangeExpression) -> Self } ``` @@ -415,7 +496,7 @@ extension Span { /// /// - Parameters: /// - position: a range of positions to validate - public boundsCheckPrecondition(_ bounds: Range) + public boundsCheckPrecondition(_ bounds: Range) } ``` @@ -489,10 +570,10 @@ extension RawSpan { /// - buffer: an `UnsafeRawBufferPointer` to initialized memory. /// - owner: a binding whose lifetime must exceed that of /// the newly created `RawSpan`. - public init?( + public init( unsafeBytes buffer: UnsafeRawBufferPointer, owner: borrowing Owner - ) -> dependsOn(owner) Self? + ) -> dependsOn(owner) Self /// Unsafely create a `RawSpan` over initialized memory. /// @@ -506,8 +587,8 @@ extension RawSpan { /// - owner: a binding whose lifetime must exceed that of /// the newly created `RawSpan`. public init( - unsafeRawPointer pointer: UnsafeRawPointer, - count: Int, + unsafeStart pointer: UnsafeRawPointer, + byteCount: Int, owner: borrowing Owner ) -> dependsOn(owner) Self @@ -604,7 +685,8 @@ extension RawSpan { A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogenously as instances of `T`. ```swift - /// View the memory span represented by this view as a different type +extension RawSpan { + /// View the memory span represented by this view as a different type /// /// The memory must be laid out identically to the in-memory representation of `T`. /// @@ -620,15 +702,11 @@ A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homoge ```swift extension RawSpan { /// The number of bytes in the span. - public var count: Int { get } + public var byteCount: Int { get } /// A Boolean value indicating whether the span is empty. public var isEmpty: Bool { get } - /// The indices that are valid for subscripting the span, in ascending - /// order. 
- public var indices: Range { get } - /// Traps if `offset` is not a valid offset into this `RawSpan` /// /// - Parameters: @@ -643,41 +721,9 @@ extension RawSpan { } ``` -##### Index validation utiliities: - -Every time `RawSpan` uses an index or an integer offset, it checks for their validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: - -```swift -extension RawSpan { - /// Traps if `position` is not a valid index for this `RawSpan` - /// - /// - Parameters: - /// - position: an Index to validate - public boundsCheckPrecondition(_ position: Index) - - /// Traps if `bounds` is not a valid range of indices for this `RawSpan` - /// - /// - Parameters: - /// - bounds: a range of indices to validate - public boundsCheckPrecondition(_ bounds: Range) - - /// Traps if `offset` is not a valid offset into this `RawSpan` - /// - /// - Parameters: - /// - offset: an offset to validate - public boundsCheckPrecondition(offset: Int) - - /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` - /// - /// - Parameters: - /// - offsets: a range of offsets to validate - public boundsCheckPrecondition(offsets: Range) -} -``` - ##### Accessing subranges of elements: -Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports a similar set of `extracting()` functions as `Span`: +Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports the same set of `extracting()` functions as `Span`: ```swift extension RawSpan { @@ -691,7 +737,7 @@ extension RawSpan { /// - Parameter bounds: A valid range of positions. Every position in /// this range must be within the bounds of this `RawSpan`. 
/// - /// - Returns: A `Span` over the bytes within `bounds` + /// - Returns: A span over the bytes within `bounds` public func extracting(_ bounds: Range) -> Self /// Constructs a new span over the bytes within the supplied range of @@ -706,7 +752,7 @@ extension RawSpan { /// - Parameter bounds: A valid range of positions. Every position in /// this range must be within the bounds of this `RawSpan`. /// - /// - Returns: A `Span` over the bytes within `bounds` + /// - Returns: A span over the bytes within `bounds` public func extracting(uncheckedBounds bounds: Range) -> Self /// Constructs a new span over the bytes within the supplied range of @@ -719,7 +765,7 @@ extension RawSpan { /// - Parameter bounds: A valid range of positions. Every position in /// this range must be within the bounds of this `RawSpan`. /// - /// - Returns: A `Span` over the bytes within `bounds` + /// - Returns: A span over the bytes within `bounds` public func extracting(_ bounds: some RangeExpression) -> Self /// Constructs a new span over the bytes within the supplied range of @@ -734,7 +780,7 @@ extension RawSpan { /// - Parameter bounds: A valid range of positions. Every position in /// this range must be within the bounds of this `RawSpan`. /// - /// - Returns: A `Span` over the bytes within `bounds` + /// - Returns: A span over the bytes within `bounds` public func extracting( uncheckedBounds bounds: some RangeExpression ) -> Self @@ -745,9 +791,11 @@ extension RawSpan { /// slices, extracted spans do not generally share their indices with the /// span from which they are extracted. /// - /// - Returns: A `RawSpan` over all the items of this span. + /// - Returns: A span over all the bytes of this span. public func extracting(_: UnboundedRange) -> Self + // extracting prefixes and suffixes + /// Returns a span containing the initial bytes of this span, /// up to the specified maximum byte count. 
/// @@ -843,7 +891,7 @@ extension RawSpan { public let base: RawSpan /// The current parsing position - public var position: RawSpan.Index + public var position: Int @inlinable public init(_ base: RawSpan) @@ -929,7 +977,7 @@ func parsePNGChunk( --- ##### Defining `BorrowingIterator` with support in `for` loops -This proposal does not define an `IteratorProtocol` conformance, since it would need to be borrowed and non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: +This proposal does not define an `IteratorProtocol` conformance, since it would need to be borrowed and non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: ```swift borrowing view: Span = ... 
From 99f305ac5c7ca0e8c0b1a44eab628b5a2ec94019 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 20 Jun 2024 18:22:53 -0700 Subject: [PATCH 27/73] add missing keywords --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 4b78009ca4..0434de1885 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -490,13 +490,13 @@ extension Span { /// /// - Parameters: /// - position: an position to validate - public boundsCheckPrecondition(_ position: Int) + public func boundsCheckPrecondition(_ position: Int) /// Traps if `bounds` is not a valid range of offsets into this `Span` /// /// - Parameters: /// - position: a range of positions to validate - public boundsCheckPrecondition(_ bounds: Range) + public func boundsCheckPrecondition(_ bounds: Range) } ``` From b3db4b4ca38a365ca40c3316a6a96f6a44098ffc Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 21 Jun 2024 08:09:47 -0700 Subject: [PATCH 28/73] Apply editing suggestions from review Co-authored-by: Karoy Lorentey --- .../nnnn-safe-shared-contiguous-storage.md | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 0434de1885..7ad0c353ff 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -518,9 +518,9 @@ extension Span { /// for the `withUnsafeBufferPointer(_:)` method. The closure's /// parameter is valid only for the duration of its execution. /// - Returns: The return value of the `body` closure parameter. 
- func withUnsafeBufferPointer( - _ body: (_ buffer: UnsafeBufferPointer) -> Result - ) -> Result + func withUnsafeBufferPointer( + _ body: (_ buffer: UnsafeBufferPointer) throws(E) -> Result + ) throws(E) -> Result } extension Span where Element: BitwiseCopyable { @@ -538,9 +538,9 @@ extension Span where Element: BitwiseCopyable { /// The closure's parameter is valid only for the duration of /// its execution. /// - Returns: The return value of the `body` closure parameter. - func withUnsafeBytes( - _ body: (_ buffer: UnsafeRawBufferPointer) -> Result - ) -> Result + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result + ) throws(E) -> Result } ``` @@ -597,7 +597,6 @@ extension RawSpan { /// - Parameters: /// - span: An existing `Span`, which will define both this /// `RawSpan`'s lifetime and the memory it represents. - @inlinable @inline(__always) public init( _ span: borrowing Span ) -> dependsOn(span) Self @@ -747,7 +746,7 @@ extension RawSpan { /// slices, extracted spans do not generally share their indices with the /// span from which they are extracted. /// - /// This function does not validate `bounds`; this is an unsafe operation. + /// This function may not always validate `bounds`; this is an unsafe operation. /// /// - Parameter bounds: A valid range of positions. Every position in /// this range must be within the bounds of this `RawSpan`. @@ -999,7 +998,7 @@ while i < view.count { } // ...or: -for i in 0.. 
Date: Fri, 21 Jun 2024 18:24:43 -0700 Subject: [PATCH 29/73] annotation adjustments, various edits --- .../nnnn-safe-shared-contiguous-storage.md | 166 +++++++++++------- 1 file changed, 99 insertions(+), 67 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 7ad0c353ff..a0680b0a39 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -15,7 +15,16 @@ We introduce `Span`, an abstraction for container-agnostic access to contiguo In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, without compromising Swift's memory safety. -This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/apple/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/apple/swift-evolution/pull/2305). This proposal also supersedes the rejected proposal [SE-0256](https://github.com/apple/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). The overall feature of ownership and lifetime constraints has previously been discussed in the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. 
+This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/apple/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/swiftlang/swift-evolution/pull/2305), as well as related to the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. This proposal is also related to the following proposals: +- [SE-0426] BitwiseCopyable +- [SE-0427] Noncopyable generics +- [SE-0377] `borrowing` and `consuming` parameter ownership modifiers +- [SE-0256] `{Mutable}ContiguousCollection` protocol (rejected, superseded by this proposal) + +[SE-0426]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0426-bitwise-copyable.md +[SE-0427]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md +[SE-0377]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0377-parameter-ownership-modifiers.md +[SE-0256]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md ## Motivation @@ -94,11 +103,11 @@ public struct Span: Copyable, ~Escapable { It provides a buffer-like interface to the elements stored in that span of memory: ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { public var count: Int { get } public var isEmpty: Bool { get } - subscript(_ position: Int) -> Element { get } + subscript(_ position: Int) -> Element { _read } } ``` @@ -107,7 +116,7 @@ Note that `Span` does _not_ conform to `Collection`. 
This is because `Collection `Span`s representing subsets of consecutive elements can be extracted out of a larger `Span` with an API similar to the recently added `extracting()` functions of `UnsafeBufferPointer`: ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { public func extracting(_ bounds: Range) -> Self } ``` @@ -225,21 +234,19 @@ extension UnsafeMutableRawBufferPointer: ContiguousStorage { `Span` has an unsafe hatch for use with unsafe code. ```swift -extension Span { - func withUnsafeBufferPointer( - _ body: (_ buffer: UnsafeBufferPointer) -> Result - ) -> Result +extension Span where Element: ~Copyable & ~Escapable { + func withUnsafeBufferPointer( + _ body: (_ buffer: UnsafeBufferPointer) throws(E) -> Result + ) throws(E) -> Result } extension Span where Element: BitwiseCopyable { - func withUnsafeBytes( - _ body: (_ buffer: UnsafeRawBufferPointer) -> Result - ) -> Result + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result + ) throws(E) -> Result } ``` - - ### Complete `Span` API: ```swift @@ -254,7 +261,7 @@ public struct Span: Copyable, ~Escapable { The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `Span`. ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { /// Unsafely create a `Span` over initialized memory. /// @@ -265,10 +272,10 @@ extension Span { /// - buffer: an `UnsafeBufferPointer` to initialized elements. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. - public init( + public init( unsafeElements buffer: UnsafeBufferPointer, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) /// Unsafely create a `Span` over initialized memory. 
/// @@ -281,11 +288,11 @@ extension Span { /// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. - public init( + public init( unsafeStart pointer: UnsafePointer, count: Int, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) } extension Span where Element: BitwiseCopyable { @@ -304,10 +311,10 @@ extension Span where Element: BitwiseCopyable { /// - type: the type to use when interpreting the bytes in memory. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. - public init( + public init( unsafeBytes buffer: UnsafeRawBufferPointer, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) /// Unsafely create a `Span` over a span of initialized memory. /// @@ -324,24 +331,24 @@ extension Span where Element: BitwiseCopyable { /// - count: the number of initialized elements in the view. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. - public init( + public init( unsafeStart pointer: UnsafeRawPointer, byteCount: Int, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) } ``` -##### `Collection`-like API: +##### Basic API: The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Collection` or `RandomAccessCollection`). The only difference with their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. 
```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { public var count: Int { get } public var isEmpty: Bool { get } - public subscript(_ position: Int) -> dependsOn(self) Element { get } + public subscript(_ position: Int) -> Element { _read } } ``` @@ -350,7 +357,7 @@ extension Span { In SE-0437, `UnsafeBufferPointer`'s slicing subscript was replaced by the `extracting(_ bounds:)` functions, due to the copyability assumption baked into the standard library's `Slice` type. `Span` has similar requirements as `UnsafeBufferPointer`, in that it must be possible to obtain an instance representing a subrange of the same memory as another `Span`. We therefore follow in the footsteps of SE-0437, and add a family of `extracting()` functions: ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { /// Constructs a new span over the items within the supplied range of /// positions within this span. /// @@ -385,9 +392,11 @@ extension Span { /// /// - Returns: A `Span` over all the items of this span. func extracting(_: UnboundedRange) -> Self - - // extracting prefixes and suffixes - +} +``` +Additionally, we add specialized versions that extract prefixes and suffixes: +```swift +extension Span where Element: ~Copyable & ~Escapable { /// Returns a span containing the initial elements of this span, /// up to the specified maximum length. /// @@ -437,7 +446,7 @@ extension Span { The `subscript` and the `extracting()` functions mentioned above all have always-on bounds checking of their parameters, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly. ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { // Unchecked subscripting and extraction /// Accesses the element at the specified `position`. 
@@ -446,7 +455,7 @@ extension Span { /// /// - Parameter position: The offset of the element to access. `position` /// must be greater or equal to zero, and less than `count`. - public subscript(unchecked position: Int) -> Element { get } + public subscript(unchecked position: Int) -> Element { _read } /// Constructs a new span over the items within the supplied range of /// positions within this span. @@ -485,7 +494,7 @@ extension Span { Every time `Span` uses a position parameter, it checks for its validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { /// Traps if `position` is not a valid offset into this `Span` /// /// - Parameters: @@ -505,7 +514,7 @@ extension Span { We provide two functions for interoperability with C or other legacy pointer-taking functions. ```swift -extension Span { +extension Span where Element: ~Copyable & ~Escapable { /// Calls a closure with a pointer to the viewed contiguous storage. /// /// The buffer pointer passed as an argument to `body` is valid only @@ -573,7 +582,7 @@ extension RawSpan { public init( unsafeBytes buffer: UnsafeRawBufferPointer, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) /// Unsafely create a `RawSpan` over initialized memory. /// @@ -590,7 +599,7 @@ extension RawSpan { unsafeStart pointer: UnsafeRawPointer, byteCount: Int, owner: borrowing Owner - ) -> dependsOn(owner) Self + ) /// Create a `RawSpan` over the memory represented by a `Span` /// @@ -599,7 +608,7 @@ extension RawSpan { /// `RawSpan`'s lifetime and the memory it represents. 
public init( _ span: borrowing Span - ) -> dependsOn(span) Self + ) } ``` @@ -692,11 +701,35 @@ extension RawSpan { /// - Parameters: /// - type: The type you wish to view the memory as /// - Returns: A new `Span` over elements of type `T` - public func view(as: T.Type) -> dependsOn(self) Span + public func view(as: T.Type) -> Span } ``` -##### Index-related Operations: +`RawSpan` provides `withUnsafeBytes` for interoperability with C or other legacy pointer-taking functions: + +```swift +extension RawSpan { + /// Calls the given closure with a pointer to the underlying bytes of + /// the viewed contiguous storage. + /// + /// The buffer pointer passed as an argument to `body` is valid only + /// during the execution of `withUnsafeBytes(_:)`. + /// Do not store or return the pointer for later use. + /// + /// - Parameter body: A closure with an `UnsafeRawBufferPointer` + /// parameter that points to the viewed contiguous storage. + /// If `body` has a return value, that value is also + /// used as the return value for the `withUnsafeBytes(_:)` method. + /// The closure's parameter is valid only for the duration of + /// its execution. + /// - Returns: The return value of the `body` closure parameter. + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result + ) throws(E) -> Result +} +``` + +##### Examining `RawSpan` bounds: ```swift extension RawSpan { @@ -746,7 +779,7 @@ extension RawSpan { /// slices, extracted spans do not generally share their indices with the /// span from which they are extracted. /// - /// This function may not always validate `bounds`; this is an unsafe operation. + /// This function does not validate `bounds`; this is an unsafe operation. /// /// - Parameter bounds: A valid range of positions. Every position in /// this range must be within the bounds of this `RawSpan`. @@ -871,15 +904,17 @@ The ideas in this proposal previously used the name `BufferView`. 
While the use TKTKTK -`Cursor` stores and manages a parsing subrange, which alleviates the developer from managing one layer of slicing. - ##### A more sophisticated approach to indexing This is discussed more fully in the [indexing appendix](#Indexing) below. ## Future directions -### Byte parsing helpers +#### Coroutine accessors + +This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. + +#### Byte parsing helpers A handful of helper API can make `RawSpan` better suited for binary parsers and decoders. @@ -921,11 +956,13 @@ extension RawSpan { } ``` +`Cursor` stores and manages a parsing subrange, which relieves the developer of managing one layer of slicing. + Alternatively, if some future `RawSpan.Iterator` were 3 words in size (start, current position, and end) instead of 2 (current pointer and end), that is it were a "resettable" iterator, it could host this API instead of introducing a new `Cursor` type or concept. -#### Example: Parsing PNG +##### Example: Parsing PNG -The below parses [PNG Chunks](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks).
+The code snippet below parses [PNG Chunks](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks): ```swift struct PNGChunk: ~Escapable { @@ -933,7 +970,7 @@ struct PNGChunk: ~Escapable { public init( _ contents: RawSpan, _ owner: borrowing Owner - ) throws (PNGValidationError) -> dependsOn(owner) Self { + ) throws (PNGValidationError) { self.contents = contents try self._validate() } @@ -958,7 +995,7 @@ struct PNGChunk: ~Escapable { func parsePNGChunk( _ span: RawSpan, _ owner: borrowing Owner -) throws -> dependsOn(owner) PNGChunk { +) throws -> PNGChunk { var cursor = span.makeCursor() let length = try cursor.parse(UInt32.self).bigEndian @@ -970,12 +1007,8 @@ func parsePNGChunk( } ``` +#### Defining `BorrowingIterator` with support in `for` loops - - ---- - -##### Defining `BorrowingIterator` with support in `for` loops This proposal does not define an `IteratorProtocol` conformance, since it would need to be borrowed and non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: ```swift @@ -998,22 +1031,22 @@ while i < view.count { } // ...or: -for i in 0 ..< view.count { +for i in 0..` +#### Safe mutations of memory with `MutableSpan` Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. @@ -1025,7 +1058,7 @@ The `UnsafeMutableBufferPointer` passed to a `withUnsafeMutableXXX` closure-styl 4. Exclusivity of writes is not enforced 5. 
Initialization of any particular memory address is not ensured -I.e., it is unsafe in all the ways `UnsafeBufferPointer`-passing closure APIs are unsafe in addition to being unsafe in exclusivity and in initialization. +In other words, it is unsafe in all the same ways as `UnsafeBufferPointer`-passing closure APIs, in addition to enforcing neither exclusivity nor initialization state. Loading an uninitialized non-`BitwiseCopyable` value leads to undefined behavior. Loading an uninitialized `BitwiseCopyable` value does not immediately lead to undefined behavior, but it produces a garbage value which may lead to misbehavior of the program. @@ -1033,21 +1066,20 @@ A `MutableSpan` should provide a better, safer alternative to mutable memory However, it alone does not track initialization state of each address, and that will continue to be the responsibility of the developer. -##### Delegating initialization of memory with `OutputSpan` - -Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. In the standard library we have `Array.init(unsafeUninitializedCapacity:initializingWith:)` and `String.init(unsafeUninitializedCapacity:initializingUTF8With:)`. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. +#### Delegating initialization of memory with `OutputSpan` -`OutputSpan` would need run-time bookkeeping (e.g. a bitvector with a bit per-address) to track initialization state to safely support random access and random-order initialization. +Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. For example, the standard library features the initializer `Array.init(unsafeUninitializedCapacity:initializingWith:)`, which depends on `UnsafeMutableBufferPointer` and is known to be error-prone.
A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. -Alternatively, a divide-and-conqueor style initialization order might be solvable via an API layer without run-time bookkeeping, but with more complex ergonomics. +We can define an `OutputSpan` type, which could support appending to the initialized portion of its underlying storage. Such an `OutputSpan` would also be a useful abstraction to pass user-allocated storage to low-level APIs such as networking calls or file I/O. -##### Resizable, contiguously-stored, untyped collection in the standard library +#### Resizable, contiguously-stored, untyped collection in the standard library The example in the [motivation](#motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. Even if `Span` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may still want to add an untyped-element equivalent of `Array` at a later time. -##### Syntactic Sugar for Automatic Conversions +#### Syntactic Sugar for Automatic Conversions + In the context of a resilient library, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead. As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `Span`.
If `Span` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. We could provide an automatic way to use a `ContiguousStorage`-conforming type with a function that takes a `Span` of the appropriate element type: @@ -1065,7 +1097,8 @@ myStrnlen(array) // 8 This would probably consist of a new type of custom conversion in the language. A type author would provide a way to convert from their type to an owned `Span`, and the compiler would insert that conversion where needed. This would enhance readability and reduce boilerplate. -##### Interopability with C++'s `std::span` and with llvm's `-fbounds-safety` +#### Interopability with C++'s `std::span` and with llvm's `-fbounds-safety` + The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are opportunities for better, safer interoperation with Swift, via a type such as `Span`. ## Acknowledgments @@ -1223,4 +1256,3 @@ We could instead go with 2 word `Span` and 2 word `Span.Index` by storing the sp In either approach, making `Span.Index` be 2 words in size is unfortunate. `Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. 
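The size concern in the paragraph above can be made concrete with a small sketch. It assumes a hypothetical two-word `ToyIndex` made of an allocation identifier and an offset; neither the type nor its field names come from the proposal, and the real `Span.Index` designs under discussion may be laid out differently.

```swift
// Hypothetical two-word index, for illustration only; NOT a type from the proposal.
struct ToyIndex: Comparable {
  var allocation: UnsafeRawPointer  // stands in for the "allocation ID"
  var offset: Int                   // position within that allocation

  static func < (lhs: ToyIndex, rhs: ToyIndex) -> Bool {
    // Comparing indices taken from different allocations is a programmer error.
    precondition(lhs.allocation == rhs.allocation, "indices from different allocations")
    return lhs.offset < rhs.offset
  }
}

// Measure sizes in machine words rather than bytes.
let word = MemoryLayout<Int>.size
let indexWords = MemoryLayout<ToyIndex>.size / word
let rangeWords = MemoryLayout<Range<ToyIndex>>.size / word

// A range stores a lower and an upper bound, so the allocation ID is stored twice.
print("index: \(indexWords) words, range: \(rangeWords) words")
```

Under these assumptions a `Range` of such indices occupies four words, two of which repeat the same allocation identifier. This is the bloat that motivates deferring a dedicated `Index` type in favor of `Int` offsets.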
- From a0d3b8780e9ae9f7b96286cf08b0e5f605e29997 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sat, 22 Jun 2024 13:26:51 -0700 Subject: [PATCH 30/73] some more edits --- .../nnnn-safe-shared-contiguous-storage.md | 22 +++++++------------ 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index a0680b0a39..01c0a4dcdf 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -894,25 +894,21 @@ Making `Span` non-copyable was in the early vision of this type. However, we fou This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. The lifetime constraint only applies to the binding used to obtain a `Span`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `Span` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `Span` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. ##### Use a non-escapable index type -Eventually we want a similar usage pattern for a `MutableSpan` as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. 
+Eventually we want a similar usage pattern for a `MutableSpan` (described [below](#MutableSpan)) as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. ##### Naming The ideas in this proposal previously used the name `BufferView`. While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is usually used in reference to transient storage. On the other hand we already have a nomenclature using the term "Storage" in the `withContiguousStorageIfAvailable()` function, and we tried to allude to that in a previous pitch where we called this type `StorageView`. We also considered the name `StorageSpan`, but that did not add much beyond the name `Span`. `Span` clearly identifies itself as a relative of C++'s `std::span`. -##### Adding `load` and `loadUnaligned` to `Span`on `Span` instead of adding `RawSpan` - -TKTKTK - ##### A more sophisticated approach to indexing This is discussed more fully in the [indexing appendix](#Indexing) below. ## Future directions -#### coroutine accessors +#### Coroutine Accessors -This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. +This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for a type to provide borrowing access to its internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. 
#### Byte parsing helpers @@ -1046,7 +1042,7 @@ Alongside this work, it may make sense to add a `Span` alternative to `withConti Some types store their internal representation in a piecewise-contiguous manner, such as trees and ropes. Some operations naturally return information in a piecewise-contiguous manner, such as network operations. These could supply results by iterating through a list of contiguous chunks of memory. -#### Safe mutations of memory with `MutableSpan` +#### Safe mutations of memory with `MutableSpan` Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. @@ -1062,15 +1058,13 @@ in other words, it is unsafe in all the same ways as `UnsafeBufferPointer`-passi Loading an uninitialized non-`BitwiseCopyable` value leads to undefined behavior. Loading an uninitialized `BitwiseCopyable` value does not immediately lead to undefined behavior, but it produces a garbage value which may lead to misbehavior of the program. -A `MutableSpan` should provide a better, safer alternative to mutable memory in the same way that `Span` provides a better, safer read-only type. `MutableSpan` would also automatically enforce exclusivity of writes. - -However, it alone does not track initialization state of each address, and that will continue to be the responsibility of the developer. +A `MutableSpan` should provide a better, safer alternative to mutable memory in the same way that `Span` provides a better, safer read-only type. `MutableSpan` would apply to initialized memory and would enforce exclusivity of writes, thereby preserving the initialization state of its memory between mutations. -#### Delegating initialization of memory with `OutputSpan` +#### Delegating initialization of memory with `OutputSpan` -Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. 
For example, the standard library features the initializer`Array.init(unsafeUninitializedCapacity:initializingWith:)`, which depends on `UnsafeMutableBufferPointer` and is known to be error-prone. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. +Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. For example, the standard library features the initializer`Array.init(unsafeUninitializedCapacity:initializingWith:)`, which depends on `UnsafeMutableBufferPointer` and is known to be error-prone. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. -We can define an `OutputSpan` type, which could support appending to the initialized portion of its underlying storage. Such an `OutputSpan` would also be a useful abstraction to pass user-allocated storage to low-level API such as networking calls or file i/o. +We can define an `OutputSpan` type, which could support appending to the initialized portion of its underlying storage. `OutputSpan` allows for uninitialized memory beyond the last position appended. Such an `OutputSpan` would also be a useful abstraction to pass user-allocated storage to low-level API such as networking calls or file I/O. #### Resizable, contiguously-stored, untyped collection in the standard library From 90890a583fd488bddad69ec4d8cebe88f8cab22e Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sat, 22 Jun 2024 14:03:23 -0700 Subject: [PATCH 31/73] move `ContiguousStorage` to future directions The state of the compiler does not allow us to propose it at this time. 
--- .../nnnn-safe-shared-contiguous-storage.md | 119 +++++++----------- 1 file changed, 45 insertions(+), 74 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 01c0a4dcdf..0ee98ff7c0 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -26,7 +26,7 @@ This proposal is related to two other features being proposed along with it: [No [SE-0377]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0377-parameter-ownership-modifiers.md [SE-0256]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md -## Motivation +## Motivation Swift needs safe and performant types for local processing over values in contiguous memory. Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()`, forcing them to use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. @@ -61,28 +61,6 @@ By relying on borrowing, `Span` can provide simultaneous access to a non-copyabl `Span` is the currency type for local processing over values in contiguous memory. 
It is the replacement for any API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that does not need to escape the value. -### `ContiguousStorage` - -A type can indicate that it can provide a `Span` by conforming to the `ContiguousStorage` protocol. `ContiguousStorage` forms a bridge between multi-type or generically-typed interfaces and a performant concrete implementation. - -For example, for the hypothetical base64 decoding library mentioned above, a possible API could be: - -```swift -extension HypotheticalBase64Decoder { - public func decode(bytes: some ContiguousStorage) -> [UInt8] -} -``` - -Even better, an interface can be defined in terms of the concrete type `Span`: - -```swift -extension Hypothetical Base64Decoder { - public func decode(bytes: Span) -> [UInt8] -} -``` - -Advanced libraries might add use an inlinable generic-dispatch interface in addition to a concrete interface defined in terms of `Span`, serving as adaptor code for top-level API. - ### `RawSpan` `RawSpan` allows sharing the contiguous internal representation for values which may be heterogenously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straight-forwards understanding of performance for library code. @@ -131,75 +109,45 @@ for i in 0..: ~Copyable, ~Escapable { - associatedtype Element: ~Copyable & ~Escapable - - var storage: Span { get } -} -``` - -The key safety feature is that a `Span` cannot escape to a scope where the value it borrowed no longer exists. - -A function that wishes to read from contiguous storage can declare a parameter type of `some ContiguousStorage`. The implementation will internally consist of a brief generic section, followed by business logic implemented in terms of a concrete `Span`. Frameworks that support library evolution (resilient frameworks) have an additional concern. 
Resilient frameworks have an ABI boundary that may differ from the API proper. Resilient frameworks may wish to adopt a pattern such as the following: - -```swift -extension MyResilientType { - // public API - @inlinable public func essentialFunction(_ a: some ContiguousStorage) -> Int { - self.essentialFunction(a.storage) - } - - // ABI boundary - public func essentialFunction(_ a: Span) -> Int { ... } -} -``` - -Here, the public function obtains the `Span` from the type that vends it in inlinable code, then calls a concrete, opaque function defined in terms of `Span`. Inlining the generic shim in the client is often a critical optimization. The need for such a pattern and related improvements are discussed in the future directions below (see [Syntactic Sugar for Automatic Conversions](#Conversions).) - #### Extensions to Standard Library and Foundation types ```swift -extension Array: ContiguousStorage { +extension Array { // note: this could borrow a temporary copy of the `Array`'s storage - var storage: Span { get } + var storage: Span { _read } } -extension ArraySlice: ContiguousStorage { +extension ArraySlice { // note: this could borrow a temporary copy of the `ArraySlice`'s storage - var storage: Span { get } + var storage: Span { _read } } -extension ContiguousArray: ContiguousStorage { +extension ContiguousArray { var storage: Span { get } } -extension Foundation.Data: ContiguousStorage { +extension Foundation.Data { var storage: Span { get } } -extension String.UTF8View: ContiguousStorage { +extension String.UTF8View { // note: this could borrow a temporary copy of the `String`'s storage - var storage: Span { get } + var storage: Span { _read } } -extension Substring.UTF8View: ContiguousStorage { +extension Substring.UTF8View { // note: this could borrow a temporary copy of the `Substring`'s storage - var storage: Span { get } + var storage: Span { _read } } -extension Character.UTF8View: ContiguousStorage { +extension Character.UTF8View { // note: this 
could borrow a temporary copy of the `Character`'s storage - var storage: Span { get } + var storage: Span { _read } } -extension SIMD: ContiguousStorage { +extension SIMD { var storage: Span { get } } -extension KeyValuePairs: ContiguousStorage<(Self.Key, Self.Value)> { +extension KeyValuePairs { var storage: Span<(Self.Key, Self.Value)> { get } } -extension CollectionOfOne: ContiguousStorage { +extension CollectionOfOne { var storage: Span { get } } @@ -211,19 +159,19 @@ extension Slice: ContiguousStorage where Base: ContiguousStorage { In addition to the the safe types above gaining the `storage` property, the `UnsafeBufferPointer` family of types will also gain access to a `storage` property. This enables interoperability of `Span`-taking API. While a `Span` binding created from an `UnsafeBufferPointer` exists, the memory that underlies it must not be deinitialized or deallocated. ```swift -extension UnsafeBufferPointer: ContiguousStorage { +extension UnsafeBufferPointer { // note: additional preconditions apply until the end of the scope var storage: Span { get } } -extension UnsafeMutableBufferPointer: ContiguousStorage { +extension UnsafeMutableBufferPointer { // note: additional preconditions apply until the end of the scope var storage: Span { get } } -extension UnsafeRawBufferPointer: ContiguousStorage { +extension UnsafeRawBufferPointer { // note: additional preconditions apply until the end of the scope var storage: Span { get } } -extension UnsafeMutableRawBufferPointer: ContiguousStorage { +extension UnsafeMutableRawBufferPointer { // note: additional preconditions apply until the end of the scope var storage: Span { get } } @@ -1072,9 +1020,32 @@ The example in the [motivation](#motivation) section mentions the `Foundation.Da Even if `Span` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) 
We may still want to add an untyped-element equivalent of `Array` at a later time. +#### A `ContiguousStorage` protocol + +An earlier version of this proposal proposed a `ContiguousStorage` protocol by which a type could indicate that it can provide a `Span`. `ContiguousStorage` would form a bridge between generically-typed interfaces and a performant concrete implementation. + +For example, for the hypothetical base64 decoding library mentioned in the [motivation](#Motivation) section, a possible API could be: + +```swift +extension HypotheticalBase64Decoder { + public func decode(bytes: some ContiguousStorage) -> [UInt8] +} +``` + +`ContiguousStorage` would have the following definition: + +```swift +public protocol ContiguousStorage: ~Copyable, ~Escapable { + associatedtype Element: ~Copyable & ~Escapable + var storage: Span { _read } +} +``` + +Two issues prevent us from proposing it at this time: (a) the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427], and (b) we cannot declare a `_read` accessor as a protocol requirement, since `_read` is not considered stable. + #### Syntactic Sugar for Automatic Conversions -In the context of a resilient library, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead. As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `Span`. If `Span` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. +Even with a `ContiguousStorage` protocol, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead to resilient libraries. As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `Span`. 
If `Span` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. We could provide an automatic way to use a `ContiguousStorage`-conforming type with a function that takes a `Span` of the appropriate element type: @@ -1101,7 +1072,7 @@ Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposa -## Appendix: Index and slicing design considerations +### Appendix: Index and slicing design considerations Early prototypes of this proposal defined an `Index` type, `Iterator` types, etc. We are proposing `Int`-based API and are deferring defining `Index` and `Iterator` until more of the non-escapable collection story is sorted out. The below is some of our research into different potential designs of an `Index` type. From 2d463ab216d3f41bf87955054d66682e5c98be57 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 24 Jun 2024 20:19:47 -0700 Subject: [PATCH 32/73] edits about unsafe initializer usage --- .../nnnn-safe-shared-contiguous-storage.md | 30 +++++++------------ 1 file changed, 11 insertions(+), 19 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 0ee98ff7c0..022fbc2686 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -156,30 +156,22 @@ extension Slice: ContiguousStorage where Base: ContiguousStorage { } ``` -In addition to the the safe types above gaining the `storage` property, the `UnsafeBufferPointer` family of types will also gain access to a `storage` property. This enables interoperability of `Span`-taking API. While a `Span` binding created from an `UnsafeBufferPointer` exists, the memory that underlies it must not be deinitialized or deallocated. 
+#### Using `Span` with C functions or other unsafe code: + +The `UnsafeBufferPointer` family of types can be be adapted for use with `Span`-taking API by using unsafe `Span` initializers. A `Span` instance obtained this way loses a static guarantee of temporal safety, because it is possible to deinitialize or deallocate the source `UnsafeMutableBufferPointer` before the end of the `Span` instance's scope. ```swift -extension UnsafeBufferPointer { - // note: additional preconditions apply until the end of the scope - var storage: Span { get } -} -extension UnsafeMutableBufferPointer { - // note: additional preconditions apply until the end of the scope - var storage: Span { get } -} -extension UnsafeRawBufferPointer { - // note: additional preconditions apply until the end of the scope - var storage: Span { get } +extension HypotheticalBase64Decoder { + public func decode(bytes: Span) -> [UInt8] } -extension UnsafeMutableRawBufferPointer { - // note: additional preconditions apply until the end of the scope - var storage: Span { get } + +data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) in + let span = Span(unsafeBytes: buffer, owner: buffer) + let decoded = myBase64Decoder.decode(span) } ``` -#### Using `Span` with C functions or other unsafe code: - -`Span` has an unsafe hatch for use with unsafe code. +`Span` has an unsafe hatch for use with functions that take an unsafe argument: ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -206,7 +198,7 @@ public struct Span: Copyable, ~Escapable { ##### Creating a `Span`: -The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. When it is initialized correctly, subsequent uses of the borrowed instance are safe. Typically these initializers will be used internally to a container's implementation of functions or computed properties that return a borrowed `Span`. +The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. 
Typically these initializers will be used internally to a container's implementation and return a borrowed `Span` tied to the container's lifetime. Safe usage relies on a guarantee that the represented storage is managed correctly and outlives the `Span` instance. ```swift extension Span where Element: ~Copyable & ~Escapable { From c8b2d5c9771c8c7ee19075b128260d6b6e7fca56 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 24 Jun 2024 20:20:30 -0700 Subject: [PATCH 33/73] =?UTF-8?q?remove=20=E2=80=9Cgenerally=E2=80=9D=20fr?= =?UTF-8?q?om=20index-sharing=20note?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../nnnn-safe-shared-contiguous-storage.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 022fbc2686..0de82f3adb 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -302,7 +302,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// positions within this span. /// /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// - Parameter bounds: A valid range of positions. Every position in @@ -315,7 +315,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// positions within this span. /// /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// - Parameter bounds: A valid range of positions. 
Every position in @@ -327,7 +327,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// Constructs a new span over all the items of this span. /// /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// - Returns: A `Span` over all the items of this span. @@ -401,7 +401,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// positions within this span. /// /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// This function does not validate `bounds`; this is an unsafe operation. @@ -416,7 +416,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// positions within this span. /// /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// This function does not validate `bounds`; this is an unsafe operation. @@ -703,7 +703,7 @@ extension RawSpan { /// positions within this span. /// /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// - Parameter bounds: A valid range of positions. Every position in @@ -716,7 +716,7 @@ extension RawSpan { /// positions within this span. 
/// /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// This function does not validate `bounds`; this is an unsafe operation. @@ -731,7 +731,7 @@ extension RawSpan { /// positions within this span. /// /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// - Parameter bounds: A valid range of positions. Every position in @@ -744,7 +744,7 @@ extension RawSpan { /// positions within this span. /// /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// This function does not validate `bounds`; this is an unsafe operation. @@ -760,7 +760,7 @@ extension RawSpan { /// Constructs a new span over all the bytes of this span. /// /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not generally share their indices with the + /// slices, extracted spans do not share their indices with the /// span from which they are extracted. /// /// - Returns: A span over all the bytes of this span. 
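The doc comments revised in this patch state that an extracted span always starts at offset 0, unlike a buffer slice. The following sketch contrasts the two behaviors. `ToySpan` is a hypothetical stand-in written for this example (it copies into an array to stay self-contained) and is not the proposed `Span`; only `ArraySlice` is existing standard library API.

```swift
// Hypothetical stand-in for a span, for illustration only; NOT the proposed `Span`.
struct ToySpan<Element> {
  private let storage: [Element]
  private let start: Int
  let count: Int

  init(_ elements: [Element]) {
    storage = elements
    start = 0
    count = elements.count
  }

  private init(storage: [Element], start: Int, count: Int) {
    self.storage = storage
    self.start = start
    self.count = count
  }

  // Positions are always offsets from the start of *this* view.
  subscript(_ position: Int) -> Element {
    precondition(position >= 0 && position < count, "position out of bounds")
    return storage[start + position]
  }

  // The result is rebased: its first element is at offset 0.
  func extracting(_ bounds: Range<Int>) -> ToySpan {
    precondition(bounds.lowerBound >= 0 && bounds.upperBound <= count,
                 "bounds out of range")
    return ToySpan(storage: storage,
                   start: start + bounds.lowerBound,
                   count: bounds.count)
  }
}

let values = [10, 20, 30, 40, 50]

// A slice shares indices with its base collection.
let slice = values[2..<5]
print(slice.startIndex, slice[2])

// An extracted view does not: the same element is found at offset 0.
let extracted = ToySpan(values).extracting(2..<5)
print(extracted.count, extracted[0])
```

The design choice illustrated here is that code receiving an extracted view never needs to know where the view came from, at the cost of not being able to reuse offsets from the parent.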
From 859a071ee2f4b1417da8e19ecdb1b6e5ebb4bcc4 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 25 Jun 2024 12:49:58 -0700 Subject: [PATCH 34/73] improve index validation functions --- .../nnnn-safe-shared-contiguous-storage.md | 44 ++++++++++++------- 1 file changed, 29 insertions(+), 15 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 0de82f3adb..c9031247e4 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -435,17 +435,31 @@ Every time `Span` uses a position parameter, it checks for its validity, unless ```swift extension Span where Element: ~Copyable & ~Escapable { - /// Traps if `position` is not a valid offset into this `Span` + /// Return true if `offset` is a valid offset into this `Span` /// /// - Parameters: - /// - position: an position to validate - public func boundsCheckPrecondition(_ position: Int) + /// - position: an index to validate + /// - Returns: true if `offset` is a valid index + public func validateBounds(_ offset: Int) -> Bool - /// Traps if `bounds` is not a valid range of offsets into this `Span` + /// Traps if `offset` is not a valid offset into this `Span` /// /// - Parameters: - /// - position: a range of positions to validate - public func boundsCheckPrecondition(_ bounds: Range) + /// - position: an index to validate + public func assertValidity(_ offset: Int) + + /// Return true if `offsets` is a valid range of offsets into this `Span` + /// + /// - Parameters: + /// - offsets: a range of indices to validate + /// - Returns: true if `offsets` is a valid range of indices + public func validateBounds(_ offsets: Range) -> Bool + + /// Traps if `offsets` is not a valid range of offsets into this `Span` + /// + /// - Parameters: + /// - offsets: a range of indices to validate + public func assertValidity(_ offsets: Range) } ``` @@ -679,17 +693,17 @@ extension RawSpan { /// A 
Boolean value indicating whether the span is empty. public var isEmpty: Bool { get } + /// Return true if `offset` is a valid offset into this `RawSpan` + public func validateBounds(_ offset: Int) -> Bool + /// Traps if `offset` is not a valid offset into this `RawSpan` - /// - /// - Parameters: - /// - position: an offset to validate - public func boundsCheckPrecondition(_ offset: Int) + public func assertValidity(_ offset: Int) - /// Traps if `bounds` is not a valid range of offsets into this `RawSpan` - /// - /// - Parameters: - /// - offsets: a range of offsets to validate - public func boundsCheckPrecondition(_ offsets: Range) + /// Return true if `offsets` is a valid range of offsets into this `RawSpan` + public func validateBounds(_ offsets: Range) -> Bool + + /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` + public func assertValidity(_ offsets: Range) } ``` From e924bab3d24424271982613e5dec24c699bc9dd0 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 25 Jun 2024 12:51:03 -0700 Subject: [PATCH 35/73] omit some duplicated documentation --- .../nnnn-safe-shared-contiguous-storage.md | 96 +------------------ 1 file changed, 4 insertions(+), 92 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index c9031247e4..48767440ff 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -709,119 +709,31 @@ extension RawSpan { ##### Accessing subranges of elements: -Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports the same set of `extracting()` functions as `Span`: +Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports the same set of `extracting()` functions as `Span`. 
The documentation is omitted here, as it is substantially the same as for `Span`: ```swift extension RawSpan { - /// Constructs a new span over the bytes within the supplied range of - /// positions within this span. - /// - /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `RawSpan`. - /// - /// - Returns: A span over the bytes within `bounds` + public func extracting(_ bounds: Range) -> Self - /// Constructs a new span over the bytes within the supplied range of - /// positions within this span. - /// - /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// This function does not validate `bounds`; this is an unsafe operation. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `RawSpan`. - /// - /// - Returns: A span over the bytes within `bounds` public func extracting(uncheckedBounds bounds: Range) -> Self - - /// Constructs a new span over the bytes within the supplied range of - /// positions within this span. - /// - /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `RawSpan`. - /// - /// - Returns: A span over the bytes within `bounds` + public func extracting(_ bounds: some RangeExpression) -> Self - - /// Constructs a new span over the bytes within the supplied range of - /// positions within this span. 
- /// - /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// This function does not validate `bounds`; this is an unsafe operation. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `RawSpan`. - /// - /// - Returns: A span over the bytes within `bounds` + public func extracting( uncheckedBounds bounds: some RangeExpression ) -> Self - /// Constructs a new span over all the bytes of this span. - /// - /// The returned span's first byte is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Returns: A span over all the bytes of this span. public func extracting(_: UnboundedRange) -> Self // extracting prefixes and suffixes - /// Returns a span containing the initial bytes of this span, - /// up to the specified maximum byte count. - /// - /// If the maximum length exceeds the length of this span, - /// the result contains all the bytes. - /// - /// - Parameter maxLength: The maximum number of bytes to return. - /// `maxLength` must be greater than or equal to zero. - /// - Returns: A span with at most `maxLength` bytes. public func extracting(first maxLength: Int) -> Self - /// Returns a span over all but the given number of trailing bytes. - /// - /// If the number of elements to drop exceeds the number of elements in - /// the span, the result is an empty span. - /// - /// - Parameter k: The number of bytes to drop off the end of - /// the span. `k` must be greater than or equal to zero. - /// - Returns: A span leaving off the specified number of bytes at the end. public func extracting(droppingLast k: Int) -> Self - /// Returns a span containing the trailing bytes of the span, - /// up to the given maximum length. 
- /// - /// If the maximum length exceeds the length of this span, - /// the result contains all the bytes. - /// - /// - Parameter maxLength: The maximum number of bytes to return. - /// `maxLength` must be greater than or equal to zero. - /// - Returns: A span with at most `maxLength` bytes. public func extracting(last maxLength: Int) -> Self - /// Returns a span over all but the given number of initial bytes. - /// - /// If the number of elements to drop exceeds the number of bytes in - /// the span, the result is an empty span. - /// - /// - Parameter k: The number of bytes to drop from the beginning of - /// the span. `k` must be greater than or equal to zero. - /// - Returns: A span starting after the specified number of bytes. public func extracting(droppingFirst k: Int = 1) -> Self } ``` From 1844c97f09ec017308bc181b4af28630075ce3c3 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 25 Jun 2024 13:37:56 -0700 Subject: [PATCH 36/73] add html anchors to important sections --- proposals/nnnn-safe-shared-contiguous-storage.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 48767440ff..c84050ea26 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -67,7 +67,7 @@ By relying on borrowing, `Span` can provide simultaneous access to a non-copyabl `Span` can always be converted to `RawSpan`, using a conditionally-available property or a constructor. -## Detailed design +## Detailed design `Span` is a simple representation of a span of initialized memory. @@ -198,7 +198,7 @@ public struct Span: Copyable, ~Escapable { ##### Creating a `Span`: -The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. 
Typically these initializers will be used internally to a container's implementation and return a borrowed `Span` tied to the container's lifetime. Safe usage relies on a guarantee that the represented storage is managed correctly and outlives the `Span` instance. +The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. Typically these initializers will be used internally to a container's implementation and return a borrowed `Span` tied to the container's lifetime. Safe usage relies on a guarantee that the represented storage is managed correctly and outlives the `Span` instance. ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -751,7 +751,7 @@ This proposal is additive and ABI-compatible with existing code. The additions described in this proposal require a new version of the standard library and runtime. -## Alternatives considered +## Alternatives considered ##### Make `Span` a noncopyable type Making `Span` non-copyable was in the early vision of this type. However, we found that would make `Span` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. @@ -770,7 +770,7 @@ The ideas in this proposal previously used the name `BufferView`. While the use This is discussed more fully in the [indexing appendix](#Indexing) below. 
-## Future directions +## Future directions #### Coroutine Accessors From 385cccb6b3191ead3e7865d510a08b994902c67f Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 25 Jun 2024 17:22:03 -0700 Subject: [PATCH 37/73] add link to second pitch thread - with whitespace tweaks --- proposals/nnnn-safe-shared-contiguous-storage.md | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index c84050ea26..ed3871e089 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -7,7 +7,7 @@ * Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) * Bug: rdar://48132971, rdar://96837923 * Implementation: Prototyped in https://github.com/apple/swift-collections (on branch "future") -* Review: ([pitch 1](https://forums.swift.org/t/69888))([pitch 2]()) +* Review: ([pitch 1](https://forums.swift.org/t/69888))([pitch 2](https://forums.swift.org/t/72745)) ## Introduction @@ -738,7 +738,6 @@ extension RawSpan { } ``` - ## Source compatibility This proposal is additive and source-compatible with existing code. @@ -934,7 +933,7 @@ We can define an `OutputSpan` type, which could support appending to the init #### Resizable, contiguously-stored, untyped collection in the standard library -The example in the [motivation](#motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. +The example in the [motivation](#Motivation) section mentions the `Foundation.Data` type. 
There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. Even if `Span` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may still want to add an untyped-element equivalent of `Array` at a later time. @@ -988,8 +987,6 @@ The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class templa Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. - - ### Appendix: Index and slicing design considerations Early prototypes of this proposal defined an `Index` type, `Iterator` types, etc. We are proposing `Int`-based API and are deferring defining `Index` and `Iterator` until more of the non-escapable collection story is sorted out. The below is some of our research into different potential designs of an `Index` type. @@ -1010,10 +1007,9 @@ When types do not own their storage, separate slice types can be [cumbersome](ht `Span` does not own its storage and there is no concern about leaking larger allocations. It would benefit from being its own slice type. - #### Indices from a slice can be used on the base collection -There is very strong stdlib precedent that indices from the base collection can be used in a slice and vice-versa. +There is very strong stdlib precedent that indices from the base collection can be used in a slice and vice-versa. ```swift let myCollection = [0,1,2,3,4,5,6] @@ -1024,7 +1020,7 @@ slice[idx] // 4 myCollection[slice.indices] // [4, 5, 6] ``` -Code can be written to take advantage of this fact. 
For example, a simplistic parser can be written as mutating methods on a slice. The slice's indices can be saved for reference into the original collection or another slice. +Code can be written to take advantage of this fact. For example, a simplistic parser can be written as mutating methods on a slice. The slice's indices can be saved for reference into the original collection or another slice. ```swift extension Slice where Base == UnsafeRawBufferPointer { @@ -1082,7 +1078,6 @@ getFirst(myCollection) // 0 getFirst(slice) // Fatal error: Index out of bounds ``` - #### Additional reuse-after-free checking `Span` bounds-checks its indices, which is important for safety. If the index is based around a pointer (instead of an offset), then bounds checks will also ensure that indices are not used with the wrong span in most situations. However, it is possible for a memory address to be reused after being freed and using a stale index into this reused memory may introduce safety problems. @@ -1119,7 +1114,7 @@ Future improvements to microarchitecture may make reuse after free checks cheape ##### Index is an offset (`Int` or a wrapper around `Int`) -When `Index` is an offset, there is no undefined behavior from misaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. +When `Index` is an offset, there is no undefined behavior from misaligned loads because the `Span`'s base address is advanced by `MemoryLayout.stride * offset`. However, there is no protection against invalidly using an index derived from a different span, provided the offset is in-bounds. @@ -1138,4 +1133,3 @@ We can create a per-allocation ID (e.g. a cryptographic `UInt64`) for both `Span We could instead go with 2 word `Span` and 2 word `Span.Index` by storing the span's `baseAddress` in the `Index`'s second word. This will detect invalid reuse of indices across spans in addition to misaligned reuse-after-free errors. 
However, indices could not be interchanged without a way for the slice type to know the original span's base address (e.g. through a separate slice type or making `Span` 3 words in size). In either approach, making `Span.Index` be 2 words in size is unfortunate. `Range` is now 4 words in size, storing the allocation ID twice. Anything built on top of `Span` that wishes to store multiple indices is either bloated or must hand-extract the pointers and hand-manage the allocation ID. - From 5266e65ef1926f16ce7c65b8b5c773f208d09917 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 28 Jun 2024 11:25:52 -0700 Subject: [PATCH 38/73] more cleanup surrounding `ContiguousStorage` --- proposals/nnnn-safe-shared-contiguous-storage.md | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index ed3871e089..56ae108a07 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -142,17 +142,13 @@ extension Character.UTF8View { } extension SIMD { - var storage: Span { get } + var storage: Span { _read } } extension KeyValuePairs { var storage: Span<(Self.Key, Self.Value)> { get } } extension CollectionOfOne { - var storage: Span { get } -} - -extension Slice: ContiguousStorage where Base: ContiguousStorage { - var storage: Span { get } + var storage: Span { _read } } ``` @@ -755,9 +751,6 @@ The additions described in this proposal require a new version of the standard l ##### Make `Span` a noncopyable type Making `Span` non-copyable was in the early vision of this type. However, we found that would make `Span` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. 
-##### A protocol in addition to `ContiguousStorage` for unsafe buffers -This document proposes adding the `ContiguousStorage` protocol to the standard library's `Unsafe{Mutable,Raw}BufferPointer` types. On the surface this seems like whitewashing the unsafety of these types. The lifetime constraint only applies to the binding used to obtain a `Span`, and the initialization precondition can only be enforced by documentation. Nothing will prevent unsafe code from deinitializing a portion of the storage while a `Span` is alive. There is no safe bridge from `UnsafeBufferPointer` to `ContiguousStorage`. We considered having the unsafe buffer types conforming to a different version of `ContiguousStorage`, which would vend a `Span` through a closure-taking API. Unfortunately such a closure would be perfectly capable of capturing the `UnsafeBufferPointer` binding and be as unsafe as can be. For this reason, the `UnsafeBufferPointer` family will conform to `ContiguousStorage`, with safety being enforced in documentation. - ##### Use a non-escapable index type Eventually we want a similar usage pattern for a `MutableSpan` (described [below](#MutableSpan)) as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. @@ -897,7 +890,7 @@ for i in 0..: ~Copyable, ~Escapable { Two issues prevent us from proposing it at this time: (a) the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427], and (b) we cannot declare a `_read` accessor as a protocol requirement, since `_read` is not considered stable. +Many of the standard library collections could conform to `ContiguousStorage`, but we would not include the `Unsafe{Mutable,Raw}BufferPointer` types among them. Conversion of these will continue to use the explicit `Span(unsafe{Bytes,Elements,Start}:)` initializers. 
+ #### Syntactic Sugar for Automatic Conversions Even with a `ContiguousStorage` protocol, a generic entry point in terms of `some ContiguousStorage` may add unwanted overhead to resilient libraries. As detailed above, an entry point in an evolution-enabled library requires an inlinable generic public entry point which forwards to a publicly-accessible function defined in terms of `Span`. If `Span` does become a widely-used type to interface between libraries, we could simplify these conversions with a bit of compiler help. From 913f6e13bf8c997438d14738a355bbed4d186fe7 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 28 Jun 2024 11:32:05 -0700 Subject: [PATCH 39/73] whitespace fixes --- proposals/nnnn-safe-shared-contiguous-storage.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 56ae108a07..4628df1406 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -163,7 +163,7 @@ extension HypotheticalBase64Decoder { data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) in let span = Span(unsafeBytes: buffer, owner: buffer) - let decoded = myBase64Decoder.decode(span) + let decoded = myBase64Decoder.decode(span) } ``` @@ -644,7 +644,7 @@ A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homoge ```swift extension RawSpan { - /// View the memory span represented by this view as a different type + /// View the memory span represented by this view as a different type /// /// The memory must be laid out identically to the in-memory representation of `T`. 
/// @@ -953,7 +953,7 @@ public protocol ContiguousStorage: ~Copyable, ~Escapable { Two issues prevent us from proposing it at this time: (a) the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427], and (b) we cannot declare a `_read` accessor as a protocol requirement, since `_read` is not considered stable. -Many of the standard library collections could conform to `ContiguousStorage`, but we would not include the `Unsafe{Mutable,Raw}BufferPointer` types among them. Conversion of these will continue to use the explicit `Span(unsafe{Bytes,Elements,Start}:)` initializers. +Many of the standard library collections could conform to `ContiguousStorage`, but we would not include the `Unsafe{Mutable,Raw}BufferPointer` types among them. Conversion of these will continue to use the explicit `Span(unsafe{Bytes,Elements,Start}:)` initializers. #### Syntactic Sugar for Automatic Conversions From ba482d9e690c1eb84ecf5c24b71d77027ae52ab1 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 30 Jun 2024 01:45:05 -0700 Subject: [PATCH 40/73] =?UTF-8?q?Change=20some=20uses=20of=20the=20word=20?= =?UTF-8?q?=E2=80=9Cview=E2=80=9D=20to=20=E2=80=9Cspan=E2=80=9D=20instead?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 4628df1406..4f4c469e3c 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -55,7 +55,7 @@ let span: Span = array.storage ## Proposed solution -`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. 
A view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed during the lifetime of the view. `Span`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. +`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. `Span`view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed while the `Span` exists. `Span`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. By relying on borrowing, `Span` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) @@ -89,7 +89,7 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` -Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. 
In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, a sub-span representing a subrange of its elements _must_ have the same lifetime as the view from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. +Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, a sub-span representing a subrange of its elements _must_ have the same lifetime as the `Span` from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. 
`Span`s representing subsets of consecutive elements can be extracted out of a larger `Span` with an API similar to the recently added `extracting()` functions of `UnsafeBufferPointer`: From 9370b134f28678da8ddc9c7b9a3e46cbff6cb1eb Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 30 Jun 2024 02:33:50 -0700 Subject: [PATCH 41/73] fix misspelling --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 4f4c469e3c..b37047ad8e 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -63,7 +63,7 @@ By relying on borrowing, `Span` can provide simultaneous access to a non-copyabl ### `RawSpan` -`RawSpan` allows sharing the contiguous internal representation for values which may be heterogenously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straight-forwards understanding of performance for library code. +`RawSpan` allows sharing the contiguous internal representation for values which may be heterogenously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straightforward understanding of performance in library code. `Span` can always be converted to `RawSpan`, using a conditionally-available property or a constructor. 
From 572a23608de8946e4dd1c09dd509c52f4100a4c4 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 30 Jun 2024 02:34:23 -0700 Subject: [PATCH 42/73] add missing doc-comment paragraph --- proposals/nnnn-safe-shared-contiguous-storage.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index b37047ad8e..dd61568fc2 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -339,6 +339,10 @@ extension Span where Element: ~Copyable & ~Escapable { /// If the maximum length exceeds the length of this span, /// the result contains all the elements. /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not share their indices with the + /// span from which they are extracted. + /// /// - Parameter maxLength: The maximum number of elements to return. /// `maxLength` must be greater than or equal to zero. /// - Returns: A span with at most `maxLength` elements. @@ -349,6 +353,10 @@ extension Span where Element: ~Copyable & ~Escapable { /// If the number of elements to drop exceeds the number of elements in /// the span, the result is an empty span. /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not share their indices with the + /// span from which they are extracted. + /// /// - Parameter k: The number of elements to drop off the end of /// the span. `k` must be greater than or equal to zero. /// - Returns: A span leaving off the specified number of elements at the end. @@ -360,6 +368,10 @@ extension Span where Element: ~Copyable & ~Escapable { /// If the maximum length exceeds the length of this span, /// the result contains all the elements. 
/// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not share their indices with the + /// span from which they are extracted. + /// /// - Parameter maxLength: The maximum number of elements to return. /// `maxLength` must be greater than or equal to zero. /// - Returns: A span with at most `maxLength` elements. @@ -370,6 +382,10 @@ extension Span where Element: ~Copyable & ~Escapable { /// If the number of elements to drop exceeds the number of elements in /// the span, the result is an empty span. /// + /// The returned span's first item is always at offset 0; unlike buffer + /// slices, extracted spans do not share their indices with the + /// span from which they are extracted. + /// /// - Parameter k: The number of elements to drop from the beginning of /// the span. `k` must be greater than or equal to zero. /// - Returns: A span starting after the specified number of elements. From c9c312ca9e2cbd195f3f8d0faff7e86a9fe6ff59 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 30 Jun 2024 02:34:44 -0700 Subject: [PATCH 43/73] change `uncheckedBounds` to `unchecked` --- proposals/nnnn-safe-shared-contiguous-storage.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index dd61568fc2..4971ca3257 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -422,7 +422,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// this range must be within the bounds of this `Span`. /// /// - Returns: A `Span` over the items within `bounds` - public func extracting(uncheckedBounds bounds: Range) -> Self + public func extracting(unchecked bounds: Range) -> Self /// Constructs a new span over the items within the supplied range of /// positions within this span. 
@@ -437,7 +437,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// this range must be within the bounds of this `Span`. /// /// - Returns: A `Span` over the items within `bounds` - public func extracting(uncheckedBounds bounds: some RangeExpression) -> Self + public func extracting(unchecked bounds: some RangeExpression) -> Self } ``` @@ -728,12 +728,12 @@ extension RawSpan { public func extracting(_ bounds: Range) -> Self - public func extracting(uncheckedBounds bounds: Range) -> Self + public func extracting(unchecked bounds: Range) -> Self public func extracting(_ bounds: some RangeExpression) -> Self public func extracting( - uncheckedBounds bounds: some RangeExpression + unchecked bounds: some RangeExpression ) -> Self public func extracting(_: UnboundedRange) -> Self From 42170bfde35b6f76b729a824c6823c207e4ab064 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 30 Jun 2024 02:36:17 -0700 Subject: [PATCH 44/73] fix doc-comments --- proposals/nnnn-safe-shared-contiguous-storage.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 4971ca3257..1f42fc4487 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -221,7 +221,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// /// - Parameters: /// - pointer: a pointer to the first initialized element. - /// - count: the number of initialized elements in the view. + /// - count: the number of initialized elements in the span. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. public init( @@ -244,7 +244,6 @@ extension Span where Element: BitwiseCopyable { /// /// - Parameters: /// - unsafeBytes: a buffer to initialized elements. - /// - type: the type to use when interpreting the bytes in memory. 
/// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. public init( @@ -263,8 +262,7 @@ extension Span where Element: BitwiseCopyable { /// /// - Parameters: /// - unsafeRawPointer: a pointer to the first initialized element. - /// - type: the type to use when interpreting the bytes in memory. - /// - count: the number of initialized elements in the view. + /// - byteCount: the number of initialized bytes in the span. /// - owner: a binding whose lifetime must exceed that of /// the newly created `Span`. public init( @@ -557,8 +555,8 @@ extension RawSpan { /// meaning that as long as `owner` is alive the memory will remain valid. /// /// - Parameters: - /// - pointer: a pointer to the first initialized element. - /// - count: the number of initialized elements in the view. + /// - pointer: a pointer to the first initialized byte. + /// - byteCount: the number of initialized bytes in the span. /// - owner: a binding whose lifetime must exceed that of /// the newly created `RawSpan`. public init( From 1319b1d3d55cdc3f959057e0e72444a92d015549 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Sun, 30 Jun 2024 02:54:19 -0700 Subject: [PATCH 45/73] rework `load` and company --- .../nnnn-safe-shared-contiguous-storage.md | 45 +++++++++++++------ 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 1f42fc4487..c7c1448341 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -589,6 +589,9 @@ extension RawSpan { /// accessing `T` and initialized to `T` or another type that is layout /// compatible with `T`. /// + /// This is an unsafe operation. Failure to meet the preconditions + /// above may produce an invalid value of `T`. + /// /// - Parameters: /// - offset: The offset from this pointer, in bytes. `offset` must be /// nonnegative. The default is zero. 
@@ -596,7 +599,7 @@ extension RawSpan { /// - Returns: A new instance of type `T`, read from the raw bytes at /// `offset`. The returned instance is memory-managed and unassociated /// with the value in the memory referenced by this pointer. - public func load( + public func unsafeLoad( fromByteOffset offset: Int = 0, as: T.Type ) -> T @@ -607,8 +610,9 @@ extension RawSpan { /// accessing `T` and initialized to `T` or another type that is layout /// compatible with `T`. /// - /// This function does not validate the bounds of the memory access; - /// this is an unsafe operation. + /// This is an unsafe operation. This function does not validate the bounds + /// of the memory access, and failure to meet the preconditions + /// above may produce an invalid value of `T`. /// /// - Parameters: /// - offset: The offset from this pointer, in bytes. `offset` must be @@ -617,13 +621,19 @@ extension RawSpan { /// - Returns: A new instance of type `T`, read from the raw bytes at /// `offset`. The returned instance is memory-managed and unassociated /// with the value in the memory referenced by this pointer. - public func load( + public func unsafeLoad( fromUncheckedByteOffset offset: Int, as: T.Type ) -> T /// Returns a new instance of the given type, constructed from the raw memory /// at the specified offset. /// + /// The memory at this pointer plus `offset` must be initialized to `T` + /// or another type that is layout compatible with `T`. + /// + /// This is an unsafe operation. Failure to meet the preconditions + /// above may produce an invalid value of `T`. + /// /// - Parameters: /// - offset: The offset from this pointer, in bytes. `offset` must be /// nonnegative. The default is zero. @@ -631,15 +641,19 @@ extension RawSpan { /// - Returns: A new instance of type `T`, read from the raw bytes at /// `offset`. The returned instance isn't associated /// with the value in the range of memory referenced by this pointer. 
- public func loadUnaligned( + public func unsafeLoadUnaligned( fromByteOffset offset: Int = 0, as: T.Type ) -> T /// Returns a new instance of the given type, constructed from the raw memory /// at the specified offset. /// - /// This function does not validate the bounds of the memory access; - /// this is an unsafe operation. + /// The memory at this pointer plus `offset` must be initialized to `T` + /// or another type that is layout compatible with `T`. + /// + /// This is an unsafe operation. This function does not validate the bounds + /// of the memory access, and failure to meet the preconditions + /// above may produce an invalid value of `T`. /// /// - Parameters: /// - offset: The offset from this pointer, in bytes. `offset` must be @@ -648,7 +662,7 @@ extension RawSpan { /// - Returns: A new instance of type `T`, read from the raw bytes at /// `offset`. The returned instance isn't associated /// with the value in the range of memory referenced by this pointer. - public func loadUnaligned( + public func unsafeLoadUnaligned( fromUncheckedByteOffset offset: Int, as: T.Type ) -> T } @@ -658,14 +672,19 @@ A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homoge ```swift extension RawSpan { - /// View the memory span represented by this view as a different type + /// View the bytes of this span as type `T` + /// + /// This is the equivalent of `unsafeBitCast(_:to:)`. The + /// underlying bytes must be initialized as type `T`, or be + /// initialized to a type that is layout-compatible with `T`. /// - /// The memory must be laid out identically to the in-memory representation of `T`. + /// This is an unsafe operation. Failure to meet the preconditions + /// above may produce invalid values of `T`. /// /// - Parameters: - /// - type: The type you wish to view the memory as - /// - Returns: A new `Span` over elements of type `T` - public func view(as: T.Type) -> Span + /// - type: The type as which to view the bytes of this span. 
+ /// - Returns: A typed span viewing these bytes as instances of `T`. + public func unsafeView(as type: T.Type) -> Span } ``` From 66bcb1944ca53d3d9ee3db729cab6cecadf17dcb Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 1 Jul 2024 09:14:07 -0700 Subject: [PATCH 46/73] add the `SurjectiveBitPattern` future direction --- .../nnnn-safe-shared-contiguous-storage.md | 43 +++++++++++-------- 1 file changed, 25 insertions(+), 18 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index c7c1448341..866b6e442e 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -576,9 +576,9 @@ extension RawSpan { } ``` -##### Accessing the memory of a `RawSpan`: +##### Accessing the memory of a `RawSpan`: -The basic operations to access the contents of the memory underlying a `RawSpan` are `load(as:)` and `loadUnaligned(as:)`. +`RawSpan` has basic operations to access the contents of its memory: `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. These operations are not type-safe, in that the loaded value returned by the operation can be invalid. Some types have a property that makes this operation safe, but there is no [formal identification](#SurjectiveBitPattern) for these types at this time. ```swift extension RawSpan { @@ -606,12 +606,10 @@ extension RawSpan { /// Returns a new instance of the given type, constructed from the raw memory /// at the specified offset. /// - /// The memory at this pointer plus `offset` must be properly aligned for - /// accessing `T` and initialized to `T` or another type that is layout - /// compatible with `T`. + /// The memory at this pointer plus `offset` must be initialized to `T` + /// or another type that is layout compatible with `T`. /// - /// This is an unsafe operation. 
This function does not validate the bounds - /// of the memory access, and failure to meet the preconditions + /// This is an unsafe operation. Failure to meet the preconditions /// above may produce an invalid value of `T`. /// /// - Parameters: @@ -619,19 +617,24 @@ extension RawSpan { /// nonnegative. The default is zero. /// - type: The type of the instance to create. /// - Returns: A new instance of type `T`, read from the raw bytes at - /// `offset`. The returned instance is memory-managed and unassociated - /// with the value in the memory referenced by this pointer. - public func unsafeLoad( - fromUncheckedByteOffset offset: Int, as: T.Type + /// `offset`. The returned instance isn't associated + /// with the value in the range of memory referenced by this pointer. + public func unsafeLoadUnaligned( + fromByteOffset offset: Int = 0, as: T.Type ) -> T +``` +These functions have the following counterparts which omit bounds-checking for cases where redundant checks affect performance: +```swift /// Returns a new instance of the given type, constructed from the raw memory /// at the specified offset. /// - /// The memory at this pointer plus `offset` must be initialized to `T` - /// or another type that is layout compatible with `T`. + /// The memory at this pointer plus `offset` must be properly aligned for + /// accessing `T` and initialized to `T` or another type that is layout + /// compatible with `T`. /// - /// This is an unsafe operation. Failure to meet the preconditions + /// This is an unsafe operation. This function does not validate the bounds + /// of the memory access, and failure to meet the preconditions /// above may produce an invalid value of `T`. /// /// - Parameters: @@ -639,10 +642,10 @@ extension RawSpan { /// nonnegative. The default is zero. /// - type: The type of the instance to create. /// - Returns: A new instance of type `T`, read from the raw bytes at - /// `offset`. 
The returned instance isn't associated - /// with the value in the range of memory referenced by this pointer. - public func unsafeLoadUnaligned( - fromByteOffset offset: Int = 0, as: T.Type + /// `offset`. The returned instance is memory-managed and unassociated + /// with the value in the memory referenced by this pointer. + public func unsafeLoad( + fromUncheckedByteOffset offset: Int, as: T.Type ) -> T /// Returns a new instance of the given type, constructed from the raw memory @@ -801,6 +804,10 @@ This is discussed more fully in the [indexing appendix](#Indexing) below. This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for a type to provide borrowing access to its internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. +#### Layout constraint for surjective mapping of bit patterns + +We could add a layout constraint refining`BitwiseCopyable`, specifically for types whose mapping from bit pattern to values is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function). Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) + #### Byte parsing helpers A handful of helper API can make `RawSpan` better suited for binary parsers and decoders. 
From f84aefc790ded7e3b3fa50a957a8c78c38229df5 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 1 Jul 2024 12:15:57 -0700 Subject: [PATCH 47/73] more about `SurjectiveBitPattern`, plus an alternative --- proposals/nnnn-safe-shared-contiguous-storage.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 866b6e442e..f2d5e11014 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -804,9 +804,11 @@ This is discussed more fully in the [indexing appendix](#Indexing) below. This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for a type to provide borrowing access to its internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. -#### Layout constraint for surjective mapping of bit patterns +#### Layout constraint for surjective maps of bit patterns -We could add a layout constraint refining`BitwiseCopyable`, specifically for types whose mapping from bit pattern to values is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function). Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) +We could add a layout constraint, `SurjectiveBitPattern`, refining `BitwiseCopyable`, specifically for types whose mapping from bit pattern to values is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function). Such types would be safe to [load](#Load) from `RawSpan` instances. 
1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) + +An alternative to a layout constraint is to add a type validation step to ensure that a loaded bit pattern has resulted in an instance in which all relevant invariants are respected. This alternative would be more flexible, but may have a higher runtime cost. #### Byte parsing helpers From 4b13bcd43f14aaa1421a25292cde61bb5a752a32 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 1 Jul 2024 12:16:44 -0700 Subject: [PATCH 48/73] move reference to SE-0256 to the ContiguousStorage item --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index f2d5e11014..20c617b21e 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -19,12 +19,10 @@ This proposal is related to two other features being proposed along with it: [No - [SE-0426] BitwiseCopyable - [SE-0427] Noncopyable generics - [SE-0377] `borrowing` and `consuming` parameter ownership modifiers -- [SE-0256] `{Mutable}ContiguousCollection` protocol (rejected, superseded by this proposal) [SE-0426]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0426-bitwise-copyable.md [SE-0427]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md [SE-0377]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0377-parameter-ownership-modifiers.md -[SE-0256]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md ## Motivation @@ -974,7 +972,7 @@ Even if `Span` were to replace all uses of a constant `Data` in API, something l #### A `ContiguousStorage` protocol -An earlier version of this proposal proposed a `ContiguousStorage` protocol by 
which a type could indicate that it can provide a `Span`. `ContiguousStorage` would form a bridge between generically-typed interfaces and a performant concrete implementation. +An earlier version of this proposal proposed a `ContiguousStorage` protocol by which a type could indicate that it can provide a `Span`. `ContiguousStorage` would form a bridge between generically-typed interfaces and a performant concrete implementation. It would supersede the rejected [SE-0256](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). For example, for the hypothetical base64 decoding library mentioned in the [motivation](#Motivation) section, a possible API could be: From 7a8857119caa38f6a6232ea768413bc2dc828eaa Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 3 Jul 2024 14:38:15 -0700 Subject: [PATCH 49/73] reword coroutine accessors --- proposals/nnnn-safe-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 20c617b21e..854315cff8 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -800,7 +800,7 @@ This is discussed more fully in the [indexing appendix](#Indexing) below. #### Coroutine Accessors -This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for a type to provide borrowing access to its internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. +This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. 
`_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. #### Layout constraint for surjective maps of bit patterns From a183439d75c20065dd04807f597ae74e6f55e425 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 16 Jul 2024 14:15:44 -0700 Subject: [PATCH 50/73] remove undesirable annotations and default values --- proposals/nnnn-safe-shared-contiguous-storage.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 854315cff8..24cf233a59 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -342,7 +342,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Parameter maxLength: The maximum number of elements to return. /// `maxLength` must be greater than or equal to zero. /// - Returns: A span with at most `maxLength` elements. - borrowing public func extracting(first maxLength: Int) -> Self + public func extracting(first maxLength: Int) -> Self /// Returns a span over all but the given number of trailing elements. /// @@ -356,7 +356,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Parameter k: The number of elements to drop off the end of /// the span. `k` must be greater than or equal to zero. /// - Returns: A span leaving off the specified number of elements at the end. - borrowing public func extracting(droppingLast k: Int) -> Self + public func extracting(droppingLast k: Int) -> Self /// Returns a span containing the final elements of the span, /// up to the given maximum length. 
@@ -371,7 +371,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Parameter maxLength: The maximum number of elements to return. /// `maxLength` must be greater than or equal to zero. /// - Returns: A span with at most `maxLength` elements. - borrowing public func extracting(last maxLength: Int) -> Self + public func extracting(last maxLength: Int) -> Self /// Returns a span over all but the given number of initial elements. /// @@ -385,7 +385,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Parameter k: The number of elements to drop from the beginning of /// the span. `k` must be greater than or equal to zero. /// - Returns: A span starting after the specified number of elements. - borrowing public func extracting(droppingFirst k: Int = 1) -> Self + public func extracting(droppingFirst k: Int) -> Self } ``` @@ -764,7 +764,7 @@ extension RawSpan { public func extracting(last maxLength: Int) -> Self - public func extracting(droppingFirst k: Int = 1) -> Self + public func extracting(droppingFirst k: Int) -> Self } ``` From 0a126190de6bbad52abb256df2d34f2cadc51559 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 16 Jul 2024 11:14:07 -0700 Subject: [PATCH 51/73] add containment utilities --- .../nnnn-safe-shared-contiguous-storage.md | 40 ++++++++++++++++++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 24cf233a59..a020107a37 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -389,6 +389,32 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` +##### Identifying whether a span is a subrange of another: + +When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. 
We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: + +```swift +extension Span where Element: ~Copyable & ~Escapable { + /// Returns true if the memory represented by `span` is a subrange of + /// the memory represented by `self` + /// + /// Parameters: + /// - span: a span of the same type as `self` + /// Returns: whether `span` is a subrange of `self` + public func contains(_ span: borrowing Self) -> Bool + + /// Returns the offsets where the memory of `span` is located within + /// the memory represented by `self` + /// + /// Note: `span` must be a subrange of `self` + /// + /// Parameters: + /// - span: a subrange of `self` + /// Returns: A range of offsets within `self` + public func offsets(of span: borrowing Self) -> Range +} +``` + ##### Unchecked access to elements and subranges of elements: The `subscript` and the `extracting()` functions mentioned above all have always-on bounds checking of their parameters, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly. @@ -739,7 +765,7 @@ extension RawSpan { ##### Accessing subranges of elements: -Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports the same set of `extracting()` functions as `Span`. The documentation is omitted here, as it is substantially the same as for `Span`: +Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports the same set of `extracting()` functions as `Span`. 
The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: ```swift extension RawSpan { @@ -768,6 +794,18 @@ extension RawSpan { } ``` +##### Identifying whether a span is a subrange of another: + +When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span. The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: + +```swift +extension RawSpan { + public func contains(_ span: borrowing Self) -> Bool + + public func offsets(of span: borrowing Self) -> Range +} +``` + ## Source compatibility This proposal is additive and source-compatible with existing code. From 3c9ef5125c805058e17ad44a18fb01dd987902d0 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 16 Jul 2024 18:22:56 -0700 Subject: [PATCH 52/73] Apply suggestions from code review Thanks to @benrimmington's eagle eyes Co-authored-by: Ben Rimmington --- .../nnnn-safe-shared-contiguous-storage.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index a020107a37..b8b6a0fd5e 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -11,11 +11,11 @@ ## Introduction -We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without comprimising on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. +We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. 
It will expand the expressivity of performant Swift code without compromising on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, without compromising Swift's memory safety. -This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/apple/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/swiftlang/swift-evolution/pull/2305), as well as related to the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. This proposal is also related to the following proposals: +This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/swiftlang/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/swiftlang/swift-evolution/pull/2305), as well as related to the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. This proposal is also related to the following proposals: - [SE-0426] BitwiseCopyable - [SE-0427] Noncopyable generics - [SE-0377] `borrowing` and `consuming` parameter ownership modifiers @@ -61,7 +61,7 @@ By relying on borrowing, `Span` can provide simultaneous access to a non-copyabl ### `RawSpan` -`RawSpan` allows sharing the contiguous internal representation for values which may be heterogenously-typed, such as in decoders. 
Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straightforward understanding of performance in library code. +`RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straightforward understanding of performance in library code. `Span` can always be converted to `RawSpan`, using a conditionally-available property or a constructor. @@ -395,7 +395,7 @@ When working with multiple `Span` instances, it is often desirable to know whet ```swift extension Span where Element: ~Copyable & ~Escapable { - /// Returns true if the memory represented by `span` is a subrange of + /// Returns true if the memory represented by `span` is a subrange of /// the memory represented by `self` /// /// Parameters: @@ -403,7 +403,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// Returns: whether `span` is a subrange of `self` public func contains(_ span: borrowing Self) -> Bool - /// Returns the offsets where the memory of `span` is located within + /// Returns the offsets where the memory of `span` is located within /// the memory represented by `self` /// /// Note: `span` must be a subrange of `self` @@ -543,7 +543,7 @@ extension Span where Element: BitwiseCopyable { ### RawSpan -In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogenously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that intends to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports extracting sub-spans, along with the operations `load(as:)` and `loadUnaligned(as:)`. 
+In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that intends to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports extracting sub-spans, along with the operations `load(as:)` and `loadUnaligned(as:)`. #### Complete `RawSpan` API: @@ -695,7 +695,7 @@ These functions have the following counterparts which omit bounds-checking for c } ``` -A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogenously as instances of `T`. +A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogeneously as instances of `T`. ```swift extension RawSpan { @@ -1052,7 +1052,7 @@ myStrnlen(array) // 8 This would probably consist of a new type of custom conversion in the language. A type author would provide a way to convert from their type to an owned `Span`, and the compiler would insert that conversion where needed. This would enhance readability and reduce boilerplate. -#### Interopability with C++'s `std::span` and with llvm's `-fbounds-safety` +#### Interoperability with C++'s `std::span` and with llvm's `-fbounds-safety` The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class template from the C++ standard library is a similar representation of a contiguous range of memory. LLVM may soon have a [bounds-checking mode](https://discourse.llvm.org/t/70854) for C. These are opportunities for better, safer interoperation with Swift, via a type such as `Span`. @@ -1076,7 +1076,7 @@ Each of these introduces practical tradeoffs in the design. Collections which own their storage have the convention of separate slice types, such as `Array` and `String`. 
This has the advantage of clearly delineating storage ownership in the programming model and the disadvantage of introducing a second type through which to interact. -When types do not own their storage, separate slice types can be [cumbersome](https://github.com/apple/swift/blob/bcd08c0c9a74974b4757b4b8a2d1796659b1d940/stdlib/public/core/StringComparison.swift#L175). The reason `UnsafeBufferPointer` has a separate slice type is because it wants to allow indices to be reused across slices and its `Index` is a relative offset from the start (`Int`) rather than an absolute position (such as a pointer). +When types do not own their storage, separate slice types can be [cumbersome](https://github.com/swiftlang/swift/blob/swift-5.10.1-RELEASE/stdlib/public/core/StringComparison.swift#L175). The reason `UnsafeBufferPointer` has a separate slice type is because it wants to allow indices to be reused across slices and its `Index` is a relative offset from the start (`Int`) rather than an absolute position (such as a pointer). `Span` does not own its storage and there is no concern about leaking larger allocations. It would benefit from being its own slice type. @@ -1181,7 +1181,7 @@ When the reused allocation happens to be stride-aligned, there is no undefined b Bounds checks protect against critical programmer errors. It would be nice, pending engineering tradeoffs, to also protect against some reuse after free errors and invalid index reuse, especially those that may lead to undefined behavior. -Future improvements to microarchitecture may make reuse after free checks cheaper, however we need something for the forseeable future. Any validation we can do reduces the need to switch to other mitigation strategies or make other tradeoffs. +Future improvements to microarchitecture may make reuse after free checks cheaper, however we need something for the foreseeable future. 
Any validation we can do reduces the need to switch to other mitigation strategies or make other tradeoffs. #### Design approaches for indices From 4496c542b117d051a9e064a4e6d0600dbf05592c Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 14 Aug 2024 11:43:29 -0700 Subject: [PATCH 53/73] remove extension to `Character.UTF8View` - this is redundant, since it is the same as `String.UTF8View` --- proposals/nnnn-safe-shared-contiguous-storage.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index b8b6a0fd5e..048f5e96dd 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -134,10 +134,6 @@ extension Substring.UTF8View { // note: this could borrow a temporary copy of the `Substring`'s storage var storage: Span { _read } } -extension Character.UTF8View { - // note: this could borrow a temporary copy of the `Character`'s storage - var storage: Span { _read } -} extension SIMD { var storage: Span { _read } From 66a78f8caf24a328b36bf30bb8a7d41d5f264c2d Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 15 Aug 2024 18:01:47 -0700 Subject: [PATCH 54/73] add closure-taking api, move initializers to future --- .../nnnn-safe-shared-contiguous-storage.md | 265 +++++++++--------- 1 file changed, 129 insertions(+), 136 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 048f5e96dd..897b46b900 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -13,16 +13,21 @@ We introduce `Span`, an abstraction for container-agnostic access to contiguous memory. It will expand the expressivity of performant Swift code without compromising on the memory safety properties we rely on: temporal safety, spatial safety, definite initialization and type safety. 
-In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to create a similar idiom in Swift, without compromising Swift's memory safety. +In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to enable a similar idiom in Swift, without compromising Swift's memory safety. + +This proposal is related to two compiler features: [Nonescapable types](https://github.com/swiftlang/swift-evolution/pull/2304) (`~Escapable`), being proposed alongside, and [Compile-time Lifetime Dependency Annotations][PR-2305], which will be proposed in the following weeks. A precursor to this proposal was the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. This proposal is also related to the following proposals: -This proposal is related to two other features being proposed along with it: [Nonescapable types](https://github.com/swiftlang/swift-evolution/pull/2304) (`~Escapable`) and [Compile-time Lifetime Dependency Annotations](https://github.com/swiftlang/swift-evolution/pull/2305), as well as related to the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. 
This proposal is also related to the following proposals: - [SE-0426] BitwiseCopyable - [SE-0427] Noncopyable generics +- [SE-0437] Non-copyable Standard Library Primitives - [SE-0377] `borrowing` and `consuming` parameter ownership modifiers [SE-0426]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0426-bitwise-copyable.md [SE-0427]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md +[SE-0437]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0437-noncopyable-stdlib-primitives.md [SE-0377]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0377-parameter-ownership-modifiers.md +[PR-2305]: https://github.com/swiftlang/swift-evolution/pull/2305 +[PR-2305-pitch]: https://forums.swift.org/t/69865 ## Motivation @@ -34,40 +39,25 @@ The `UnsafeBufferPointer` passed to a `withUnsafeXXX` closure-style API, while p 2. `subscript` is only bounds-checked in debug builds of client code 3. It might escape the duration of the closure -Even if the body of the `withUnsafeXXX` call does not escape the pointer, other functions called inside the closure have to be written in terms of unsafe pointers. This requires programmer vigilance across a project and potentially spreads the use of unsafe types, even when it could have been written in terms of safe constructs. - -We want to take advantage of the features of non-escapable types to replace some closure-taking API with simple properties, resulting in more composable code: - -```swift -let array = Array("Hello\0".utf8) - -// Old -array.withUnsafeBufferPointer { - // use `$0` here for direct memory access -} - -// New -let span: Span = array.storage -// use `span` in the same scope as `array` for direct memory access -``` +Even if the body of the `withUnsafeXXX` call does not escape the pointer, other functions called within the closure have to be written in terms of unsafe pointers. 
This requires programmer vigilance across a project and potentially spreads the use of unsafe types, even if the helper functions could have been written in terms of safe constructs. ## Proposed solution -`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. `Span`view does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed while the `Span` exists. `Span`'s lifetime is statically enforced as a lifetime dependency to a binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. +`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. `Span` does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed while the `Span` exists. In this first proposal, `Span`s will be constrained to closures from which they structurally cannot escape. Later, we will introduce a lifetime dependency between the `Span` and the binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. -By relying on borrowing, `Span` can provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. 
Note that `Span` is not a replacement for a copyable container with owned storage; see the future directions for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) +A `Span` provided by container represents a borrow of that container. `Span` can therefore provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see [future directions](#Directions) for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) -`Span` is the currency type for local processing over values in contiguous memory. It is the replacement for any API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that does not need to escape the value. +`Span` is the currency type for local processing over values in contiguous memory. It is a replacement for many API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that do not need to escape the owning container. ### `RawSpan` `RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straightforward understanding of performance in library code. -`Span` can always be converted to `RawSpan`, using a conditionally-available property or a constructor. +`Span` can always be converted to `RawSpan`, using a conditionally-available property. ## Detailed design -`Span` is a simple representation of a span of initialized memory. +`Span` is a simple representation of a region of initialized memory. ```swift public struct Span: Copyable, ~Escapable { @@ -87,9 +77,9 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` -Note that `Span` does _not_ conform to `Collection`. 
This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability for itself as well as its elements. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, a sub-span representing a subrange of its elements _must_ have the same lifetime as the `Span` from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. +Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability of the `Collection` itself as well as of its element type. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, a sub-span representing a subrange of its elements _must_ have the same lifetime as the `Span` from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. -`Span`s representing subsets of consecutive elements can be extracted out of a larger `Span` with an API similar to the recently added `extracting()` functions of `UnsafeBufferPointer`: +`Span`s representing subsets of consecutive elements can be extracted out of a larger `Span` with an API similar to the `extracting()` functions recently added to `UnsafeBufferPointer` in support of non-copyable elements: ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -97,7 +87,7 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` -The first element of a given span is _always_ at position zero, and its last it as position `count-1`.
+The first element of a given span is always at position zero, and its last element is always at position `count-1`. As a side-effect of not conforming to `Collection` or `Sequence`, `Span` is not directly supported by `for` loops at this time. It is, however, easy to use in a `for` loop via indexing: @@ -109,59 +99,129 @@ for i in 0.. { _read } + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable } + extension ArraySlice { - // note: this could borrow a temporary copy of the `ArraySlice`'s storage - var storage: Span { _read } + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable } + extension ContiguousArray { - var storage: Span { get } + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable } extension Foundation.Data { - var storage: Span { get } + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result } extension String.UTF8View { - // note: this could borrow a temporary copy of the `String`'s storage - var storage: Span { _read } + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result } + extension Substring.UTF8View { - // note: this could borrow a temporary copy of the `Substring`'s storage - var storage: Span { _read } + public func withSpan( + _ body: (_ elements: 
Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result +} + +extension CollectionOfOne { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable } extension SIMD { - var storage: Span { _read } + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Self.Scalar: BitwiseCopyable } + extension KeyValuePairs { - var storage: Span<(Self.Key, Self.Value)> { get } -} -extension CollectionOfOne { - var storage: Span { _read } + public func withSpan( + _ body: (_ elements: Span<(key: Key, value: Value)>) throws(E) -> Result + ) throws(E) -> Result { + try Array(self).withSpan(body) + } + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable { + try Array(self).withBytes(body) + } } ``` -#### Using `Span` with C functions or other unsafe code: +In this proposal, the `withSpan()` and `withBytes()` methods are the only ways to obtain a `Span`. Initializers, required for library adoption, will be proposed alongside (lifetime annotations)[PR-2305]; for further details, see [Initializers](#Initializers) in the [future directions](#Directions) section. + +#### Using `Span` with unsafe code: -The `UnsafeBufferPointer` family of types can be be adapted for use with `Span`-taking API by using unsafe `Span` initializers. A `Span` instance obtained this way loses a static guarantee of temporal safety, because it is possible to deinitialize or deallocate the source `UnsafeMutableBufferPointer` before the end of the `Span` instance's scope. 
+The `UnsafeBufferPointer` family of types can be be adapted for use with `Span`-taking API by using unsafe `Span`-providing functions. A `Span` instance obtained this way loses its static guarantee of temporal safety, because it is possible to deinitialize or deallocate the underlying memory before the end of the closure's scope. The closure provides a clear signal to avoid doing so, but it is not enforceable by the compiler. ```swift -extension HypotheticalBase64Decoder { - public func decode(bytes: Span) -> [UInt8] +extension UnsafeBufferPointer { // as well as UnsafeMutableBufferPointer + public func withUnsafeSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withUnsafeBytes( + _ body: (_ elements: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable } -data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) in - let span = Span(unsafeBytes: buffer, owner: buffer) - let decoded = myBase64Decoder.decode(span) +extension UnsafeRawBufferPointer { // as well as UnsafeMutableRawBufferPointer + public func withUnsafeBytes( + _ body: (_ elements: RawSpan) throws(E) -> Result + ) throws(E) -> Result } ``` -`Span` has an unsafe hatch for use with functions that take an unsafe argument: +`Span` and `RawSpan` also have an unsafe hatch for use with functions that require an unsafe argument, such as C functions: ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -175,6 +235,12 @@ extension Span where Element: BitwiseCopyable { _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result ) throws(E) -> Result } + +extension RawSpan { + func withUnsafeBytes( + _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result + ) throws(E) -> Result +} ``` ### Complete `Span` API: @@ -186,87 +252,6 @@ public struct Span: Copyable, ~Escapable { } ``` -##### Creating a `Span`: - -The initialization of a `Span` instance from an unsafe pointer is an unsafe operation. 
Typically these initializers will be used internally to a container's implementation and return a borrowed `Span` tied to the container's lifetime. Safe usage relies on a guarantee that the represented storage is managed correctly and outlives the `Span` instance. - -```swift -extension Span where Element: ~Copyable & ~Escapable { - - /// Unsafely create a `Span` over initialized memory. - /// - /// The memory in `buffer` must be owned by the instance `owner`, - /// meaning that as long as `owner` is alive the memory will remain valid. - /// - /// - Parameters: - /// - buffer: an `UnsafeBufferPointer` to initialized elements. - /// - owner: a binding whose lifetime must exceed that of - /// the newly created `Span`. - public init( - unsafeElements buffer: UnsafeBufferPointer, - owner: borrowing Owner - ) - - /// Unsafely create a `Span` over initialized memory. - /// - /// The memory representing `count` instances starting at - /// `pointer` must be owned by the instance `owner`, - /// meaning that as long as `owner` is alive the memory will remain valid. - /// - /// - Parameters: - /// - pointer: a pointer to the first initialized element. - /// - count: the number of initialized elements in the span. - /// - owner: a binding whose lifetime must exceed that of - /// the newly created `Span`. - public init( - unsafeStart pointer: UnsafePointer, - count: Int, - owner: borrowing Owner - ) -} - -extension Span where Element: BitwiseCopyable { - - /// Unsafely create a `Span` over initialized memory. - /// - /// The memory in `unsafeBytes` must be owned by the instance `owner` - /// meaning that as long as `owner` is alive the memory will remain valid. - /// - /// `unsafeBytes` must be correctly aligned for accessing - /// an element of type `Element`, and must contain a number of bytes - /// that is an exact multiple of `Element`'s stride. - /// - /// - Parameters: - /// - unsafeBytes: a buffer to initialized elements. 
- /// - owner: a binding whose lifetime must exceed that of - /// the newly created `Span`. - public init( - unsafeBytes buffer: UnsafeRawBufferPointer, - owner: borrowing Owner - ) - - /// Unsafely create a `Span` over a span of initialized memory. - /// - /// The memory representing `count` instances starting at - /// `unsafeRawPointer` must be owned by the instance `owner`, - /// meaning that as long as `owner` is alive the memory will remain valid. - /// - /// `unsafeRawPointer` must be correctly aligned for accessing - /// an element of type `Element`. - /// - /// - Parameters: - /// - unsafeRawPointer: a pointer to the first initialized element. - /// - byteCount: the number of initialized bytes in the span. - /// - owner: a binding whose lifetime must exceed that of - /// the newly created `Span`. - public init( - unsafeStart pointer: UnsafeRawPointer, - byteCount: Int, - owner: borrowing Owner - ) -} -``` - ##### Basic API: The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Collection` or `RandomAccessCollection`). The only difference with their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. @@ -387,7 +372,7 @@ extension Span where Element: ~Copyable & ~Escapable { ##### Identifying whether a span is a subrange of another: -When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: +When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. 
We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -413,7 +398,7 @@ extension Span where Element: ~Copyable & ~Escapable { ##### Unchecked access to elements and subranges of elements: -The `subscript` and the `extracting()` functions mentioned above all have always-on bounds checking of their parameters, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly. +The `subscript` and the `extracting()` functions mentioned above all have always-on bounds checking of their parameters, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly, such as in tight loops: ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -493,6 +478,8 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` +Note: these function names could be improved. + ##### Interoperability with unsafe code: We provide two functions for interoperability with C or other legacy pointer-taking functions. @@ -539,7 +526,7 @@ extension Span where Element: BitwiseCopyable { ### RawSpan -In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that intends to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports extracting sub-spans, along with the operations `load(as:)` and `loadUnaligned(as:)`. +In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. 
`RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that intends to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports extracting sub-spans, along with the data loading operations `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. #### Complete `RawSpan` API: @@ -792,7 +779,7 @@ extension RawSpan { ##### Identifying whether a span is a subrange of another: -When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span. The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: +When working with multiple `RawSpan` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span. The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: ```swift extension RawSpan { @@ -832,15 +819,21 @@ This is discussed more fully in the [indexing appendix](#Indexing) below. ## Future directions +#### Initializing and returning `Span` instances + +A `Span` represents a region of memory and, as such, must be initialized using unsafe pointers. This is an unsafe operation which will typically be performed internally to a container's implementation. In order to bridge to safe code, these initializers require new annotations that indicate to the compiler how the newly-created `Span` can be used safely. + +These annotations have been [pitched][PR-2305-pitch] and are expected to be formally [proposed][PR-2305] soon. 
`Span` initializers using lifetime annotations will be proposed alongside the annotations themselves. + #### Coroutine Accessors This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. -#### Layout constraint for surjective maps of bit patterns +#### Layout constraint for safe loading of bit patterns We could add a layout constraint refining`BitwiseCopyable`, specifically for types whose mapping from bit pattern to values is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function), `SurjectiveBitPattern`. Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) -An alternative to a layout constraint is to add a type validation step to ensure that a loaded bit pattern has resulted in an instance in which all relevant invariants are respected. This alternative would be more flexible, but may have a higher runtime cost. +An alternative to a layout constraint is to add a type validation step to ensure that if a given bit pattern were to be interpreted as a value of type `T`, then all the invariants of type `T` would be respected. This alternative would be more flexible, but may have a higher runtime cost. 
#### Byte parsing helpers From 32adc87e65cbd9a4be68112ad1e3addd018916f3 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 16 Aug 2024 17:41:16 -0700 Subject: [PATCH 55/73] shrink byte-parsing helpers future direction --- .../nnnn-safe-shared-contiguous-storage.md | 71 ++++--------------- 1 file changed, 12 insertions(+), 59 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 897b46b900..27ef41d97c 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -841,39 +841,27 @@ A handful of helper API can make `RawSpan` better suited for binary parsers and ```swift extension RawSpan { - @frozen public struct Cursor: Copyable, ~Escapable { public let base: RawSpan /// The current parsing position public var position: Int - @inlinable - public init(_ base: RawSpan) - - /// Parse an instance of `T` and advance - @inlinable + /// Parse an instance of `T` and advance. + /// Returns `nil` if there are not enough bytes remaining for an instance of `T`. public mutating func parse( _ t: T.Type = T.self - ) throws(OutOfBoundsError) -> T + ) -> T? - /// Parse `numBytes`and advance - @inlinable + /// Parse `numBytes`and advance. + /// Returns `nil` if there are fewer than `numBytes` remaining. public mutating func parse( numBytes: some FixedWidthInteger - ) throws (OutOfBoundsError) -> RawSpan + ) -> RawSpan? 
/// The bytes that we've parsed so far - @inlinable public var parsedBytes: RawSpan { get } - - /// The remaining bytes left to parse - @inlinable - public var remainingBytes: RawSpan { get } } - - @inlinable - public func makeCursor() -> Cursor } ``` @@ -883,49 +871,14 @@ Alternatively, if some future `RawSpan.Iterator` were 3 words in size (start, cu ##### Example: Parsing PNG -The code snippet below parses [PNG Chunks](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks): +The code snippet below parses a [PNG Chunk](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks): ```swift -struct PNGChunk: ~Escapable { - let contents: RawSpan - - public init( - _ contents: RawSpan, _ owner: borrowing Owner - ) throws (PNGValidationError) { - self.contents = contents - try self._validate() - } - - var length: UInt32 { - contents.loadUnaligned(as: UInt32.self).bigEndian - } - var type: UInt32 { - contents.loadUnaligned( - fromUncheckedByteOffset: 4, as: UInt32.self).bigEndian - } - var data: RawSpan { - contents[uncheckedOffsets: 8..<(contents.count-4)] - } - var crc: UInt32 { - contents.loadUnaligned( - fromUncheckedByteOffset: contents.count-4, as: UInt32.self - ).bigEndian - } -} - -func parsePNGChunk( - _ span: RawSpan, - _ owner: borrowing Owner -) throws -> PNGChunk { - var cursor = span.makeCursor() - - let length = try cursor.parse(UInt32.self).bigEndian - _ = try cursor.parse(UInt32.self) // type - _ = try cursor.parse(numBytes: length) // data - _ = try cursor.parse(UInt32.self) // crc - - return PNGChunk(cursor.parsedBytes, owner) -} +// Parse a PNG chunk +let length = try cursor.parse(UInt32.self).bigEndian +let type = try cursor.parse(UInt32.self).bigEndian +let data = try cursor.parse(numBytes: length) +let crc = try cursor.parse(UInt32.self).bigEndian ``` #### Defining `BorrowingIterator` with support in `for` loops From 3d45c07dec8d300b3e5386ae8b307658ec5cd317 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 19 Aug 2024 10:22:09 -0700 Subject: 
[PATCH 56/73] formatting, text moved around --- .../nnnn-safe-shared-contiguous-storage.md | 290 +++++++++--------- 1 file changed, 141 insertions(+), 149 deletions(-) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-safe-shared-contiguous-storage.md index 27ef41d97c..6baea35ffa 100644 --- a/proposals/nnnn-safe-shared-contiguous-storage.md +++ b/proposals/nnnn-safe-shared-contiguous-storage.md @@ -1,7 +1,7 @@ # Safe Access to Contiguous Storage * Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) -* Authors: [Guillaume Lessard](https://github.com/glessard), [Andrew Trick](https://github.com/atrick), [Michael Ilseman](https://github.com/milseman) +* Authors: [Guillaume Lessard](https://github.com/glessard), [Michael Ilseman](https://github.com/milseman), [Andrew Trick](https://github.com/atrick) * Review Manager: TBD * Status: **Awaiting implementation** * Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) @@ -43,23 +43,44 @@ Even if the body of the `withUnsafeXXX` call does not escape the pointer, other ## Proposed solution +#### `Span` + `Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. `Span` does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed while the `Span` exists. In this first proposal, `Span`s will be constrained to closures from which they structurally cannot escape. Later, we will introduce a lifetime dependency between the `Span` and the binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. A `Span` provided by container represents a borrow of that container. 
`Span` can therefore provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see [future directions](#Directions) for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) `Span` is the currency type for local processing over values in contiguous memory. It is a replacement for many API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that do not need to escape the owning container. -### `RawSpan` +#### `RawSpan` + +`RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously-typed, as well as representing heterogeneously-typed input to be parsed. Since it is a fully concrete type, it can achieve great performance in debug builds of client code as well as straightforward performance in library code. -`RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously-typed, such as in decoders. Since it is a fully concrete type, it can achieve better performance in debug builds of client code as well as a more straightforward understanding of performance in library code. +A `RawSpan` can be obtained from containers of `BitwiseCopyable` elements, as well as be initialized directly from an instance of `Span`. + +#### Extensions to Standard Library and Foundation types -`Span` can always be converted to `RawSpan`, using a conditionally-available property. +The standard library and Foundation will provide `withSpan()` and `withBytes()` closure-taking functions as safe replacements for the existing `withUnsafeBufferPointer()` and `withUnsafeBytes()` functions.
For example, `Array` will be extended as follows: + +```swift +extension Array { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable +} +``` + +The full list of standard library and Foundation types to be extended in this manner is found in the [detailed design](#Design) section. ## Detailed design `Span` is a simple representation of a region of initialized memory. ```swift +@frozen public struct Span: Copyable, ~Escapable { internal var _start: UnsafePointer internal var _count: Int @@ -73,7 +94,7 @@ extension Span where Element: ~Copyable & ~Escapable { public var count: Int { get } public var isEmpty: Bool { get } - subscript(_ position: Int) -> Element { _read } + public subscript(_ position: Int) -> Element { _read } } ``` @@ -97,108 +118,6 @@ for i in 0..( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension ArraySlice { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension ContiguousArray { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension Foundation.Data { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension String.UTF8View { - public func withSpan( - _ body: (_
elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension Substring.UTF8View { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension CollectionOfOne { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension SIMD { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Self.Scalar: BitwiseCopyable -} - -extension KeyValuePairs { - public func withSpan( - _ body: (_ elements: Span<(key: Key, value: Value)>) throws(E) -> Result - ) throws(E) -> Result { - try Array(self).withSpan(body) - } - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable { - try Array(self).withBytes(body) - } -} -``` - -In this proposal, the `withSpan()` and `withBytes()` methods are the only ways to obtain a `Span`. Initializers, required for library adoption, will be proposed alongside (lifetime annotations)[PR-2305]; for further details, see [Initializers](#Initializers) in the [future directions](#Directions) section. - #### Using `Span` with unsafe code: The `UnsafeBufferPointer` family of types can be be adapted for use with `Span`-taking API by using unsafe `Span`-providing functions. A `Span` instance obtained this way loses its static guarantee of temporal safety, because it is possible to deinitialize or deallocate the underlying memory before the end of the closure's scope. 
The closure provides a clear signal to avoid doing so, but it is not enforceable by the compiler. @@ -252,6 +171,8 @@ public struct Span: Copyable, ~Escapable { } ``` +Initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for further details, see "[Initializers](#Initializers)" in the [future directions](#Directions) section. + ##### Basic API: The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Collection` or `RandomAccessCollection`). The only difference with their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. @@ -461,7 +382,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// /// - Parameters: /// - position: an index to validate - public func assertValidity(_ offset: Int) + public func ensureValidity(_ offset: Int) /// Return true if `offsets` is a valid range of offsets into this `Span` /// @@ -474,7 +395,7 @@ extension Span where Element: ~Copyable & ~Escapable { /// /// - Parameters: /// - offsets: a range of indices to validate - public func assertValidity(_ offsets: Range) + public func ensureValidity(_ offsets: Range) } ``` @@ -537,52 +458,21 @@ public struct RawSpan: Copyable, ~Escapable { } ``` -##### Initializing a `RawSpan`: +##### Converting a `Span` to a `RawSpan`: ```swift extension RawSpan { - /// Unsafely create a `RawSpan` over initialized memory. - /// - /// The memory in `buffer` must be owned by the instance `owner`, - /// meaning that as long as `owner` is alive the memory will remain valid. - /// - /// - Parameters: - /// - buffer: an `UnsafeRawBufferPointer` to initialized memory. - /// - owner: a binding whose lifetime must exceed that of - /// the newly created `RawSpan`. 
- public init( - unsafeBytes buffer: UnsafeRawBufferPointer, - owner: borrowing Owner - ) - - /// Unsafely create a `RawSpan` over initialized memory. - /// - /// The memory over `count` bytes starting at - /// `pointer` must be owned by the instance `owner`, - /// meaning that as long as `owner` is alive the memory will remain valid. - /// - /// - Parameters: - /// - pointer: a pointer to the first initialized byte. - /// - byteCount: the number of initialized bytes in the span. - /// - owner: a binding whose lifetime must exceed that of - /// the newly created `RawSpan`. - public init( - unsafeStart pointer: UnsafeRawPointer, - byteCount: Int, - owner: borrowing Owner - ) - /// Create a `RawSpan` over the memory represented by a `Span` /// /// - Parameters: /// - span: An existing `Span`, which will define both this /// `RawSpan`'s lifetime and the memory it represents. - public init( - _ span: borrowing Span - ) + public init(_ span: borrowing Span) } ``` +Other initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for further details, see "[Initializers](#Initializers)" in the [future directions](#Directions) section. + ##### Accessing the memory of a `RawSpan`: `RawSpan` has basic operations to access the contents of its memory: `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. These operations are not type-safe, in that the loaded value returned by the operation can be invalid. Some types have a property that makes this operation safe, but there is no [formal identification](#SurjectiveBitPattern) for these types at this time. 
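To make the type-safety caveat above concrete, here is a minimal sketch. It assumes nothing about the final `RawSpan` API: since `RawSpan` is not yet available, it uses the existing `UnsafeRawBufferPointer` operations `load(fromByteOffset:as:)` and `loadUnaligned(fromByteOffset:as:)` as stand-ins. The `Record` type and the `parseRecord` helper are hypothetical names introduced only for illustration.

```swift
// Illustrative only: `RawSpan` is not available yet, so this sketch relies on
// the existing `UnsafeRawBufferPointer` loads as stand-ins.

/// A hypothetical 5-byte wire record: a 1-byte tag, then a 4-byte
/// little-endian payload.
struct Record: Equatable {
  var tag: UInt8
  var payload: UInt32
}

/// Parses a `Record` from raw bytes, or returns `nil` if there are too few.
/// (Hypothetical helper, not part of the proposal.)
func parseRecord(_ bytes: [UInt8]) -> Record? {
  // Manual bounds check; a bounds-checked span type would do this for us.
  guard bytes.count >= 5 else { return nil }
  return bytes.withUnsafeBytes { raw -> Record in
    // Every bit pattern is a valid `UInt8` or `UInt32`, so these particular
    // loads cannot produce an invalid value. That is not true of every type,
    // which is why the operation is unsafe in general.
    let tag = raw.load(fromByteOffset: 0, as: UInt8.self)
    // Offset 1 is not 4-byte aligned, hence the unaligned load.
    let rawPayload = raw.loadUnaligned(fromByteOffset: 1, as: UInt32.self)
    return Record(tag: tag, payload: UInt32(littleEndian: rawPayload))
  }
}
```

The loads here happen to be harmless because both target types accept any bit pattern; the same call with a type that has invalid bit patterns would compile just as readily, which is the gap a formal identification of such types would close.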
@@ -736,13 +626,13 @@ extension RawSpan { public func validateBounds(_ offset: Int) -> Bool /// Traps if `offset` is not a valid offset into this `RawSpan` - public func assertValidity(_ offset: Int) + public func ensureValidity(_ offset: Int) /// Return true if `offsets` is a valid range of offsets into this `RawSpan` public func validateBounds(_ offsets: Range) -> Bool /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` - public func assertValidity(_ offsets: Range) + public func ensureValidity(_ offsets: Range) } ``` @@ -789,6 +679,106 @@ extension RawSpan { } ``` +### Extensions to standard library and Foundation types + +```swift +extension Array { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable +} + +extension ArraySlice { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable +} + +extension ContiguousArray { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable +} + +extension Foundation.Data { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result +} + +extension String.UTF8View { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result +} + +extension Substring.UTF8View { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> 
Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result +} + +extension CollectionOfOne { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable +} + +extension SIMD { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Self.Scalar: BitwiseCopyable +} + +extension KeyValuePairs { + public func withSpan( + _ body: (_ elements: Span<(key: Key, value: Value)>) throws(E) -> Result + ) throws(E) -> Result { + try Array(self).withSpan(body) + } + + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable { + try Array(self).withBytes(body) + } +} +``` + +In this proposal, the `withSpan()` and `withBytes()` methods are the supported ways to obtain a `Span` or `RawSpan`. Initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for further details, see [Initializers](#Initializers) in the [future directions](#Directions) section. + ## Source compatibility This proposal is additive and source-compatible with existing code. @@ -821,17 +811,19 @@ This is discussed more fully in the [indexing appendix](#Indexing) below. #### Initializing and returning `Span` instances -A `Span` represents a region of memory and, as such, must be initialized using unsafe pointers. This is an unsafe operation which will typically be performed internally to a container's implementation. In order to bridge to safe code, these initializers require new annotations that indicate to the compiler how the newly-created `Span` can be used safely. 
+A `Span` represents a region of memory and, as such, must be initialized using an unsafe pointer. This is an unsafe operation which will typically be performed internally to a container's implementation. In order to bridge to safe code, these initializers require new annotations that indicate to the compiler how the newly-created `Span` can be used safely. These annotations have been [pitched][PR-2305-pitch] and are expected to be formally [proposed][PR-2305] soon. `Span` initializers using lifetime annotations will be proposed alongside the annotations themselves. #### Coroutine Accessors +Once a function can return a `Span`, we may also + This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. #### Layout constraint for safe loading of bit patterns -We could add a layout constraint refining`BitwiseCopyable`, specifically for types whose mapping from bit pattern to values is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function), `SurjectiveBitPattern`. Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) +`RawSpan` has unsafe functions that interpret the raw bit patterns it contains as values of arbitrary `BitwiseCopyable` types. In order to have safe alternatives to these, we could add a layout constraint refining `BitwiseCopyable`. Specifically, the refined layout constraint (e.g. `SurjectiveBitPattern`) would apply to types for which mapping from bit pattern to value is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function).
Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) An alternative to a layout constraint is to add a type validation step to ensure that if a given bit pattern were to be interpreted as a value of type `T`, then all the invariants of type `T` would be respected. This alternative would be more flexible, but may have a higher runtime cost. @@ -973,7 +965,7 @@ public protocol ContiguousStorage: ~Copyable, ~Escapable { Two issues prevent us from proposing it at this time: (a) the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427], and (b) we cannot declare a `_read` accessor as a protocol requirement, since `_read` is not considered stable. -Many of the standard library collections could conform to `ContiguousStorage`, but we would not include the `Unsafe{Mutable,Raw}BufferPointer` types among them. Conversion of these will continue to use the explicit `Span(unsafe{Bytes,Elements,Start}:)` initializers. +Many of the standard library collections could conform to `ContiguousStorage`, but we would not include the `Unsafe{Mutable,Raw}BufferPointer` types among them. Conversion of these will continue to use the `withSpan()` and `withBytes()` methods. 
#### Syntactic Sugar for Automatic Conversions From 54912a70e589f2f295d0f7db3943465ff6aa351b Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 22 Aug 2024 16:44:50 -0700 Subject: [PATCH 57/73] =?UTF-8?q?rename=20file=20to=20include=20the=20word?= =?UTF-8?q?=20=E2=80=9Cspan=E2=80=9D?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...s-storage.md => nnnn-span-access-shared-contiguous-storage.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename proposals/{nnnn-safe-shared-contiguous-storage.md => nnnn-span-access-shared-contiguous-storage.md} (100%) diff --git a/proposals/nnnn-safe-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md similarity index 100% rename from proposals/nnnn-safe-shared-contiguous-storage.md rename to proposals/nnnn-span-access-shared-contiguous-storage.md From 8a26a9dc463f6862bc0701083fab03949a189c4b Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 22 Aug 2024 16:51:55 -0700 Subject: [PATCH 58/73] improve title --- proposals/nnnn-span-access-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 6baea35ffa..8fa1bf1ea5 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -1,4 +1,4 @@ -# Safe Access to Contiguous Storage +# Span: Safe Access to Contiguous Storage * Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) * Authors: [Guillaume Lessard](https://github.com/glessard), [Michael Ilseman](https://github.com/milseman), [Andrew Trick](https://github.com/atrick) From c7e464a2d2248506a9fa817e80e162696de7add6 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 30 Aug 2024 09:14:56 -0700 Subject: [PATCH 59/73] add link to preview implementation --- 
proposals/nnnn-span-access-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 8fa1bf1ea5..346dfbf927 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -3,7 +3,7 @@ * Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) * Authors: [Guillaume Lessard](https://github.com/glessard), [Michael Ilseman](https://github.com/milseman), [Andrew Trick](https://github.com/atrick) * Review Manager: TBD -* Status: **Awaiting implementation** +* Status: **Awaiting implementation**, previewed in a [branch](https://github.com/apple/swift-collections/tree/future) of [swift-collections](https://github.com/apple/swift-collections). * Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) * Bug: rdar://48132971, rdar://96837923 * Implementation: Prototyped in https://github.com/apple/swift-collections (on branch "future") From 9f32a16573c5a492f12faa723a761641c352fb4c Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 4 Sep 2024 09:50:05 -0700 Subject: [PATCH 60/73] lots of changes --- ...n-span-access-shared-contiguous-storage.md | 697 ++++++------------ 1 file changed, 206 insertions(+), 491 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 346dfbf927..c43c95f8e0 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -3,7 +3,7 @@ * Proposal: [SE-NNNN](nnnn-safe-shared-contiguous-storage.md) * Authors: [Guillaume Lessard](https://github.com/glessard), [Michael Ilseman](https://github.com/milseman), [Andrew Trick](https://github.com/atrick) * Review Manager: TBD -* Status: **Awaiting implementation**, previewed in a 
[branch](https://github.com/apple/swift-collections/tree/future) of [swift-collections](https://github.com/apple/swift-collections). +* Status: **Awaiting implementation**, previewed in a [branch](https://github.com/apple/swift-collections/tree/future) of [swift-collections](https://github.com/apple/swift-collections). * Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) * Bug: rdar://48132971, rdar://96837923 * Implementation: Prototyped in https://github.com/apple/swift-collections (on branch "future") @@ -15,7 +15,7 @@ We introduce `Span`, an abstraction for container-agnostic access to contiguo In the C family of programming languages, memory can be shared with any function by using a pointer and (ideally) a length. This allows contiguous memory to be shared with a function that doesn't know the layout of a container being used by the caller. A heap-allocated array, contiguously-stored named fields or even a single stack-allocated instance can all be accessed through a C pointer. We aim to enable a similar idiom in Swift, without compromising Swift's memory safety. -This proposal is related to two compiler features: [Nonescapable types](https://github.com/swiftlang/swift-evolution/pull/2304) (`~Escapable`,) being proposed along side, and [Compile-time Lifetime Dependency Annotations][PR-2305], which will be proposed in the following weeks. A precursor to this proposal was the [BufferView roadmap](https://forums.swift.org/t/66211) forum thread. This proposal is also related to the following proposals: +This proposal builds on [Nonescapable types][PR-2304] (`~Escapable`) and is a precursor to [Compile-time Lifetime Dependency Annotations][PR-2305], which will be proposed in the following weeks. The [BufferView roadmap](https://forums.swift.org/t/66211) forum thread was an antecedent to this proposal.
This proposal also depends on the following proposals: - [SE-0426] BitwiseCopyable - [SE-0427] Noncopyable generics @@ -26,12 +26,13 @@ This proposal is related to two compiler features: [Nonescapable types](https:// [SE-0427]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md [SE-0437]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0437-noncopyable-stdlib-primitives.md [SE-0377]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0377-parameter-ownership-modifiers.md +[PR-2304]: https://github.com/swiftlang/swift-evolution/pull/2304 [PR-2305]: https://github.com/swiftlang/swift-evolution/pull/2305 [PR-2305-pitch]: https://forums.swift.org/t/69865 ## Motivation -Swift needs safe and performant types for local processing over values in contiguous memory. Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function significantly limits performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()`, forcing them to use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. +Swift needs safe and performant types for local processing over values in contiguous memory. Consider for example a program using multiple libraries, including one for [base64](https://datatracker.ietf.org/doc/html/rfc4648) decoding. 
The program would obtain encoded data from one or more of its dependencies, which could supply the data in the form of `[UInt8]`, `Foundation.Data` or even `String`, among others. None of these types is necessarily more correct than another, but the base64 decoding library must pick an input format. It could declare its input parameter type to be `some Sequence`, but such a generic function can significantly limit performance. This may force the library author to either declare its entry point as inlinable, or to implement an internal fast path using `withContiguousStorageIfAvailable()`, forcing them to use an unsafe type. The ideal interface would have a combination of the properties of both `some Sequence` and `UnsafeBufferPointer`. The `UnsafeBufferPointer` passed to a `withUnsafeXXX` closure-style API, while performant, is unsafe in multiple ways: @@ -45,35 +46,17 @@ Even if the body of the `withUnsafeXXX` call does not escape the pointer, other #### `Span` -`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. `Span` does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed while the `Span` exists. In this first proposal, `Span`s will be constrained to closures from which they structurally cannot escape. Later, we will introduce a lifetime dependency between the `Span` and the binding of the type vending it, preventing its escape from the scope where it is valid for use. This guarantee preserves temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. +`Span` will allow sharing the contiguous internal representation of a type, by providing access to a borrowed view of an interval of contiguous memory. 
`Span` does not copy the underlying data: it instead relies on a guarantee that the original container cannot be modified or destroyed while the `Span` exists. In the prototype that accompanies this first proposal, `Span`s will be constrained to closures from which they structurally cannot escape. Later, we will introduce a lifetime dependency between a `Span` and the binding of the type vending it, preventing its escape from the scope where it is valid for use. Both of these approaches guarantee temporal safety. `Span` also performs bounds-checking on every access to preserve spatial safety. Additionally `Span` always represents initialized memory, preserving the definite initialization guarantee. -A `Span` provided by container represents a borrow of that container. `Span` can therefore provide simultaneous access to a non-copyable container, and can help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see [future directions](#Directions) for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) +`Span` is intended as the currency type for local processing over values in contiguous memory. It is a replacement for many API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that do not need to escape the owning container. -`Span` is the currency type for local processing over values in contiguous memory. It is a replacement for many API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that do not need to escape the owning container. +A `Span` provided by a container represents a borrow of that container. `Span` can therefore provide simultaneous access to a non-copyable container. It can also help avoid unwanted copies of copyable containers. 
Note that `Span` is not a replacement for a copyable container with owned storage; see [future directions](#Directions) for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) #### `RawSpan` -`RawSpan` allows sharing the contiguous internal representation for values which may be heterogeneously-typed, as well as representing heteregeneously-typed input to be parsed. Since it is a fully concrete type, it can achieve great performance in debug builds of client code as well as straightforward performance in library code. +`RawSpan` allows sharing contiguous memory representing values which may be heterogeneously-typed, such as memory intended for parsing. It makes the same safety guarantees as `Span`. Since it is a fully concrete type, it can achieve great performance in debug builds of client code as well as straightforward performance in library code. -A `RawSpan` can be obtained from containers of `BitwiseCopyable` elements, as well as be initialized directly from an instance of `Span`. - -#### Extensions to Standard Library and Foundation types - -The standard library and Foundation will provide `withSpan()` and `withBytes()` closure-taking functions as safe replacements for the existing `withUnsafeBufferPointer()` and `withUnsafeBytes()` functions. These functions provide For example, `Array` will be extended as follows: - -```swift -extension Array { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} -``` - -The full list of standard library and Foundation types to be extended in this manner is found in the [detailed design](#Design) section. +A `RawSpan` can be obtained from containers of `BitwiseCopyable` elements, as well as be initialized directly from an instance of `Span`. 
## Detailed design @@ -100,15 +83,7 @@ extension Span where Element: ~Copyable & ~Escapable { Note that `Span` does _not_ conform to `Collection`. This is because `Collection`, as originally conceived and enshrined in existing source code, assumes pervasive copyability and escapability of the `Collection` itself as well as of element type. In particular a subsequence of a `Collection` is semantically a separate value from the instance it was derived from. In the case of `Span`, a sub-span representing a subrange of its elements _must_ have the same lifetime as the `Span` from which it originates. Another proposal will consider collection-like protocols to accommodate different combinations of `~Copyable` and `~Escapable` for the collection and its elements. -`Span`s representing subsets of consecutive elements can be extracted out of a larger `Span` with an API similar to the `extracting()` functions recently added to `UnsafeBufferPointer` in support of non-copyable elements: - -```swift -extension Span where Element: ~Copyable & ~Escapable { - public func extracting(_ bounds: Range) -> Self -} -``` - -The first element of a given span is always at position zero, and its last element is always at position `count-1`. +Like `UnsafeBufferPointer`, `Span` uses a simple offset-based indexing. The first element of a given span is always at position zero, and its last element is always at position `count-1`. As a side-effect of not conforming to `Collection` or `Sequence`, `Span` is not directly supported by `for` loops at this time. 
It is, however, easy to use in a `for` loop via indexing: @@ -118,64 +93,13 @@ for i in 0..( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withUnsafeBytes( - _ body: (_ elements: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension UnsafeRawBufferPointer { // as well as UnsafeMutableRawBufferPointer - public func withUnsafeBytes( - _ body: (_ elements: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} -``` - -`Span` and `RawSpan` also have an unsafe hatch for use with functions that require an unsafe argument, such as C functions: +### `Span` API: -```swift -extension Span where Element: ~Copyable & ~Escapable { - func withUnsafeBufferPointer( - _ body: (_ buffer: UnsafeBufferPointer) throws(E) -> Result - ) throws(E) -> Result -} - -extension Span where Element: BitwiseCopyable { - func withUnsafeBytes( - _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result - ) throws(E) -> Result -} - -extension RawSpan { - func withUnsafeBytes( - _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result - ) throws(E) -> Result -} -``` - -### Complete `Span` API: - -```swift -public struct Span: Copyable, ~Escapable { - internal var _start: UnsafePointer - internal var _count: Int -} -``` - -Initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for further details, see "[Initializers](#Initializers)" in the [future directions](#Directions) section. +Initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for details, see "[Initializers](#Initializers)" in the [future directions](#Directions) section. ##### Basic API: -The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Collection` or `RandomAccessCollection`). 
The only difference with their counterpart should be a lifetime dependency annotation where applicable, allowing them to return borrowed nonescapable values or borrowed noncopyable values. +The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where their counterpart is declared (in `Collection` or `RandomAccessCollection`). ```swift extension Span where Element: ~Copyable & ~Escapable { @@ -186,112 +110,63 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` -##### Accessing subranges of elements: +Note that we use a `_read` accessor for the subscript, a requirement in order to `yield` a borrowed non-copyable `Element` (see ["Coroutines"](#Coroutines).) This will be updated to a final syntax at a later time. + +##### Unchecked access to elements: -In SE-0437, `UnsafeBufferPointer`'s slicing subscript was replaced by the `extracting(_ bounds:)` functions, due to the copyability assumption baked into the standard library's `Slice` type. `Span` has similar requirements as `UnsafeBufferPointer`, in that it must be possible to obtain an instance representing a subrange of the same memory as another `Span`. We therefore follow in the footsteps of SE-0437, and add a family of `extracting()` functions: +The `subscript` mentioned above has always-on bounds checking of its parameter, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly, such as in tight loops: ```swift extension Span where Element: ~Copyable & ~Escapable { - /// Constructs a new span over the items within the supplied range of - /// positions within this span. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted.
- /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `Span`. - /// - /// - Returns: A `Span` over the items within `bounds` - func extracting(_ bounds: Range) -> Self - - /// Constructs a new span over the items within the supplied range of - /// positions within this span. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `Span`. - /// - /// - Returns: A `Span` over the items within `bounds` - func extracting(_ bounds: some RangeExpression) -> Self + // Unchecked subscripting and extraction - /// Constructs a new span over all the items of this span. + /// Accesses the element at the specified `position`. /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. + /// This subscript does not validate `position`; this is an unsafe operation. /// - /// - Returns: A `Span` over all the items of this span. - func extracting(_: UnboundedRange) -> Self + /// - Parameter position: The offset of the element to access. `position` + /// must be greater or equal to zero, and less than `count`. + public subscript(unchecked position: Int) -> Element { _read } } ``` -Additionally, we add specialized versions that extract prefixes and suffixes: + +##### Index validation utilities: + +Every time `Span` uses a position parameter, it checks for its validity, unless the parameter is marked with the word "unchecked". 
The validation is performed with these functions: + ```swift extension Span where Element: ~Copyable & ~Escapable { - /// Returns a span containing the initial elements of this span, - /// up to the specified maximum length. + /// Return true if `index` is a valid offset into this `Span` /// - /// If the maximum length exceeds the length of this span, - /// the result contains all the elements. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter maxLength: The maximum number of elements to return. - /// `maxLength` must be greater than or equal to zero. - /// - Returns: A span with at most `maxLength` elements. - public func extracting(first maxLength: Int) -> Self + /// - Parameters: + /// - index: an index to validate + /// - Returns: true if `index` is a valid index + public func indicesContain(_ index: Int) -> Bool - /// Returns a span over all but the given number of trailing elements. + /// Traps if `index` is not a valid offset into this `Span` /// - /// If the number of elements to drop exceeds the number of elements in - /// the span, the result is an empty span. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter k: The number of elements to drop off the end of - /// the span. `k` must be greater than or equal to zero. - /// - Returns: A span leaving off the specified number of elements at the end. - public func extracting(droppingLast k: Int) -> Self + /// - Parameters: + /// - index: an index to validate + public func indexPrecondition(_ index: Int) - /// Returns a span containing the final elements of the span, - /// up to the given maximum length. 
+ /// Return true if `indices` is a valid range of offsets into this `Span` /// - /// If the maximum length exceeds the length of this span, - /// the result contains all the elements. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter maxLength: The maximum number of elements to return. - /// `maxLength` must be greater than or equal to zero. - /// - Returns: A span with at most `maxLength` elements. - public func extracting(last maxLength: Int) -> Self + /// - Parameters: + /// - indices: a range of indices to validate + /// - Returns: true if `indices` is a valid range of indices + public func indicesContain(_ indices: Range) -> Bool - /// Returns a span over all but the given number of initial elements. + /// Traps if `indices` is not a valid range of offsets into this `Span` /// - /// If the number of elements to drop exceeds the number of elements in - /// the span, the result is an empty span. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// - Parameter k: The number of elements to drop from the beginning of - /// the span. `k` must be greater than or equal to zero. - /// - Returns: A span starting after the specified number of elements. - public func extracting(droppingFirst k: Int) -> Self + /// - Parameters: + /// - indices: a range of indices to validate + public func indexPrecondition(_ indices: Range) } ``` -##### Identifying whether a span is a subrange of another: +Note: these function names are not ideal. + +##### Identifying whether a `Span` is a subrange of another: When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. 
We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: @@ -317,90 +192,6 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` -##### Unchecked access to elements and subranges of elements: - -The `subscript` and the `extracting()` functions mentioned above all have always-on bounds checking of their parameters, in order to prevent out-of-bounds accesses. We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly, such as in tight loops: - -```swift -extension Span where Element: ~Copyable & ~Escapable { - // Unchecked subscripting and extraction - - /// Accesses the element at the specified `position`. - /// - /// This subscript does not validate `position`; this is an unsafe operation. - /// - /// - Parameter position: The offset of the element to access. `position` - /// must be greater or equal to zero, and less than `count`. - public subscript(unchecked position: Int) -> Element { _read } - - /// Constructs a new span over the items within the supplied range of - /// positions within this span. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. - /// - /// This function does not validate `bounds`; this is an unsafe operation. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `Span`. - /// - /// - Returns: A `Span` over the items within `bounds` - public func extracting(unchecked bounds: Range) -> Self - - /// Constructs a new span over the items within the supplied range of - /// positions within this span. - /// - /// The returned span's first item is always at offset 0; unlike buffer - /// slices, extracted spans do not share their indices with the - /// span from which they are extracted. 
- /// - /// This function does not validate `bounds`; this is an unsafe operation. - /// - /// - Parameter bounds: A valid range of positions. Every position in - /// this range must be within the bounds of this `Span`. - /// - /// - Returns: A `Span` over the items within `bounds` - public func extracting(unchecked bounds: some RangeExpression) -> Self -} -``` - -##### Index validation utilities: - -Every time `Span` uses a position parameter, it checks for its validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: - -```swift -extension Span where Element: ~Copyable & ~Escapable { - /// Return true if `offset` is a valid offset into this `Span` - /// - /// - Parameters: - /// - position: an index to validate - /// - Returns: true if `offset` is a valid index - public func validateBounds(_ offset: Int) -> Bool - - /// Traps if `offset` is not a valid offset into this `Span` - /// - /// - Parameters: - /// - position: an index to validate - public func ensureValidity(_ offset: Int) - - /// Return true if `offsets` is a valid range of offsets into this `Span` - /// - /// - Parameters: - /// - offsets: a range of indices to validate - /// - Returns: true if `offsets` is a valid range of indices - public func validateBounds(_ offsets: Range) -> Bool - - /// Traps if `offsets` is not a valid range of offsets into this `Span` - /// - /// - Parameters: - /// - offsets: a range of indices to validate - public func ensureValidity(_ offsets: Range) -} -``` - -Note: these function names could be improved. - ##### Interoperability with unsafe code: We provide two functions for interoperability with C or other legacy pointer-taking functions. @@ -447,35 +238,21 @@ extension Span where Element: BitwiseCopyable { ### RawSpan -In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. 
`RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that intends to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports extracting sub-spans, along with the data loading operations `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. +In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that is intended to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. Its API supports the data loading operations `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. -#### Complete `RawSpan` API: +#### `RawSpan` API: ```swift +@frozen public struct RawSpan: Copyable, ~Escapable { internal var _start: UnsafeRawPointer internal var _count: Int } ``` -##### Converting a `Span` to a `RawSpan`: - -```swift -extension RawSpan { - /// Create a `RawSpan` over the memory represented by a `Span` - /// - /// - Parameters: - /// - span: An existing `Span`, which will define both this - /// `RawSpan`'s lifetime and the memory it represents. - public init(_ span: borrowing Span) -} -``` - -Other initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for further details, see "[Initializers](#Initializers)" in the [future directions](#Directions) section. - ##### Accessing the memory of a `RawSpan`: -`RawSpan` has basic operations to access the contents of its memory: `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. These operations are not type-safe, in that the loaded value returned by the operation can be invalid. 
Some types have a property that makes this operation safe, but there is no [formal identification](#SurjectiveBitPattern) for these types at this time. +`RawSpan` has basic operations to access the contents of its memory: `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. These operations are not type-safe, in that the loaded value returned by the operation can be invalid. Some types have a property that makes this operation safe, but we don't have a way to [formally identify](#SurjectiveBitPattern) such types at this time. ```swift extension RawSpan { @@ -521,7 +298,7 @@ extension RawSpan { ) -> T ``` -These functions have the following counterparts which omit bounds-checking for cases where redundant checks affect performance: +These functions have counterparts which omit bounds-checking for cases where redundant checks affect performance: ```swift /// Returns a new instance of the given type, constructed from the raw memory /// at the specified offset. @@ -568,26 +345,6 @@ These functions have the following counterparts which omit bounds-checking for c } ``` -A `RawSpan` can be viewed as a `Span`, provided the memory is laid out homogeneously as instances of `T`. - -```swift -extension RawSpan { - /// View the bytes of this span as type `T` - /// - /// This is the equivalent of `unsafeBitCast(_:to:)`. The - /// underlying bytes must be initialized as type `T`, or be - /// initialized to a type that is layout-compatible with `T`. - /// - /// This is an unsafe operation. Failure to meet the preconditions - /// above may produce invalid values of `T`. - /// - /// - Parameters: - /// - type: The type as which to view the bytes of this span. - /// - Returns: A typed span viewing these bytes as instances of `T`.
- public func unsafeView(as type: T.Type) -> Span -} -``` - `RawSpan` provides `withUnsafeBytes` for interoperability with C or other legacy pointer-taking functions: ```swift @@ -621,53 +378,27 @@ extension RawSpan { /// A Boolean value indicating whether the span is empty. public var isEmpty: Bool { get } - - /// Return true if `offset` is a valid offset into this `RawSpan` - public func validateBounds(_ offset: Int) -> Bool - - /// Traps if `offset` is not a valid offset into this `RawSpan` - public func ensureValidity(_ offset: Int) - - /// Return true if `offsets` is a valid range of offsets into this `RawSpan` - public func validateBounds(_ offsets: Range) -> Bool - - /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` - public func ensureValidity(_ offsets: Range) } ``` -##### Accessing subranges of elements: - -Similarly to `Span`, `RawSpan` does not support slicing in the style of `Collection`. It supports the same set of `extracting()` functions as `Span`. 
The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: - +##### `RawSpan` offset validation: ```swift extension RawSpan { + /// Return true if `offset` is a valid offset into this `RawSpan` + public func offsetsContain(_ offset: Int) -> Bool - public func extracting(_ bounds: Range) -> Self - - public func extracting(unchecked bounds: Range) -> Self - - public func extracting(_ bounds: some RangeExpression) -> Self - - public func extracting( - unchecked bounds: some RangeExpression - ) -> Self - - public func extracting(_: UnboundedRange) -> Self - - // extracting prefixes and suffixes - - public func extracting(first maxLength: Int) -> Self - - public func extracting(droppingLast k: Int) -> Self + /// Traps if `offset` is not a valid offset into this `RawSpan` + public func offsetPrecondition(_ offset: Int) - public func extracting(last maxLength: Int) -> Self + /// Return true if `offsets` is a valid range of offsets into this `RawSpan` + public func offsetsContain(_ offsets: Range) -> Bool - public func extracting(droppingFirst k: Int) -> Self + /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` + public func offsetPrecondition(_ offsets: Range) } ``` -##### Identifying whether a span is a subrange of another: +##### Identifying whether a `RawSpan` is a subrange of another: When working with multiple `RawSpan` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span. 
The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: @@ -679,105 +410,45 @@ extension RawSpan { } ``` -### Extensions to standard library and Foundation types +### Extensions to standard library types: + +The `UnsafeBufferPointer` family of types can be be adapted for use with `Span`-taking API through unsafe `Span`-providing functions. A `Span` instance obtained this way loses its static guarantee of temporal safety, because it is possible to deinitialize or deallocate the underlying memory before the end of the closure's scope. While the closure provides a clear signal to not do so, it is not enforceable by the compiler. ```swift -extension Array { - public func withSpan( +extension UnsafeBufferPointer { + public func withUnsafeSpan( _ body: (_ elements: Span) throws(E) -> Result ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} -extension ArraySlice { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result + public func withUnsafeBytes( + _ body: (_ elements: RawSpan) throws(E) -> Result ) throws(E) -> Result where Element: BitwiseCopyable } -extension ContiguousArray { - public func withSpan( +extension UnsafeMutableBufferPointer { + public func withUnsafeSpan( _ body: (_ elements: Span) throws(E) -> Result ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension Foundation.Data { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension String.UTF8View { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> 
Result - ) throws(E) -> Result - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension Substring.UTF8View { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension CollectionOfOne { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result + public func withUnsafeBytes( + _ body: (_ elements: RawSpan) throws(E) -> Result ) throws(E) -> Result where Element: BitwiseCopyable } -extension SIMD { - public func withSpan( - _ body: (_ elements: Span) throws(E) -> Result +extension UnsafeRawBufferPointer { + public func withUnsafeBytes( + _ body: (_ elements: RawSpan) throws(E) -> Result ) throws(E) -> Result - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Self.Scalar: BitwiseCopyable } -extension KeyValuePairs { - public func withSpan( - _ body: (_ elements: Span<(key: Key, value: Value)>) throws(E) -> Result - ) throws(E) -> Result { - try Array(self).withSpan(body) - } - - public func withBytes( - _ body: (_ bytes: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable { - try Array(self).withBytes(body) - } +extension UnsafeMutableRawBufferPointer { + public func withUnsafeBytes( + _ body: (_ elements: RawSpan) throws(E) -> Result + ) throws(E) -> Result } ``` -In this proposal, the `withSpan()` and `withBytes()` methods are the supported ways to obtain a `Span` or `RawSpan`. Initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for further details, see [Initializers](#Initializers) in the [future directions](#Directions) section. 
+Safe extensions to safe standard library types will be proposed in a later proposal. ## Source compatibility @@ -797,11 +468,11 @@ The additions described in this proposal require a new version of the standard l Making `Span` non-copyable was in the early vision of this type. However, we found that would make `Span` a poor match to model borrowing semantics. This realization led to the initial design for non-escapable declarations. ##### Use a non-escapable index type -Eventually we want a similar usage pattern for a `MutableSpan` (described [below](#MutableSpan)) as we are proposing for `Span`. If the index of a `MutableSpan` were to borrow the view, then it becomes impossible to implement a mutating subscript without also requiring an index to be consumed. This seems untenable. +A non-escapable index type implies that any indexing operation would borrow its `Span`. This would prevent using such an index for a mutation, since a mutation requires an _exclusive_ borrow. Noting that the usage pattern we desire for `Span` must also apply to `MutableSpan` (described [below](#MutableSpan)), a non-escapable index would make it impossible to also implement a mutating subscript, unless any mutating operation consumes the index. This seems untenable. ##### Naming -The ideas in this proposal previously used the name `BufferView`.
While the use of the word "buffer" would be consistent with the `UnsafeBufferPointer` type, it is nevertheless not a great name, since "buffer" is commonly used in reference to transient storage. Another previous pitch used the term `StorageView` in reference to the `withContiguousStorageIfAvailable()` standard library function. We also considered the name `StorageSpan`, but that did not add much beyond the shorter name `Span`. `Span` clearly identifies itself as a relative of C++'s `std::span`. ##### A more sophisticated approach to indexing @@ -815,21 +486,129 @@ A `Span` represents a region of memory and, as such, must be initialized using a These annotations have been [pitched][PR-2305-pitch] and are expected to be formally [proposed][PR-2305] soon. `Span` initializers using lifetime annotations will be proposed alongside the annotations themselves. -#### Coroutine Accessors +#### Obtaining variant `Span`s and `RawSpan`s from `Span` and `RawSpan` + +`Span`s representing subsets of consecutive elements could be extracted out of a larger `Span` with an API similar to the `extracting()` functions recently added to `UnsafeBufferPointer` in support of non-copyable elements: + +```swift +extension Span where Element: ~Copyable & ~Escapable { + public func extracting(_ bounds: Range) -> Self +} +``` + +Each variant of such a function needs to return a `Span`, which requires a lifetime dependency. -Once a function can return a `Span`, we may also +Similarly, a `RawSpan` should be initializable from a `Span`, and `RawSpan` should provide a function to unsafely view its content as a typed `Span`: + +```swift +extension RawSpan { + public init(_ span: Span) -This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage. 
When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` will be adapted to the new syntax. + public func unsafeView(as type: T.Type) -> Span +} +``` + +We are subsetting these functions of `Span` and `RawSpan` until the lifetime annotations are proposed. + +#### Coroutine Accessors + +This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage, in particular storage containing non-copyable elements. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` and `RawSpan` will be adapted to the new syntax. + +#### Extensions to Standard Library and Foundation types + +The standard library and Foundation have a number of types that can in principle provide access to their internal storage as a `Span`. We could provide `withSpan()` and `withBytes()` closure-taking functions as safe replacements for the existing `withUnsafeBufferPointer()` and `withUnsafeBytes()` functions. We could also provide lifetime-dependent `span` or `bytes` properties. For example, `Array` could be extended as follows: + +```swift +extension Array { + public func withSpan( + _ body: (_ elements: Span) throws(E) -> Result + ) throws(E) -> Result + + public var span: Span { borrowing get } +} + +extension Array where Element: BitwiseCopyable { + public func withBytes( + _ body: (_ bytes: RawSpan) throws(E) -> Result + ) throws(E) -> Result where Element: BitwiseCopyable + + public var bytes: RawSpan { borrowing get } +} +``` + +Of these, the closure-taking functions can be implemented now, but it is unclear whether they are desirable. The lifetime-dependent computed properties require lifetime annotations, as initializers do.
We are deferring proposing these extensions until the lifetime annotations are proposed. + +#### A `ContiguousStorage` protocol + +An earlier version of this proposal proposed a `ContiguousStorage` protocol by which a type could indicate that it can provide a `Span`. `ContiguousStorage` would form a bridge between generically-typed interfaces and a performant concrete implementation. It would supersede the rejected [SE-0256](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). + +For example, for the hypothetical base64 decoding library mentioned in the [motivation](#Motivation) section, a possible API could be: + +```swift +extension HypotheticalBase64Decoder { + public func decode(bytes: some ContiguousStorage) -> [UInt8] +} +``` + +`ContiguousStorage` would have the following definition: + +```swift +public protocol ContiguousStorage: ~Copyable, ~Escapable { + associatedtype Element: ~Copyable & ~Escapable + var storage: Span { _read } +} +``` + +Two issues prevent us from proposing it at this time: (a) the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427], and (b) we cannot declare a `_read` accessor as a protocol requirement. + +Many of the standard library collections could conform to `ContiguousStorage`. + +#### Support for `Span` in `for` loops + +This proposal does not define an `IteratorProtocol` conformance, since an iterator for `Span` would need to be non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: + +```swift +func doSomething(_ e: borrowing Element) { ... } +let span: Span = ... 
+for borrowing element in span { + doSomething(element) +} +``` + +In the meantime, it is possible to loop through a `Span`'s elements by direct indexing: + +```swift +let span: Span = ... +// either: +var i = 0 +while i < span.count { + doSomething(span[i]) + i += 1 +} + +// ...or: +for i in 0..Layout constraint for safe loading of bit patterns -`RawSpan` has unsafe functions that interpret the raw bit patterns it contains as values of arbitrary `BitwiseCopyable` types. In order to have safe alternatives to these, we could add a layout constraint refining`BitwiseCopyable`. Specifically, the refined layout constraint ](https://en.wikipedia.org/wiki/Surjective_function) (e.g. `SurjectiveBitPattern`) would apply to types for which mapping from bit pattern to value is a [surjective function. Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) +`RawSpan` has unsafe functions that interpret the raw bit patterns it contains as values of arbitrary `BitwiseCopyable` types. In order to have safe alternatives to these, we could add a layout constraint refining `BitwiseCopyable`, specifically for types whose mapping from bit pattern to values is a [surjective function](https://en.wikipedia.org/wiki/Surjective_function) (e.g. `SurjectiveBitPattern`). Such types would be safe to [load](#Load) from `RawSpan` instances. 1-byte examples are `Int8` (any of 256 values are valid) and `Bool` (256 bit patterns map to `true` or `false` because only one bit is considered.) An alternative to a layout constraint is to add a type validation step to ensure that if a given bit pattern were to be interpreted as a value of type `T`, then all the invariants of type `T` would be respected. This alternative would be more flexible, but may have a higher runtime cost. 
-#### Byte parsing helpers +#### Byte parsing helpers -A handful of helper API can make `RawSpan` better suited for binary parsers and decoders. +We could add some API to `RawSpan` to make it better suited for binary parsers and decoders. ```swift extension RawSpan { @@ -859,11 +638,11 @@ extension RawSpan { `Cursor` stores and manages a parsing subrange, which alleviates the developer from managing one layer of slicing. -Alternatively, if some future `RawSpan.Iterator` were 3 words in size (start, current position, and end) instead of 2 (current pointer and end), that is it were a "resettable" iterator, it could host this API instead of introducing a new `Cursor` type or concept. +Alternatively, if some future `RawSpan.Iterator` were 3 words in size (start, current position, and end) instead of 2 (current pointer and end), making it a "resettable" iterator, it could host this API instead of introducing a new `Cursor` type or concept. ##### Example: Parsing PNG -The code snippet below parses a [PNG Chunk](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks): +The code snippet below parses a [PNG Chunk](https://www.w3.org/TR/png-3/#4Concepts.FormatChunks), using the byte parsing helpers defined above: ```swift // Parse a PNG chunk @@ -873,48 +652,9 @@ let data = try cursor.parse(numBytes: length) let crc = try cursor.parse(UInt32.self).bigEndian ``` -#### Defining `BorrowingIterator` with support in `for` loops - -This proposal does not define an `IteratorProtocol` conformance, since it would need to be borrowed and non-escapable. This is not compatible with `IteratorProtocol`. As such, `Span` is not directly usable in `for` loops as currently defined. A `BorrowingIterator` protocol for non-escapable and non-copyable containers must be defined, providing a `for` loop syntax where the element is borrowed through each iteration. Ultimately we should arrive at a way to iterate through borrowed elements from a borrowed view: - -```swift -borrowing view: Span = ...
-for borrowing element in view { - doSomething(element) -} -``` - -In the meantime, it is possible to loop through a `Span`'s elements by direct indexing: - -```swift -func doSomething(_ e: borrowing Element) { ... } -let view: Span = ... -// either: -var i = 0 -while i < view.count { - doSomething(view[i]) - view.index(after: &i) -} - -// ...or: -for i in 0..Safe mutations of memory with `MutableSpan` -Some data structures can delegate mutations of their owned memory. In the standard library we have `withMutableBufferPointer()`, for example. +Some data structures can delegate mutations of their owned memory. In the standard library the function `withUnsafeMutableBufferPointer()` provides this functionality in an unsafe manner. The `UnsafeMutableBufferPointer` passed to a `withUnsafeMutableXXX` closure-style API is unsafe in multiple ways: @@ -934,38 +674,13 @@ A `MutableSpan` should provide a better, safer alternative to mutable memory Some data structures can delegate initialization of their initial memory representation, and in some cases the initialization of additional memory. For example, the standard library features the initializer`Array.init(unsafeUninitializedCapacity:initializingWith:)`, which depends on `UnsafeMutableBufferPointer` and is known to be error-prone. A safer abstraction for initialization would make such initializers less dangerous, and would allow for a greater variety of them. -We can define an `OutputSpan` type, which could support appending to the initialized portion of its underlying storage. `OutputSpan` allows for uninitialized memory beyond the last position appended. Such an `OutputSpan` would also be a useful abstraction to pass user-allocated storage to low-level API such as networking calls or file I/O. +We can define an `OutputSpan` type, which could support appending to the initialized portion of a data structure's underlying storage. `OutputSpan` allows for uninitialized memory beyond the last position appended.
Such an `OutputSpan` would also be a useful abstraction to pass user-allocated storage to low-level API such as networking calls or file I/O. #### Resizable, contiguously-stored, untyped collection in the standard library The example in the [motivation](#Motivation) section mentions the `Foundation.Data` type. There has been some discussion of either replacing `Data` or moving it to the standard library. This document proposes neither of those. A major issue is that in the "traditional" form of `Foundation.Data`, namely `NSData` from Objective-C, it was easier to control accidental copies because the semantics of the language did not lead to implicit copying. -Even if `Span` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, just as `Array` will: resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may still want to add an untyped-element equivalent of `Array` at a later time. - -#### A `ContiguousStorage` protocol - -An earlier version of this proposal proposed a `ContiguousStorage` protocol by which a type could indicate that it can provide a `Span`. `ContiguousStorage` would form a bridge between generically-typed interfaces and a performant concrete implementation. It would supersede the rejected [SE-0256](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md). 
- -For example, for the hypothetical base64 decoding library mentioned in the [motivation](#Motivation) section, a possible API could be: - -```swift -extension HypotheticalBase64Decoder { - public func decode(bytes: some ContiguousStorage) -> [UInt8] -} -``` - -`ContiguousStorage` would have the following definition: - -```swift -public protocol ContiguousStorage: ~Copyable, ~Escapable { - associatedtype Element: ~Copyable & ~Escapable - var storage: Span { _read } -} -``` - -Two issues prevent us from proposing it at this time: (a) the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427], and (b) we cannot declare a `_read` accessor as a protocol requirement, since `_read` is not considered stable. - -Many of the standard library collections could conform to `ContiguousStorage`, but we would not include the `Unsafe{Mutable,Raw}BufferPointer` types among them. Conversion of these will continue to use the `withSpan()` and `withBytes()` methods. +Even if `Span` were to replace all uses of a constant `Data` in API, something like `Data` would still be needed, for the same reason as `Array` is needed: such a type allows for resizing mutations (e.g. `RangeReplaceableCollection` conformance.) We may want to add an untyped-element equivalent of `Array` to the standard library at a later time. #### Syntactic Sugar for Automatic Conversions @@ -992,7 +707,7 @@ The [`std::span`](https://en.cppreference.com/w/cpp/container/span) class templa ## Acknowledgments -Joe Groff, John McCall, Tim Kientzle, Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. +Joe Groff, John McCall, Tim Kientzle, Steve Canon and Karoy Lorentey contributed to this proposal with their clarifying questions and discussions. 
### Appendix: Index and slicing design considerations @@ -1070,7 +785,7 @@ myCollection.withUnsafeBytes { } ``` -Note, however, that parsers tend to become more complex and copying slices for later index extraction becomes more common. At that point, it is better to use a more powerful approach such as the index-advancing or cursor API presented in *Byte parsing helpers*. +Note, however, that parsers tend to become more complex and copying slices for later index extraction becomes more common. At that point, it is better to use a more powerful approach such as the index-advancing or cursor API presented in *[Byte parsing helpers](#ByteParsingHelpers)*. That being said, if we had a time machine it's not clear that we would choose a design with index interchange, as it does introduce design tradeoffs and makes some code, especially when the index type is `Int`, troublesome: From 979f5bb27d699b8a3fb2ab5b6c3faff26bd183eb Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 5 Sep 2024 13:40:00 -0700 Subject: [PATCH 61/73] remove UBP.withUnsafeSpan and similar --- ...n-span-access-shared-contiguous-storage.md | 40 ------------------- 1 file changed, 40 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index c43c95f8e0..4ac60f402b 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -410,46 +410,6 @@ extension RawSpan { } ``` -### Extensions to standard library types: - -The `UnsafeBufferPointer` family of types can be be adapted for use with `Span`-taking API through unsafe `Span`-providing functions. A `Span` instance obtained this way loses its static guarantee of temporal safety, because it is possible to deinitialize or deallocate the underlying memory before the end of the closure's scope. While the closure provides a clear signal to not do so, it is not enforceable by the compiler. 
- -```swift -extension UnsafeBufferPointer { - public func withUnsafeSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withUnsafeBytes( - _ body: (_ elements: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension UnsafeMutableBufferPointer { - public func withUnsafeSpan( - _ body: (_ elements: Span) throws(E) -> Result - ) throws(E) -> Result - - public func withUnsafeBytes( - _ body: (_ elements: RawSpan) throws(E) -> Result - ) throws(E) -> Result where Element: BitwiseCopyable -} - -extension UnsafeRawBufferPointer { - public func withUnsafeBytes( - _ body: (_ elements: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} - -extension UnsafeMutableRawBufferPointer { - public func withUnsafeBytes( - _ body: (_ elements: RawSpan) throws(E) -> Result - ) throws(E) -> Result -} -``` - -Safe extensions to safe standard library types will be proposed in a later proposal. - ## Source compatibility This proposal is additive and source-compatible with existing code. From 0235837112719ec04c723109c67c623d61dfea21 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 5 Sep 2024 14:58:19 -0700 Subject: [PATCH 62/73] remove another ~Escapable that cannot be promised --- proposals/nnnn-span-access-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 4ac60f402b..4e2fad5449 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -363,7 +363,7 @@ extension RawSpan { /// The closure's parameter is valid only for the duration of /// its execution. /// - Returns: The return value of the `body` closure parameter. 
- func withUnsafeBytes( + func withUnsafeBytes( _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result ) throws(E) -> Result } From d656ead9a50ab5c5973a89fdcbbee431f300d36c Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 5 Sep 2024 17:46:53 -0700 Subject: [PATCH 63/73] add a missing blurb --- proposals/nnnn-span-access-shared-contiguous-storage.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 4e2fad5449..69f7072b65 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -250,6 +250,8 @@ public struct RawSpan: Copyable, ~Escapable { } ``` +Initializers, required for library adoption, will be proposed alongside [lifetime annotations][PR-2305]; for details, see "[Initializers](#Initializers)" in the [future directions](#Directions) section. + ##### Accessing the memory of a `RawSpan`: `RawSpan` has basic operations to access the contents of its memory: `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. These operations are not type-safe, in that the loaded value returned by the operation can be invalid. Some types have a property that makes this operation safe, but there we don't have a way to [formally identify](#SurjectiveBitPattern) such types at this time. 
From 13f30ce94f99344a627d8ddd9078366c36314482 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 5 Sep 2024 17:47:13 -0700 Subject: [PATCH 64/73] improve name of bounds-checking functions --- ...nn-span-access-shared-contiguous-storage.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 69f7072b65..931921a8a7 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -141,26 +141,26 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Parameters: /// - index: an index to validate /// - Returns: true if `index` is a valid index - public func indicesContain(_ index: Int) -> Bool + public func boundsContain(_ index: Int) -> Bool /// Traps if `index` is not a valid offset into this `Span` /// /// - Parameters: /// - index: an index to validate - public func indexPrecondition(_ index: Int) + public func boundsPrecondition(_ index: Int) /// Return true if `indices` is a valid range of offsets into this `Span` /// /// - Parameters: /// - indices: a range of indices to validate /// - Returns: true if `indices` is a valid range of indices - public func indicesContain(_ indices: Range) -> Bool + public func boundsContain(_ indices: Range) -> Bool /// Traps if `indices` is not a valid range of offsets into this `Span` /// /// - Parameters: /// - indices: a range of indices to validate - public func indexPrecondition(_ indices: Range) + public func boundsPrecondition(_ indices: Range) } ``` @@ -383,20 +383,20 @@ extension RawSpan { } ``` -##### `RawSpan` offset validation: +##### `RawSpan` bounds checking: ```swift extension RawSpan { /// Return true if `offset` is a valid offset into this `RawSpan` - public func offsetsContain(_ offset: Int) -> Bool + public func boundsContain(_ offset: Int) -> Bool /// Traps if `offset` is not a 
valid offset into this `RawSpan` - public func offsetPrecondition(_ offset: Int) + public func boundsPrecondition(_ offset: Int) /// Return true if `offsets` is a valid range of offsets into this `RawSpan` - public func offsetsContain(_ offsets: Range) -> Bool + public func boundsContain(_ offsets: Range) -> Bool /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` - public func offsetPrecondition(_ offsets: Range) + public func boundsPrecondition(_ offsets: Range) } ``` From b7e193329fad7d8ab8bd3e687ab53908810af143 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 6 Sep 2024 10:38:05 -0700 Subject: [PATCH 65/73] addition about closure-based unsafe escape-hatch functions --- proposals/nnnn-span-access-shared-contiguous-storage.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 931921a8a7..629ae5dc88 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -236,6 +236,8 @@ extension Span where Element: BitwiseCopyable { } ``` +These functions use a closure to define the scope of validity of `buffer`, ensuring that the underlying `Span` and the binding it depends on both remain valid through the end of the closure. They have the same shape as the equivalents on `Array` because they fulfill the same function, namely to keep the underlying binding alive. + ### RawSpan In addition to `Span`, we propose the addition of `RawSpan`, to represent heterogeneously-typed values in contiguous memory. `RawSpan` is similar to `Span`, but represents _untyped_ initialized bytes. `RawSpan` is a specialized type that is intended to support parsing and decoding applications, as well as applications where heavily-used code paths require concrete types as much as possible. 
Its API supports the data loading operations `unsafeLoad(as:)` and `unsafeLoadUnaligned(as:)`. From 2255c158cc226d85ad6cc523229a66bcd5b720d8 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 6 Sep 2024 19:03:02 -0700 Subject: [PATCH 66/73] remove boundsPrecondition, add boundsContain overload - add `boundsContain(_ bounds: ClosedRange)` --- ...n-span-access-shared-contiguous-storage.md | 20 ++++++------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 629ae5dc88..5cad08709d 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -143,12 +143,6 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Returns: true if `index` is a valid index public func boundsContain(_ index: Int) -> Bool - /// Traps if `index` is not a valid offset into this `Span` - /// - /// - Parameters: - /// - index: an index to validate - public func boundsPrecondition(_ index: Int) - /// Return true if `indices` is a valid range of offsets into this `Span` /// /// - Parameters: @@ -156,11 +150,12 @@ extension Span where Element: ~Copyable & ~Escapable { /// - Returns: true if `indices` is a valid range of indices public func boundsContain(_ indices: Range) -> Bool - /// Traps if `indices` is not a valid range of offsets into this `Span` + /// Return true if `indices` is a valid range of offsets into this `Span` /// /// - Parameters: /// - indices: a range of indices to validate - public func boundsPrecondition(_ indices: Range) + /// - Returns: true if `indices` is a valid range of indices + public func boundsContain(_ indices: ClosedRange) -> Bool } ``` @@ -388,17 +383,14 @@ extension RawSpan { ##### `RawSpan` bounds checking: ```swift extension RawSpan { - /// Return true if `offset` is a valid offset into this `RawSpan` + /// Return true if `offset` is a 
valid byte offset into this `RawSpan` public func boundsContain(_ offset: Int) -> Bool - /// Traps if `offset` is not a valid offset into this `RawSpan` - public func boundsPrecondition(_ offset: Int) - /// Return true if `offsets` is a valid range of offsets into this `RawSpan` public func boundsContain(_ offsets: Range) -> Bool - /// Traps if `offsets` is not a valid range of offsets into this `RawSpan` - public func boundsPrecondition(_ offsets: Range) + /// Return true if `offsets` is a valid range of offsets into this `RawSpan` + public func boundsContain(_ offsets: ClosedRange) -> Bool } ``` From 92b2e2f78da977637cdf8064cc8003da289fc5b5 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 9 Sep 2024 14:43:01 -0700 Subject: [PATCH 67/73] start pointer clarification --- proposals/nnnn-span-access-shared-contiguous-storage.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 5cad08709d..60bca3e4fd 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -65,11 +65,13 @@ A `RawSpan` can be obtained from containers of `BitwiseCopyable` elements, as we ```swift @frozen public struct Span: Copyable, ~Escapable { - internal var _start: UnsafePointer + internal var _start: UnsafeRawPointer? internal var _count: Int } ``` +We store an `UnsafeRawPointer` value internally in order to explicitly support reinterpreted views of memory as containing different types of `BitwiseCopyable` elements. Note that the optionality of the pointer does not affect usage of `Span`, since accesses are bounds-checked and the pointer is only dereferenced when the `Span` isn't empty, in which case the pointer cannot be `nil`.
+ It provides a buffer-like interface to the elements stored in that span of memory: ```swift @@ -110,7 +112,7 @@ extension Span where Element: ~Copyable & ~Escapable { } ``` -Note that we use a `_read` accessor for the subscript, a requirement in order to `yield` a borrowed non-copyable `Element` (see ["Coroutines"](#Coroutines).) This will be updated to a final syntax at a later time. +Note that we use a `_read` accessor for the subscript, a requirement in order to `yield` a borrowed non-copyable `Element` (see ["Coroutines"](#Coroutines).) This will be updated to a final syntax at a later time, understanding that we intend the replacement to be source-compatible. ##### Unchecked access to elements: From 3b127175f1d2e58d6178d97ed06c355dcbb4d67e Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Mon, 9 Sep 2024 15:13:02 -0700 Subject: [PATCH 68/73] improve coroutine explanation --- proposals/nnnn-span-access-shared-contiguous-storage.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 60bca3e4fd..3e56263b1f 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -468,9 +468,9 @@ extension RawSpan { We are subsetting these functions of `Span` and `RawSpan` until the lifetime annotations are proposed. -#### Coroutine Accessors +#### Coroutine or Projection Accessors -This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage, in particular storage containing non-copyable elements. When a stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` and `RawSpan` will be adapted to the new syntax. 
+This proposal includes some `_read` accessors, the coroutine version of the `get` accessor. `_read` accessors are not an official part of the Swift language, but are necessary for some types to be able to provide borrowing access to their internal storage, in particular storage containing non-copyable elements. The correct solution may involve a projection of a different type than is provided by a coroutine. When correct, stable replacement for `_read` accessors is proposed and accepted, the implementation of `Span` and `RawSpan` will be adapted to the new syntax. #### Extensions to Standard Library and Foundation types From 38c68403df677319403dc6944864bf5d1314a67f Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Fri, 6 Sep 2024 12:00:28 -0700 Subject: [PATCH 69/73] convert non-breaking spaces --- proposals/nnnn-span-access-shared-contiguous-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 3e56263b1f..559595d3f4 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -233,7 +233,7 @@ extension Span where Element: BitwiseCopyable { } ``` -These functions use a closure to define the scope of validity of `buffer`, ensuring that the underlying `Span` and the binding it depends on both remain valid through the end of the closure. They have the same shape as the equivalents on `Array` because they fulfill the same function, namely to keep the underlying binding alive. +These functions use a closure to define the scope of validity of `buffer`, ensuring that the underlying `Span` and the binding it depends on both remain valid through the end of the closure. They have the same shape as the equivalents on `Array` because they fulfill the same function, namely to keep the underlying binding alive. 
### RawSpan From 1279d7084a9548d1cf87ad525a3cafd8212fffad Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Tue, 10 Sep 2024 12:09:09 -0700 Subject: [PATCH 70/73] fix extensions --- .../nnnn-span-access-shared-contiguous-storage.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 559595d3f4..01e95398ba 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -75,7 +75,7 @@ We store a `UnsafeRawPointer` value internally in order to explicitly support re It provides a buffer-like interface to the elements stored in that span of memory: ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { public var count: Int { get } public var isEmpty: Bool { get } @@ -104,7 +104,7 @@ Initializers, required for library adoption, will be proposed alongside [lifetim The following properties, functions and subscripts have direct counterparts in the `Collection` protocol hierarchy. Their semantics shall be as described where they counterpart is declared (in `Collection` or `RandomAccessCollection`). ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { public var count: Int { get } public var isEmpty: Bool { get } @@ -119,7 +119,7 @@ Note that we use a `_read` accessor for the subscript, a requirement in order to The `subscript` mentioned above has always-on bounds checking of its parameter, in order to prevent out-of-bounds accesses. 
We also want to provide unchecked variants as an alternative for cases where bounds-checking is proving costly, such as in tight loops: ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { // Unchecked subscripting and extraction /// Accesses the element at the specified `position`. @@ -137,7 +137,7 @@ extension Span where Element: ~Copyable & ~Escapable { Every time `Span` uses a position parameter, it checks for its validity, unless the parameter is marked with the word "unchecked". The validation is performed with these functions: ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { /// Return true if `index` is a valid offset into this `Span` /// /// - Parameters: @@ -168,7 +168,7 @@ Note: these function names are not ideal. When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { /// Returns true if the memory represented by `span` is a subrange of /// the memory represented by `self` /// @@ -194,7 +194,7 @@ extension Span where Element: ~Copyable & ~Escapable { We provide two functions for interoperability with C or other legacy pointer-taking functions. ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { /// Calls a closure with a pointer to the viewed contiguous storage. 
/// /// The buffer pointer passed as an argument to `body` is valid only @@ -449,7 +449,7 @@ These annotations have been [pitched][PR-2305-pitch] and are expected to be form `Span`s representing subsets of consecutive elements could be extracted out of a larger `Span` with an API similar to the `extracting()` functions recently added to `UnsafeBufferPointer` in support of non-copyable elements: ```swift -extension Span where Element: ~Copyable & ~Escapable { +extension Span where Element: ~Copyable { public func extracting(_ bounds: Range) -> Self } ``` From e18eafe4aa97ecd64c8a3038ec10a3b7e68507b1 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 11 Sep 2024 15:53:53 -0700 Subject: [PATCH 71/73] [feedback] mention initializers earlier --- proposals/nnnn-span-access-shared-contiguous-storage.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 01e95398ba..999ba6b8ae 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -50,7 +50,9 @@ Even if the body of the `withUnsafeXXX` call does not escape the pointer, other `Span` is intended as the currency type for local processing over values in contiguous memory. It is a replacement for many API currently using `Array`, `UnsafeBufferPointer`, `Foundation.Data`, etc., that do not need to escape the owning container. -A `Span` provided by a container represents a borrow of that container. `Span` can therefore provide simultaneous access to a non-copyable container. It can also help avoid unwanted copies of copyable containers. 
Note that `Span` is not a replacement for a copyable container with owned storage; see [future directions](#Directions) for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes)) +A `Span` provided by a container represents a borrow of that container. `Span` can therefore provide simultaneous access to a non-copyable container. It can also help avoid unwanted copies of copyable containers. Note that `Span` is not a replacement for a copyable container with owned storage; see [future directions](#Directions) for more details ([Resizable, contiguously-stored, untyped collection in the standard library](#Bytes).) + +In this initial proposal, no initializers are proposed for `Span`. Initializers for non-escapable types such as `Span` require a concept of lifetime dependency, which does not exist at this time. The lifetime dependency annotation will indicate to the compiler how a newly-created `Span` can be used safely. See also ["Initializers"](#Initializers) in [future directions](#Directions). #### `RawSpan` From b9a166e6a3ef3f52a24a9cccd5bacd2755d5e9b5 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Wed, 11 Sep 2024 16:22:27 -0700 Subject: [PATCH 72/73] rename span comparison functions --- ...n-span-access-shared-contiguous-storage.md | 27 +++++++++++-------- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index 999ba6b8ae..f04ac8dca5 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -167,27 +167,30 @@ Note: these function names are not ideal. ##### Identifying whether a `Span` is a subrange of another: -When working with multiple `Span` instances, it is often desirable to know whether one is a subrange of another. 
We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: +When working with multiple `Span` instances, it is often desirable to know whether one is identical to or a subrange of another. We include functions to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span: ```swift extension Span where Element: ~Copyable { - /// Returns true if the memory represented by `span` is a subrange of - /// the memory represented by `self` + /// Returns true if the other span represents exactly the same memory + public func isIdentical(to span: borrowing Self) -> Bool + + /// Returns true if the memory represented by `self` is a subrange of + /// the memory represented by `span` /// /// Parameters: /// - span: a span of the same type as `self` - /// Returns: whether `span` is a subrange of `self` - public func contains(_ span: borrowing Self) -> Bool + /// Returns: whether `self` is a subrange of `span` + public func isWithin(_ span: borrowing Self) -> Bool - /// Returns the offsets where the memory of `span` is located within - /// the memory represented by `self` + /// Returns the offsets where the memory of `self` is located within + /// the memory represented by `span` /// /// Note: `span` must be a subrange of `self` /// /// Parameters: /// - span: a subrange of `self` /// Returns: A range of offsets within `self` - public func offsets(of span: borrowing Self) -> Range + public func indicesWithin(_ span: borrowing Self) -> Range } ``` @@ -400,13 +403,15 @@ extension RawSpan { ##### Identifying whether a `RawSpan` is a subrange of another: -When working with multiple `RawSpan` instances, it is often desirable to know whether one is a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span. 
The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: +When working with multiple `RawSpan` instances, it is often desirable to know whether one is identical to or a subrange of another. We include a function to determine whether this is the case, as well as a function to obtain the valid offsets of the subrange within the larger span. The documentation is omitted here, as it is substantially the same as for the equivalent functions on `Span`: ```swift extension RawSpan { - public func contains(_ span: borrowing Self) -> Bool + public func isIdentical(to span: borrowing Self) -> Bool + + public func isWithin(_ span: borrowing Self) -> Bool - public func offsets(of span: borrowing Self) -> Range + public func byteOffsetsWithin(_ span: borrowing Self) -> Range } ``` From 25aca7e8d3a96f03ee2bfd44b00669c58832e731 Mon Sep 17 00:00:00 2001 From: Guillaume Lessard Date: Thu, 12 Sep 2024 17:58:12 -0700 Subject: [PATCH 73/73] fix span comparison signatures and documentation --- .../nnnn-span-access-shared-contiguous-storage.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/proposals/nnnn-span-access-shared-contiguous-storage.md b/proposals/nnnn-span-access-shared-contiguous-storage.md index f04ac8dca5..e547b07941 100644 --- a/proposals/nnnn-span-access-shared-contiguous-storage.md +++ b/proposals/nnnn-span-access-shared-contiguous-storage.md @@ -183,14 +183,12 @@ extension Span where Element: ~Copyable { public func isWithin(_ span: borrowing Self) -> Bool /// Returns the offsets where the memory of `self` is located within - /// the memory represented by `span` - /// - /// Note: `span` must be a subrange of `self` + /// the memory represented by `span`, or `nil` /// /// Parameters: /// - span: a subrange of `self` - /// Returns: A range of offsets within `self` - public func indicesWithin(_ span: borrowing Self) -> Range + /// Returns: A range of offsets within `self`, or `nil` + public 
func indicesWithin(_ span: borrowing Self) -> Range? } ``` @@ -411,7 +409,7 @@ extension RawSpan { public func isWithin(_ span: borrowing Self) -> Bool - public func byteOffsetsWithin(_ span: borrowing Self) -> Range + public func byteOffsetsWithin(_ span: borrowing Self) -> Range? } ```
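
Reviewer's note on patches 72 and 73: the renamed comparison functions invert the receiver and the argument (`self` is now the candidate subrange, `span` the enclosing memory), and the offsets function now returns an optional. The following is a minimal sketch of the semantics the doc comments appear to describe. It does not use the proposed `Span` type: `ModelSpan`, `start` and `count` are hypothetical names that stand in for a span's address range, with addresses modelled as plain integers and an element stride of 1 assumed.

```swift
// Hypothetical stand-in for `Span`: models only the address range it covers.
// `ModelSpan`, `start` and `count` are illustrative and not part of the proposal.
struct ModelSpan {
  var start: Int   // address of the first element, as a plain integer
  var count: Int   // number of elements (an element stride of 1 is assumed)

  /// True if both spans cover exactly the same memory.
  func isIdentical(to span: ModelSpan) -> Bool {
    start == span.start && count == span.count
  }

  /// True if the memory of `self` lies entirely inside the memory of `span`.
  func isWithin(_ span: ModelSpan) -> Bool {
    start >= span.start && start + count <= span.start + span.count
  }

  /// Offsets of `self` inside `span`, or `nil` if `self` is not within `span`.
  func indicesWithin(_ span: ModelSpan) -> Range<Int>? {
    guard isWithin(span) else { return nil }
    let lower = start - span.start
    return lower..<(lower + count)
  }
}

let whole = ModelSpan(start: 1000, count: 16)
let part = ModelSpan(start: 1004, count: 4)

assert(part.isWithin(whole))
assert(part.indicesWithin(whole) == 4..<8)
assert(whole.indicesWithin(part) == nil)
assert(whole.isIdentical(to: whole))
```

Under this reading, the returned range indexes into `span` (the argument) rather than into `self`, which may be worth reflecting in the `Parameters` and `Returns` doc comments that still describe `span` as "a subrange of `self`".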