[Swift] Pointers: UnsafePointer

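This article is a set of notes compiled while studying "Swift中的指针操作详解" (a detailed guide to pointer operations in Swift).

By default Swift is memory safe, and Apple discourages manipulating memory directly. Swift still provides ways to operate on memory through pointers. Manipulating memory directly is dangerous and error-prone, which is why the official documentation labels these capabilities "unsafe" features.

Before working with pointers, you need to understand three concepts: size, alignment, and stride, and how to obtain and use them. This is where MemoryLayout comes in: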

MemoryLayout

MemoryLayout reports a type's actual size (size), its memory alignment (alignment), and the memory it actually occupies (stride), all measured in bytes:

public enum MemoryLayout<T> {

    public static var size: Int { get }

    public static var stride: Int { get }

    public static var alignment: Int { get }

    public static func size(ofValue value: T) -> Int

    public static func stride(ofValue value: T) -> Int

    public static func alignment(ofValue value: T) -> Int
}

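For example, if a type has a size of 5 bytes and an alignment of 4 bytes, the memory it actually occupies (its stride) is 8 bytes. The compiler pads it with empty bytes so that it satisfies its 4-byte alignment boundary.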
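
The 5/4/8 case can be reproduced with a small struct. This is only a sketch: PaddedExample and the constant names are hypothetical and were made up for illustration.

```swift
// Hypothetical type for illustration: a 4-byte field followed by a 1-byte field.
struct PaddedExample {
    var value: UInt32   // 4 bytes, 4-byte alignment
    var tag: UInt8      // 1 byte
}

let paddedSize = MemoryLayout<PaddedExample>.size           // 5
let paddedAlignment = MemoryLayout<PaddedExample>.alignment // 4
let paddedStride = MemoryLayout<PaddedExample>.stride       // 8
print(paddedSize, paddedAlignment, paddedStride)
```

The three trailing padding bytes belong to the stride, not to the size.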

Size, alignment, and stride of common basic types:

MemoryLayout<Int>.size // return 8 (on 64-bit)
MemoryLayout<Int>.alignment // return 8 (on 64-bit)
MemoryLayout<Int>.stride // return 8 (on 64-bit)

MemoryLayout<Int16>.size // return 2
MemoryLayout<Int16>.alignment // return 2
MemoryLayout<Int16>.stride // return 2

MemoryLayout<Bool>.size // return 1
MemoryLayout<Bool>.alignment // return 1
MemoryLayout<Bool>.stride // return 1

MemoryLayout<Float>.size // return 4
MemoryLayout<Float>.alignment // return 4
MemoryLayout<Float>.stride // return 4

MemoryLayout<Double>.size // return 8
MemoryLayout<Double>.alignment // return 8
MemoryLayout<Double>.stride // return 8

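The original article listed 2 for the Bool values. Testing in a Playground gives 1, so the values here have been changed to 1.

When moving a pointer for a given type, the pointer generally advances one stride at a time, and it must stay within the allocated memory. Never move past the end of the allocated memory.
In general, stride is an integer multiple of alignment, which is what memory alignment requires. The memory actually allocated is also a multiple of alignment, so an instance's actual size may be smaller than the space allocated for it.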
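
The relationship between size, alignment, and stride can be sketched as "round the size up to the next multiple of the alignment". The helper roundedStride below is hypothetical (it is not a standard library function) and only illustrates the rule.

```swift
// Hypothetical helper: rounds `size` up to the next multiple of `alignment`.
// The result is at least 1, because Swift reports a stride of 1 even for
// zero-sized types.
func roundedStride(size: Int, alignment: Int) -> Int {
    let rounded = (size + alignment - 1) / alignment * alignment
    return max(1, rounded)
}

// Compare the helper with what MemoryLayout reports for a few basic types.
let strideMatchesForInt16 =
    roundedStride(size: MemoryLayout<Int16>.size, alignment: MemoryLayout<Int16>.alignment)
        == MemoryLayout<Int16>.stride
let strideMatchesForDouble =
    roundedStride(size: MemoryLayout<Double>.size, alignment: MemoryLayout<Double>.alignment)
        == MemoryLayout<Double>.stride
print(strideMatchesForInt16, strideMatchesForDouble)
```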

UnsafePointer

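Every pointer type has Unsafe in its name. Once you operate on memory directly, the compiler no longer checks those operations, and you take full responsibility for your code. Swift defines several specific pointer types, each with its own role and purpose. Using the appropriate pointer type helps prevent mistakes, expresses the developer's intent more clearly, and avoids undefined behavior.

The name of a pointer type tells us what kind of pointer it is: mutable or immutable, raw or typed, and whether it is a buffer type. There are roughly the following 8 types: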

Pointer Name                    Unsafe?  Write Access?  Collection  Strideable?  Typed?
UnsafeMutablePointer<T>         yes      yes            no          yes          yes
UnsafePointer<T>                yes      no             no          yes          yes
UnsafeMutableBufferPointer<T>   yes      yes            yes         no           yes
UnsafeBufferPointer<T>          yes      no             yes         no           yes
UnsafeRawPointer                yes      no             no          yes          no
UnsafeMutableRawPointer         yes      yes            no          yes          no
UnsafeMutableRawBufferPointer   yes      yes            yes         no           no
UnsafeRawBufferPointer          yes      no             yes         no           no
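  • Unsafe: not memory safe
  • Write Access: the memory can be written through the pointer
  • Collection: behaves like a collection of elements that can be iterated and subscripted
  • Strideable: the pointer can be moved with the advanced function
  • Typed: whether a type has to be specified (a generic parameter)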

Raw Pointers

// 1
let count = 2
let stride = MemoryLayout<Int>.stride
let aligment = MemoryLayout<Int>.alignment
let byteCount = stride * count

// 2
do {
    print("raw pointers")
    // 3
    let pointer = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: aligment)
    
    // 4
    defer {
        pointer.deallocate()
        
    }
    
    // 5
    pointer.storeBytes(of: 42, as: Int.self)
    pointer.advanced(by: stride).storeBytes(of: 6, as: Int.self)
    // Read the first value
    print(pointer.load(as: Int.self)) // 42
    // Read the second value
    print(pointer.advanced(by: stride).load(as: Int.self)) // 6
    
    // 6
    let bufferPointer = UnsafeRawBufferPointer(start: pointer, count: byteCount)
    for (index, byte) in bufferPointer.enumerated() {
        print("byte \(index): \(byte)")
    }
}

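Code walkthrough:

    1. Declare the parameters used in the example
      count: the number of integers
      stride: the stride of an integer
      aligment: the memory alignment of an integer
      byteCount: the number of bytes actually needed
    2. Declare a scope
      A do block adds a scope so that the variable names inside it can be reused in the examples that follow.
    3. UnsafeMutableRawPointer.allocate allocates the required number of bytes. The resulting pointer can be used to read and store (mutate) raw bytes.

Memory is allocated here with the allocate method: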

public static func allocate(byteCount: Int, alignment: Int) -> UnsafeMutableRawPointer
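  • byteCount: the number of bytes required
  • alignment: the memory alignment
    4. Deferred deallocation
      A defer block guarantees that the memory is released. When working with pointers, all memory has to be managed manually.
      Memory is released here with the deallocate method: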
 public func deallocate()

Every call to allocate must be paired with a call to deallocate.

    5. Use storeBytes and load to store and read bytes

The storeBytes method stores data:

/// - Parameters:
///   - value: The value to store as raw bytes.
///   - offset: The offset from this pointer, in bytes. `offset` must be
///     nonnegative. The default is zero.
///   - type: The type of `value`.
public func storeBytes<T>(of value: T, toByteOffset offset: Int = default, as type: T.Type)
  • value: the value to store
  • offset: the byte offset; the default of zero is usually fine
  • type: the type of the value

The load method reads data:

/// - Parameters:
    ///   - offset: The offset from this pointer, in bytes. `offset` must be
    ///     nonnegative. The default is zero.
    ///   - type: The type of the instance to create.
    /// - Returns: A new instance of type `T`, read from the raw bytes at
    ///   `offset`. The returned instance is memory-managed and unassociated
    ///   with the value in the memory referenced by this pointer.
    public func load<T>(fromByteOffset offset: Int = default, as type: T.Type) -> T
  • offset: the byte offset; the default of zero is usually fine
  • type: the type of the value
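
The offset parameters of storeBytes and load can stand in for moving the pointer first. A minimal sketch, assuming a hypothetical helper named storeAndLoadWithOffsets that wraps the allocation so defer can free the memory:

```swift
// Hypothetical helper: stores two Int values in raw memory and reads them back,
// using byte offsets instead of advancing the pointer.
func storeAndLoadWithOffsets() -> (first: Int, second: Int) {
    let slotStride = MemoryLayout<Int>.stride
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: slotStride * 2,
        alignment: MemoryLayout<Int>.alignment
    )
    defer { raw.deallocate() }

    raw.storeBytes(of: 42, as: Int.self)                         // offset 0
    raw.storeBytes(of: 6, toByteOffset: slotStride, as: Int.self) // one stride in

    let first = raw.load(as: Int.self)
    let second = raw.load(fromByteOffset: slotStride, as: Int.self)
    return (first, second)
}

let loaded = storeAndLoadWithOffsets()
print(loaded.first, loaded.second) // 42 6
```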

The advanced method moves the pointer address:

/// - Parameter n: The number of bytes to offset this pointer. `n` may be
    ///   positive, negative, or zero.
    /// - Returns: A pointer offset from this pointer by `n` bytes.
    public func advanced(by n: Int) -> UnsafeMutableRawPointer
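  • n: the number of bytes to move (one stride per value here)

With a raw pointer, storing the next value means moving the pointer forward by one stride. The + operator can be used directly as well: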

(pointer + stride).storeBytes(of: 6, as: Int.self)
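    6. UnsafeRawBufferPointer reads memory as a sequence of bytes. This means we can iterate over the bytes, subscript them, or use handy methods such as filter, map, and reduce. The buffer pointer is initialized from the raw pointer.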
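
The collection-style access described in the last step might look like the sketch below. The helper summarizeRawBytes and the stored values are hypothetical; 0x01010101 was chosen because its bytes look the same on little-endian and big-endian machines.

```swift
// Hypothetical helper: stores two UInt32 values and inspects the raw bytes
// through an UnsafeRawBufferPointer.
func summarizeRawBytes() -> (count: Int, first: UInt8, total: Int, nonZero: Int) {
    let byteCount = MemoryLayout<UInt32>.stride * 2
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<UInt32>.alignment
    )
    defer { raw.deallocate() }

    raw.storeBytes(of: 0x0101_0101, as: UInt32.self)
    raw.storeBytes(of: 0, toByteOffset: MemoryLayout<UInt32>.stride, as: UInt32.self)

    let bytes = UnsafeRawBufferPointer(start: raw, count: byteCount)
    let first = bytes[0]                          // subscript access
    let asInts = bytes.map { Int($0) }            // map over the bytes
    let total = asInts.reduce(0, +)               // reduce to a sum
    let nonZero = bytes.filter { $0 != 0 }.count  // filter the bytes
    return (bytes.count, first, total, nonZero)
}

let summary = summarizeRawBytes()
print(summary)
```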

Typed Pointers

do {
    print("Typed pointers")
    // 1.
    let pointer = UnsafeMutablePointer<Int>.allocate(capacity: count)
    pointer.initialize(repeating: 0, count: count)

    // 2.
    defer {
        pointer.deinitialize(count: count)
        pointer.deallocate()
    }
    // 3.
    pointer.pointee = 42
    pointer.advanced(by: 1).pointee = 6
    pointer.pointee
    pointer.advanced(by: 1).pointee
    
    let bufferPointer = UnsafeBufferPointer(start: pointer, count: count)
    for (index, value) in bufferPointer.enumerated() {
        print("value \(index): \(value)")
    }
    
}

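The differences between typed pointers and raw pointers show up mainly at the numbered places above:

    1. Allocation and initialization

With a typed pointer, the generic parameter specifies the type of data the pointer operates on when memory is allocated: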

/// - Parameter count: The amount of memory to allocate, counted in instances
    ///   of `Pointee`.
    public static func allocate(capacity count: Int) -> UnsafeMutablePointer<Pointee>
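  • count: the number of values to store

The allocation method takes a single parameter, the number of values to store. Because the generic parameter already fixes the type of the stored data, its alignment and stride are known, so only the number of values is still needed.

There is also an extra initialization step here. Allocating memory alone does not make a typed pointer usable; the memory must be initialized too: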

/// - Parameters:
    ///   - repeatedValue: The instance to initialize this pointer's memory with.
    ///   - count: The number of consecutive copies of `newValue` to initialize.
    ///     `count` must not be negative. 
    public func initialize(repeating repeatedValue: Pointee, count: Int)
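  • repeatedValue: the value each instance is initialized with
  • count: the number of instances
    2. Deferred release

When releasing, first deinitialize the initialized instances (deinitialize), then free the allocated memory (deallocate):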

/// - Parameter count: The number of instances to deinitialize. `count` must
    ///   not be negative. 
    /// - Returns: A raw pointer to the same address as this pointer. The memory
    ///   referenced by the returned raw pointer is still bound to `Pointee`.
    public func deinitialize(count: Int) -> UnsafeMutableRawPointer
  • count: the number of instances
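
The initialize/deinitialize pairing matters most for types that own resources, such as String. The following is a minimal sketch; the helper makeGreetings and its values are hypothetical.

```swift
// Hypothetical helper: manages two String instances through a typed pointer.
func makeGreetings() -> [String] {
    let capacity = 2
    let names = UnsafeMutablePointer<String>.allocate(capacity: capacity)
    // The allocated memory is uninitialized; initialize it before any use.
    names.initialize(repeating: "", count: capacity)
    defer {
        // Destroy the String instances first, then free the memory.
        names.deinitialize(count: capacity)
        names.deallocate()
    }

    names.pointee = "hello"
    (names + 1).pointee = "world"

    // Copy the values out before the deferred cleanup runs.
    return Array(UnsafeBufferPointer(start: names, count: capacity))
}

let greetings = makeGreetings()
print(greetings)
```

The array returned here is an independent copy, so it stays valid after the pointer's memory is freed.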
    3. Storing and reading
      Typed pointers do not need storeBytes/load to store and read values. Swift provides pointee, which reads and stores values in a type-safe way:
public var pointee: Pointee { get nonmutating set }

The pointer is moved with advanced here as well, but the parameter has a different meaning:

/// - Parameter n: The number of strides of the pointer's `Pointee` type to
    ///   offset this pointer. To access the stride, use
    ///   `MemoryLayout<Pointee>.stride`. `n` may be positive, negative, or
    ///   zero.
    /// - Returns: A pointer offset from this pointer by `n` instances of the
    ///   `Pointee` type.
    public func advanced(by n: Int) -> UnsafeMutablePointer<Pointee>
  • n: the pointer moves by a number of instances of the type, not by bytes

Here too, the + operator can be used to move the pointer:

(pointer + 1).pointee = 6
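
Typed pointers also support subscripting, which reads a little more naturally than advanced or +. A short sketch, with the hypothetical helper typedPointerSubscripts:

```swift
// Hypothetical helper: writes and reads Int values through pointer subscripts.
func typedPointerSubscripts() -> [Int] {
    let capacity = 3
    let pointer = UnsafeMutablePointer<Int>.allocate(capacity: capacity)
    pointer.initialize(repeating: 0, count: capacity)
    defer {
        pointer.deinitialize(count: capacity)
        pointer.deallocate()
    }

    pointer[0] = 42                              // same as pointer.pointee = 42
    pointer[1] = 6                               // same as (pointer + 1).pointee = 6
    (pointer + 2).pointee = pointer[0] + pointer[1]

    return (0..<capacity).map { pointer[$0] }
}

let subscriptValues = typedPointerSubscripts()
print(subscriptValues) // [42, 6, 48]
```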

Converting Raw Pointers to Typed Pointers

do {
    print("Converting raw pointers to typed pointers")
    // Create the raw pointer
    let rawPointer = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: aligment)

    // Defer deallocation of the raw pointer's memory
    defer {
        rawPointer.deallocate()
    }
    // Bind the raw pointer's memory to a type
    let typePointer = rawPointer.bindMemory(to: Int.self, capacity: count)
    typePointer.initialize(repeating: 0, count: count)
    defer {
        typePointer.deinitialize(count: count)
    }
    
    typePointer.pointee = 42
    typePointer.advanced(by: 1).pointee = 9
    typePointer.pointee
    typePointer.advanced(by: 1).pointee
    
    let bufferPointer = UnsafeBufferPointer(start: typePointer, count: count)
    for (index, value) in bufferPointer.enumerated() {
        print("value \(index): \(value)")
    }
}

A raw pointer is converted to a typed pointer by binding its memory to a specific type:

/// - Parameters:
    ///   - type: The type `T` to bind the memory to.
    ///   - count: The amount of memory to bind to type `T`, counted as instances
    ///     of `T`.
    /// - Returns: A typed pointer to the newly bound memory. The memory in this
    ///   region is bound to `T`, but has not been modified in any other way.
    ///   The number of bytes in this region is
    ///   `count * MemoryLayout<T>.stride`.
    public func bindMemory<T>(to type: T.Type, capacity count: Int) -> UnsafeMutablePointer<T>
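  • type: the data type
  • count: the capacity, counted in instances

Once memory is bound, we can access it in a type-safe way. When we create a typed pointer directly, the memory is bound for us automatically.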

Getting the Bytes of an Instance

A struct named Sample serves as the example here:

struct Sample {
    
    var number: Int
    var flag: Bool
    
    init(number: Int, flag: Bool) {
        self.number = number
        self.flag = flag
    }
}

do {
    print("Getting the bytes of an instance")
    
    var sample = Sample(number: 25, flag: true)
    // 1.
    withUnsafeBytes(of: &sample) { (rs) in
        
        for byte in rs {
            print(byte)
        }
    }
}

This mainly relies on the withUnsafeBytes function to access the bytes of the instance:

/// - Parameters:
///   - arg: An instance to temporarily access through a raw buffer pointer.
///   - body: A closure that takes a raw buffer pointer to the bytes of `arg`
///     as its sole argument. If the closure has a return value, that value is
///     also used as the return value of the `withUnsafeBytes(of:_:)`
///     function. The buffer pointer argument is valid only for the duration
///     of the closure's execution.
/// - Returns: The return value, if any, of the `body` closure.
public func withUnsafeBytes<T, Result>(of arg: inout T, _ body: (UnsafeRawBufferPointer) throws -> Result) rethrows -> Result
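  • arg: the instance, passed inout with &
  • body: the callback closure, whose parameter is a pointer of type UnsafeRawBufferPointer

Note: both the function and the closure have return values. If the closure returns a value, that value becomes the return value of the function. However, never return body's parameter, the UnsafeRawBufferPointer, from the closure. That pointer is valid only inside the closure.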

withUnsafeBytes is also available on Array and Data instances.
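
For Array and Data, withUnsafeBytes is a method on the instance. The sketch below uses hypothetical sample values, and Data requires importing Foundation.

```swift
import Foundation

// Array: three UInt16 elements occupy 3 * 2 = 6 bytes.
let shortNumbers: [UInt16] = [1, 2, 3]
let arrayByteCount = shortNumbers.withUnsafeBytes { $0.count }

// Data: sum the bytes; the closure parameter is an UnsafeRawBufferPointer.
let payload = Data([1, 2, 3, 4])
let payloadSum = payload.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) -> Int in
    bytes.reduce(0) { $0 + Int($1) }
}

print(arrayByteCount, payloadSum) // 6 10
```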

Rules for Using Pointers

Do not return the pointer from withUnsafeBytes

Never let the pointer escape the scope of withUnsafeBytes(of:). Code that does so is a time bomb: you never know when it will work and when it will crash.
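
When the bytes are needed after the closure ends, one safe option is to copy them into an array and return the copy instead of the pointer. A minimal sketch, assuming a hypothetical struct SampleValue that mirrors the Sample struct above:

```swift
// Hypothetical stand-in with the same fields as Sample.
struct SampleValue {
    var number: Int
    var flag: Bool
}

var sampleValue = SampleValue(number: 25, flag: true)

// Array($0) copies the bytes, so nothing that refers to the temporary
// buffer leaves the closure.
let copiedBytes: [UInt8] = withUnsafeBytes(of: &sampleValue) { Array($0) }
let byteSum = copiedBytes.reduce(0) { $0 + Int($1) }

print(copiedBytes.count, byteSum)
```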

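Bind to only one type at a time

When bindMemory binds a raw pointer's memory to a type and turns it into a typed pointer, the memory can be bound to only one type at a time. For example, memory bound to Int through a raw pointer must not be bound to Bool as well: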

let typePointer = rawPointer.bindMemory(to: Int.self, capacity: count)
// Never do this
let typePointer1 = rawPointer.bindMemory(to: Bool.self, capacity: count)

However, withMemoryRebound can rebind the memory temporarily. This rule also implies that memory of a basic type (such as Int) should not be rebound to a custom type (such as a class).

/// - Parameters:
    ///   - type: The type to temporarily bind the memory referenced by this
    ///     pointer. The type `T` must be the same size and be layout compatible
    ///     with the pointer's `Pointee` type.
    ///   - count: The number of instances of `T` to bind to `type`.
    ///   - body: A closure that takes a mutable typed pointer to the
    ///     same memory as this pointer, only bound to type `T`. The closure's
    ///     pointer argument is valid only for the duration of the closure's
    ///     execution. If `body` has a return value, that value is also used as
    ///     the return value for the `withMemoryRebound(to:capacity:_:)` method.
    /// - Returns: The return value, if any, of the `body` closure parameter.
    public func withMemoryRebound<T, Result>(to type: T.Type, capacity count: Int, _ body: (UnsafeMutablePointer<T>) throws -> Result) rethrows -> Result
  • type: the type of the values
  • count: the number of values
  • body: the callback closure, whose parameter is a pointer of type UnsafeMutablePointer<T>

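Note: both the method and the closure have return values. If the closure returns a value, that value becomes the return value of the method. However, never return body's parameter, the UnsafeMutablePointer<T>, from the closure. That pointer is valid only inside the closure.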
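
A temporary rebinding between two layout-compatible types of the same size, Int and UInt, could look like this. The helper reinterpretAsUnsigned is hypothetical.

```swift
// Hypothetical helper: reads the bits of an Int as a UInt through a temporary
// rebinding. Int and UInt have the same size and layout.
func reinterpretAsUnsigned(_ value: Int) -> UInt {
    let pointer = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    pointer.initialize(to: value)
    defer {
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }

    // The rebound pointer is valid only inside the closure, so the closure
    // returns the value it read, not the pointer.
    return pointer.withMemoryRebound(to: UInt.self, capacity: 1) { $0.pointee }
}

let reinterpreted = reinterpretAsUnsigned(-1)
print(reinterpreted == UInt.max) // true
```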

Do not access memory out of bounds

do {
    
    let count = 3
    let stride = MemoryLayout<Int16>.stride
    let alignment = MemoryLayout<Int16>.alignment
    let byteCount = count * stride
    
    let pointer = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: alignment)
    defer {
        pointer.deallocate()
    }
    // 1. byteCount + 1 exceeds the memory allocated for pointer. Never do this.
    let bufferPointer = UnsafeRawBufferPointer(start: pointer, count: byteCount + 1)
    
    for byte in bufferPointer {
        print(byte)
    }
}

Here byteCount + 1 reaches one byte past the memory allocated for pointer. Reading that byte is undefined behavior, so never let this happen.
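
For contrast, here is a sketch of an in-bounds version: the buffer covers exactly the allocated bytes, and the memory is written before it is read. The helper readAllocatedBytes and the stored values are hypothetical.

```swift
// Hypothetical helper: writes three Int16 values and reads back exactly the
// bytes that were allocated.
func readAllocatedBytes() -> [UInt8] {
    let elementCount = 3
    let elementStride = MemoryLayout<Int16>.stride
    let byteCount = elementCount * elementStride

    let pointer = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<Int16>.alignment
    )
    defer { pointer.deallocate() }

    // Store 0x0101, 0x0202, 0x0303 so the bytes do not depend on endianness.
    for index in 0..<elementCount {
        let value = Int16(index + 1) * 0x0101
        pointer.storeBytes(of: value, toByteOffset: index * elementStride, as: Int16.self)
    }

    // The buffer's count is byteCount, never more.
    let buffer = UnsafeRawBufferPointer(start: pointer, count: byteCount)
    return Array(buffer)
}

let allocatedBytes = readAllocatedBytes()
print(allocatedBytes) // [1, 1, 2, 2, 3, 3]
```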
