Strings and Characters

A string is a series of characters, such as "hello, world" or "albatross". Swift strings are represented by the String type. The contents of a String can be accessed in various ways, including as a collection of Character values.

Swift’s String and Character types provide a fast, Unicode-compliant way to work with text in your code. The syntax for string creation and manipulation is lightweight and readable, with a string literal syntax that is similar to C. String concatenation is as simple as combining two strings with the + operator, and string mutability is managed by choosing between a constant or a variable, just like any other value in Swift. You can also use strings to insert constants, variables, literals, and expressions into longer strings, in a process known as string interpolation. This makes it easy to create custom string values for display, storage, and printing.

Despite this simplicity of syntax, Swift’s String type is a fast, modern string implementation. Every string is composed of encoding-independent Unicode characters, and provides support for accessing those characters in various Unicode representations.

Note

Swift’s String type is bridged with Foundation’s NSString class. Foundation also extends String to expose methods defined by NSString. This means that if you import Foundation, you can access those NSString methods on String without casting. For more information about using String with Foundation and Cocoa, see Working with Cocoa Data Types in Using Swift with Cocoa and Objective-C (Swift 3.0.1).

String Literals

You can include predefined String values within your code as string literals. A string literal is a fixed sequence of textual characters surrounded by a pair of double quotes (""). Use a string literal as an initial value for a constant or variable:

let someString = "Some string literal value"

Note that Swift infers a type of String for the someString constant, because it is initialized with a string literal value.

Note

For information about using special characters in string literals, see Special Characters in String Literals.

Initializing an Empty String

To create an empty String value as the starting point for building a longer string, either assign an empty string literal to a variable, or initialize a new String instance with initializer syntax:

var emptyString = ""               // empty string literal
var anotherEmptyString = String()  // initializer syntax
// these two strings are both empty, and are equivalent to each other

Find out whether a String value is empty by checking its Boolean isEmpty property:

if emptyString.isEmpty {
    print("Nothing to see here")
}
// Prints "Nothing to see here"

String Mutability

You indicate whether a particular String can be modified (or mutated) by assigning it to a variable (in which case it can be modified), or to a constant (in which case it cannot be modified):

var variableString = "Horse"
variableString += " and carriage"
// variableString is now "Horse and carriage"

let constantString = "Highlander"
constantString += " and another Highlander"
// this reports a compile-time error - a constant string cannot be modified

Note

This approach is different from string mutation in Objective-C and Cocoa, where you choose between two classes (NSString and NSMutableString) to indicate whether a string can be mutated.

Strings Are Value Types

Swift’s String type is a value type. If you create a new String value, that String value is copied when it is passed to a function or method, or when it is assigned to a constant or variable. In each case, a new copy of the existing String value is created, and the new copy is passed or assigned, not the original version. Value types are described in Structures and Enumerations Are Value Types.

Swift’s copy-by-default String behavior ensures that when a function or method passes you a String value, it is clear that you own that exact String value, regardless of where it came from. You can be confident that the string you are passed will not be modified unless you modify it yourself.

Behind the scenes, Swift’s compiler optimizes string usage so that actual copying takes place only when absolutely necessary. This means that treating strings as value types does not by itself cost you performance.
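The copy-on-assignment behavior described above can be seen in a short sketch. The names original, copy, and shout are illustrative only and do not come from the original text.

```swift
// Hypothetical names, used only to illustrate value semantics.
let original = "Horse"
var copy = original          // assignment yields an independent value
copy += " and carriage"      // mutating the copy leaves the original untouched

// A function receives its own value and cannot change the caller's string.
func shout(_ text: String) -> String {
    var local = text         // parameters are constants, so mutate a local copy
    local += "!"
    return local
}
let shouted = shout(original)

print(original)  // Horse
print(copy)      // Horse and carriage
print(shouted)   // Horse!
```

Because each binding holds its own value, no defensive copying is needed before handing a string to other code.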

Working with Characters

You can access the individual Character values for a String by iterating over its characters property with a for-in loop:

for character in "Dog!🐶".characters {
    print(character)
}
// D
// o
// g
// !
// 🐶

The for-in loop is described in For-In Loops.

Alternatively, you can create a stand-alone Character constant or variable from a single-character string literal by providing a Character type annotation:

let exclamationMark: Character = "!"

String values can be constructed by passing an array of Character values as an argument to its initializer:

let catCharacters: [Character] = ["C", "a", "t", "!", "🐱"]
let catString = String(catCharacters)
print(catString)
// Prints "Cat!🐱"

Concatenating Strings and Characters

String values can be added together (or concatenated) with the addition operator (+) to create a new String value:

let string1 = "hello"
let string2 = " there"
var welcome = string1 + string2
// welcome now equals "hello there"

You can also append a String value to an existing String variable with the addition assignment operator (+=):

var instruction = "look over"
instruction += string2
// instruction now equals "look over there"

You can append a Character value to a String variable with the String type’s append() method:

let exclamationMark: Character = "!"
welcome.append(exclamationMark)
// welcome now equals "hello there!"

Note

You can’t append a String or Character to an existing Character variable, because a Character value must contain a single character only.

String Interpolation

String interpolation is a way to construct a new String value from a mix of constants, variables, literals, and expressions by including their values inside a string literal. Each item that you insert into the string literal is wrapped in a pair of parentheses, prefixed by a backslash:

let multiplier = 3
let message = "\(multiplier) times 2.5 is \(Double(multiplier) * 2.5)"
// message is "3 times 2.5 is 7.5"

In the example above, the value of multiplier is inserted into a string literal as \(multiplier). This placeholder is replaced with the actual value of multiplier when the string interpolation is evaluated to create an actual string.

The value of multiplier is also part of a larger expression later in the string. This expression calculates the value of Double(multiplier) * 2.5 and inserts the result (7.5) into the string. In this case, the expression is written as \(Double(multiplier) * 2.5) when it is included inside the string literal.

Note

The expressions you write inside parentheses within an interpolated string cannot contain an unescaped backslash (\), a carriage return, or a line feed. However, they can contain other string literals.

Unicode

Unicode is an international standard for encoding, representing, and processing text in different writing systems. It enables you to represent almost any character from any language in a standardized form, and to read and write those characters to and from an external source such as a text file or web page. Swift’s String and Character types are fully Unicode-compliant, as described in this section.

Unicode Scalars

Behind the scenes, Swift’s native String type is built from Unicode scalar values. A Unicode scalar is a unique 21-bit number for a character or modifier, such as U+0061 for LATIN SMALL LETTER A ("a"), or U+1F425 for FRONT-FACING BABY CHICK ("🐥").

Note

A Unicode scalar is any Unicode code point in the range U+0000 to U+D7FF inclusive or U+E000 to U+10FFFF inclusive. Unicode scalars do not include the Unicode surrogate pair code points, which are the code points in the range U+D800 to U+DFFF inclusive.

Note that not all 21-bit Unicode scalars are assigned to a character; some scalars are reserved for future assignment. Scalars that have been assigned to a character typically also have a name, such as LATIN SMALL LETTER A and FRONT-FACING BABY CHICK in the examples above.
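The scalar ranges given above can be checked with the failable UnicodeScalar initializer that takes a UInt32. This is a minimal sketch; the constant names are illustrative.

```swift
// UnicodeScalar's UInt32 initializer is failable: it returns nil for
// values that are not valid Unicode scalars.
let smallA = UnicodeScalar(UInt32(0x0061))         // LATIN SMALL LETTER A
let babyChick = UnicodeScalar(UInt32(0x1F425))     // FRONT-FACING BABY CHICK
let highSurrogate = UnicodeScalar(UInt32(0xD800))  // surrogate code point
let beyondRange = UnicodeScalar(UInt32(0x110000))  // above U+10FFFF

print(smallA != nil)         // true
print(babyChick != nil)      // true
print(highSurrogate == nil)  // true
print(beyondRange == nil)    // true
```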

Special Characters in String Literals

String literals can include the following special characters:

The escaped special characters \0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quote) and \' (single quote)

An arbitrary Unicode scalar, written as \u{n} (with a lowercase u), where n is a 1–8 digit hexadecimal number with a value equal to a valid Unicode code point

The code below shows four examples of these special characters. The wiseWords constant contains two escaped double quote characters. The dollarSign, blackHeart, and sparklingHeart constants demonstrate the Unicode scalar format:

let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"
// "Imagination is more important than knowledge" - Einstein
let dollarSign = "\u{24}"        // $,  Unicode scalar U+0024
let blackHeart = "\u{2665}"      // ♥,  Unicode scalar U+2665
let sparklingHeart = "\u{1F496}" // 💖, Unicode scalar U+1F496

Extended Grapheme Clusters

Every instance of Swift’s Character type represents a single extended grapheme cluster. An extended grapheme cluster is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character.

Here’s an example. The letter é can be represented as the single Unicode scalar é (LATIN SMALL LETTER E WITH ACUTE, or U+00E9). However, the same letter can also be represented as a pair of scalars: a standard letter e (LATIN SMALL LETTER E, or U+0065), followed by the COMBINING ACUTE ACCENT scalar (U+0301). The COMBINING ACUTE ACCENT scalar is graphically applied to the scalar that precedes it, turning an e into an é when it is rendered by a Unicode-aware text-rendering system.

In both cases, the letter é is represented as a single Swift Character value that represents an extended grapheme cluster. In the first case, the cluster contains a single scalar; in the second case, it is a cluster of two scalars:

let eAcute: Character = "\u{E9}"                 // é
let combinedEAcute: Character = "\u{65}\u{301}"  // e followed by U+0301
// eAcute is é, combinedEAcute is é

Extended grapheme clusters are a flexible way to represent many complex script characters as a single Character value. For example, Hangul syllables from the Korean alphabet can be represented as either a precomposed or decomposed sequence. Both of these representations qualify as a single Character value in Swift:

let precomposed: Character = "\u{D55C}"                 // 한
let decomposed: Character = "\u{1112}\u{1161}\u{11AB}"  // ᄒ, ᅡ, ᆫ
// precomposed is 한, decomposed is 한

Extended grapheme clusters enable scalars for enclosing marks (such as COMBINING ENCLOSING CIRCLE, or U+20DD) to enclose other Unicode scalars as part of a single Character value:

let enclosedEAcute: Character = "\u{E9}\u{20DD}"
// enclosedEAcute is é⃝

Unicode scalars for regional indicator symbols can be combined in pairs to make a single Character value, such as this combination of REGIONAL INDICATOR SYMBOL LETTER U (U+1F1FA) and REGIONAL INDICATOR SYMBOL LETTER S (U+1F1F8):

let regionalIndicatorForUS: Character = "\u{1F1FA}\u{1F1F8}"
// regionalIndicatorForUS is 🇺🇸
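To make the relationship between a Character and its scalars concrete, the sketch below counts the scalars behind a few of the clusters discussed above by converting each Character to a String and inspecting its unicodeScalars view. The constant names are illustrative.

```swift
let single: Character = "\u{E9}"            // one scalar
let combined: Character = "\u{65}\u{301}"   // two scalars
let flag: Character = "\u{1F1FA}\u{1F1F8}"  // two regional indicator scalars

// Count the scalars that make up each Character by going through String.
let singleScalars = String(single).unicodeScalars.count
let combinedScalars = String(combined).unicodeScalars.count
let flagScalars = String(flag).unicodeScalars.count

print(singleScalars, combinedScalars, flagScalars)  // 1 2 2
print(single == combined)                           // true
```

The final comparison is true even though the scalar counts differ, because Character equality is based on canonical equivalence rather than on the underlying scalars.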

Counting Characters

To retrieve a count of the Character values in a string, use the count property of the string’s characters property:

let unusualMenagerie = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
print("unusualMenagerie has \(unusualMenagerie.characters.count) characters")
// Prints "unusualMenagerie has 40 characters"

Note that Swift’s use of extended grapheme clusters for Character values means that string concatenation and modification may not always affect a string’s character count.

For example, if you initialize a new string with the four-character word cafe, and then append a COMBINING ACUTE ACCENT (U+0301) to the end of the string, the resulting string will still have a character count of 4, with a fourth character of é, not e:

var word = "cafe"
print("the number of characters in \(word) is \(word.characters.count)")
// Prints "the number of characters in cafe is 4"

word += "\u{301}"    // COMBINING ACUTE ACCENT, U+0301

print("the number of characters in \(word) is \(word.characters.count)")
// Prints "the number of characters in café is 4"

Note

Extended grapheme clusters can be composed of one or more Unicode scalars. This means that different characters, and different representations of the same character, can require different amounts of memory to store. Because of this, characters in Swift do not each take up the same amount of memory within a string’s representation. As a result, the number of characters in a string cannot be calculated without iterating through the string to determine its extended grapheme cluster boundaries. If you are working with particularly long string values, be aware that the characters property must iterate over the Unicode scalars in the entire string in order to determine the characters for that string.

The count of the characters returned by the characters property is not always the same as the length property of an NSString that contains the same characters. The length of an NSString is based on the number of 16-bit code units within the string’s UTF-16 representation and not the number of Unicode extended grapheme clusters within the string.
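The difference between the character count and the UTF-16 length can be sketched as follows. This assumes Swift 4 or later, where String exposes count directly; on Swift 3, as used in the text, write word.characters.count instead. The UTF-16 count is used here as a stand-in for NSString’s length so that the example does not need Foundation.

```swift
// Assumes Swift 4 or later (String.count); on Swift 3 use .characters.count.
let word = "cafe\u{301}"   // "café" written with a combining acute accent
let koala = "\u{1F428}"    // 🐨

print(word.count)                 // 4, extended grapheme clusters
print(word.unicodeScalars.count)  // 5, Unicode scalars
print(word.utf16.count)           // 5, UTF-16 code units
print(koala.count)                // 1
print(koala.utf16.count)          // 2, a surrogate pair
```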

Accessing and Modifying a String

You access and modify a string through its methods and properties, or by using subscript syntax.

String Indices

Each String value has an associated index type, String.Index, which corresponds to the position of each Character in the string.

As mentioned above, different characters can require different amounts of memory to store, so in order to determine which Character is at a particular position, you must iterate over each Unicode scalar from the start or end of that String. For this reason, Swift strings cannot be indexed by integer values.

Use the startIndex property to access the position of the first Character of a String. The endIndex property is the position after the last character in a String. As a result, the endIndex property isn’t a valid argument to a string’s subscript. If a String is empty, startIndex and endIndex are equal.

You access the indices before and after a given index using the index(before:) and index(after:) methods of String. To access an index farther away from the given index, you can use the index(_:offsetBy:) method instead of calling one of these methods multiple times.

You can use subscript syntax to access the Character at a particular String index.

let greeting = "Guten Tag!"
greeting[greeting.startIndex]
// G
greeting[greeting.index(before: greeting.endIndex)]
// !
greeting[greeting.index(after: greeting.startIndex)]
// u
let index = greeting.index(greeting.startIndex, offsetBy: 7)
greeting[index]
// a

Attempting to access an index outside of a string’s range or a Character at an index outside of a string’s range will trigger a runtime error.

greeting[greeting.endIndex] // Error
greeting.index(after: greeting.endIndex) // Error
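One way to avoid the runtime errors shown above is the standard library’s index(_:offsetBy:limitedBy:) method, which returns nil instead of trapping when the offset would pass the limit. The helper below is a hypothetical sketch (the function name and its parameters are not from the original text) that returns an optional Character for an integer offset.

```swift
// Hypothetical helper: returns the Character at an integer offset,
// or nil when the offset falls outside the string.
func character(in text: String, atOffset offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          index != text.endIndex   // endIndex is one past the last character
    else { return nil }
    return text[index]
}

let greeting = "Guten Tag!"
let atSeven = character(in: greeting, atOffset: 7)      // "a"
let outOfRange = character(in: greeting, atOffset: 10)  // nil, offset 10 is endIndex
```

Each call still walks the string from startIndex, so this is a convenience for occasional lookups rather than a substitute for working with String.Index values directly.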

Use the indices property of the characters property to access all of the indices of individual characters in a string.

for index in greeting.characters.indices {
    print("\(greeting[index]) ", terminator: "")
}
// Prints "G u t e n   T a g ! "

Note

You can use the startIndex and endIndex properties and the index(before:), index(after:), and index(_:offsetBy:) methods on any type that conforms to the Collection protocol. This includes String, as shown here, as well as collection types such as Array, Dictionary, and Set.

Inserting and Removing

To insert a single character into a string at a specified index, use the insert(_:at:) method, and to insert the contents of another string at a specified index, use the insert(contentsOf:at:) method.

var welcome = "hello"
welcome.insert("!", at: welcome.endIndex)
// welcome now equals "hello!"
welcome.insert(contentsOf: " there".characters, at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there!"

To remove a single character from a string at a specified index, use the remove(at:) method, and to remove a substring at a specified range, use the removeSubrange(_:) method:

welcome.remove(at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there"
let range = welcome.index(welcome.endIndex, offsetBy: -6) ..< welcome.endIndex
welcome.removeSubrange(range)
// welcome now equals "hello"

Note

You can use the insert(_:at:), insert(contentsOf:at:), remove(at:), and removeSubrange(_:) methods on any type that conforms to the RangeReplaceableCollection protocol. This includes String, as shown here, as well as collection types such as Array. Dictionary and Set do not conform to RangeReplaceableCollection and provide their own insertion and removal methods.
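As a brief illustration of the same four methods on another RangeReplaceableCollection, here is a sketch using an Array of integers. The variable names are illustrative.

```swift
var numbers = [1, 2, 5]
numbers.insert(6, at: numbers.endIndex)
// numbers is now [1, 2, 5, 6]
numbers.insert(contentsOf: [3, 4], at: 2)
// numbers is now [1, 2, 3, 4, 5, 6]
numbers.remove(at: numbers.index(before: numbers.endIndex))
// numbers is now [1, 2, 3, 4, 5]
let tail = numbers.index(numbers.endIndex, offsetBy: -2) ..< numbers.endIndex
numbers.removeSubrange(tail)
// numbers is now [1, 2, 3]
print(numbers)
```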

Comparing Strings

Swift provides three ways to compare textual values: string and character equality, prefix equality, and suffix equality.

String and Character Equality

String and character equality is checked with the “equal to” operator (==) and the “not equal to” operator (!=), as described in Comparison Operators:

let quotation = "We're a lot alike, you and I."
let sameQuotation = "We're a lot alike, you and I."
if quotation == sameQuotation {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Two String values (or two Character values) are considered equal if their extended grapheme clusters are canonically equivalent. Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they are composed from different Unicode scalars behind the scenes.

For example, LATIN SMALL LETTER E WITH ACUTE (U+00E9) is canonically equivalent to LATIN SMALL LETTER E (U+0065) followed by COMBINING ACUTE ACCENT (U+0301). Both of these extended grapheme clusters are valid ways to represent the character é, and so they are considered to be canonically equivalent:

// "Voulez-vous un café?" using LATIN SMALL LETTER E WITH ACUTE
let eAcuteQuestion = "Voulez-vous un caf\u{E9}?"

// "Voulez-vous un café?" using LATIN SMALL LETTER E and COMBINING ACUTE ACCENT
let combinedEAcuteQuestion = "Voulez-vous un caf\u{65}\u{301}?"

if eAcuteQuestion == combinedEAcuteQuestion {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Conversely, LATIN CAPITAL LETTER A (U+0041, or "A"), as used in English, is not equivalent to CYRILLIC CAPITAL LETTER A (U+0410, or "А"), as used in Russian. The characters are visually similar, but do not have the same linguistic meaning:

let latinCapitalLetterA: Character = "\u{41}"
let cyrillicCapitalLetterA: Character = "\u{0410}"

if latinCapitalLetterA != cyrillicCapitalLetterA {
    print("These two characters are not equivalent.")
}
// Prints "These two characters are not equivalent."

Note

String and character comparisons in Swift are not locale-sensitive.
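Canonical equivalence applies to String and Character comparison, not to the encoded views. The following sketch, with illustrative names, shows two strings that compare equal while their Unicode scalars and UTF-8 bytes differ.

```swift
let precomposedWord = "caf\u{E9}"          // é as a single scalar
let decomposedWord = "caf\u{65}\u{301}"    // e followed by a combining accent

let equalAsStrings = (precomposedWord == decomposedWord)
let equalAsScalars = precomposedWord.unicodeScalars.elementsEqual(decomposedWord.unicodeScalars)
let equalAsUTF8 = precomposedWord.utf8.elementsEqual(decomposedWord.utf8)

print(equalAsStrings)  // true
print(equalAsScalars)  // false
print(equalAsUTF8)     // false
```

This matters when strings are compared after being serialized: byte-for-byte comparison of encoded data can disagree with == on the original String values.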

Prefix and Suffix Equality

To check whether a string has a particular string prefix or suffix, call the string’s hasPrefix(_:) and hasSuffix(_:) methods, both of which take a single argument of type String and return a Boolean value.

The examples below consider an array of strings representing the scene locations from the first two acts of Shakespeare’s Romeo and Juliet:

let romeoAndJuliet = [
    "Act 1 Scene 1: Verona, A public place",
    "Act 1 Scene 2: Capulet's mansion",
    "Act 1 Scene 3: A room in Capulet's mansion",
    "Act 1 Scene 4: A street outside Capulet's mansion",
    "Act 1 Scene 5: The Great Hall in Capulet's mansion",
    "Act 2 Scene 1: Outside Capulet's mansion",
    "Act 2 Scene 2: Capulet's orchard",
    "Act 2 Scene 3: Outside Friar Lawrence's cell",
    "Act 2 Scene 4: A street in Verona",
    "Act 2 Scene 5: Capulet's mansion",
    "Act 2 Scene 6: Friar Lawrence's cell"
]

You can use the hasPrefix(_:) method with the romeoAndJuliet array to count the number of scenes in Act 1 of the play:

var act1SceneCount = 0
for scene in romeoAndJuliet {
    if scene.hasPrefix("Act 1 ") {
        act1SceneCount += 1
    }
}
print("There are \(act1SceneCount) scenes in Act 1")
// Prints "There are 5 scenes in Act 1"

Similarly, use the hasSuffix(_:) method to count the number of scenes that take place in or around Capulet’s mansion and Friar Lawrence’s cell:

var mansionCount = 0
var cellCount = 0
for scene in romeoAndJuliet {
    if scene.hasSuffix("Capulet's mansion") {
        mansionCount += 1
    } else if scene.hasSuffix("Friar Lawrence's cell") {
        cellCount += 1
    }
}
print("\(mansionCount) mansion scenes; \(cellCount) cell scenes")
// Prints "6 mansion scenes; 2 cell scenes"
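The same counts can also be expressed with filter instead of an explicit loop. This is an alternative sketch rather than the approach used in the text, and it uses a shortened, illustrative scenes array so that it is self-contained.

```swift
// Shortened, illustrative subset of the scene list.
let scenes = [
    "Act 1 Scene 2: Capulet's mansion",
    "Act 2 Scene 2: Capulet's orchard",
    "Act 2 Scene 6: Friar Lawrence's cell"
]

let act2Count = scenes.filter { $0.hasPrefix("Act 2 ") }.count
let mansionScenes = scenes.filter { $0.hasSuffix("Capulet's mansion") }

print(act2Count)      // 2
print(mansionScenes)  // ["Act 1 Scene 2: Capulet's mansion"]
```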

Note

The hasPrefix(_:) and hasSuffix(_:) methods perform a character-by-character canonical equivalence comparison between the extended grapheme clusters in each string, as described in String and Character Equality.

Unicode Representations of Strings

When a Unicode string is written to a text file or some other storage, the Unicode scalars in that string are encoded in one of several Unicode-defined encoding forms. Each form encodes the string in small chunks known as code units. These include the UTF-8 encoding form (which encodes a string as 8-bit code units), the UTF-16 encoding form (which encodes a string as 16-bit code units), and the UTF-32 encoding form (which encodes a string as 32-bit code units).

Swift provides several different ways to access Unicode representations of strings. You can iterate over the string with a for-in statement, to access its individual Character values as Unicode extended grapheme clusters. This process is described in Working with Characters.

Alternatively, access a String value in one of three other Unicode-compliant representations:

A collection of UTF-8 code units (accessed with the string’s utf8 property)

A collection of UTF-16 code units (accessed with the string’s utf16 property)

A collection of 21-bit Unicode scalar values, equivalent to the string’s UTF-32 encoding form (accessed with the string’s unicodeScalars property)

Each example below shows a different representation of the following string, which is made up of the characters D, o, g, ‼ (DOUBLE EXCLAMATION MARK, or Unicode scalar U+203C), and the 🐶 character (DOG FACE, or Unicode scalar U+1F436):

let dogString = "Dog‼🐶"

UTF-8 Representation

You can access a UTF-8 representation of a String by iterating over its utf8 property. This property is of type String.UTF8View, which is a collection of unsigned 8-bit (UInt8) values, one for each byte in the string’s UTF-8 representation:

for codeUnit in dogString.utf8 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 226 128 188 240 159 144 182 "

In the example above, the first three decimal codeUnit values (68, 111, 103) represent the characters D, o, and g, whose UTF-8 representation is the same as their ASCII representation. The next three decimal codeUnit values (226, 128, 188) are a three-byte UTF-8 representation of the DOUBLE EXCLAMATION MARK character. The last four codeUnit values (240, 159, 144, 182) are a four-byte UTF-8 representation of the DOG FACE character.
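The per-character byte lengths described above can be verified by encoding each scalar on its own. In this sketch the string is written with \u{...} escapes so that it does not depend on source-file encoding; byteCounts and totalBytes are illustrative names.

```swift
let dogString = "Dog\u{203C}\u{1F436}"   // "Dog‼🐶"

// Number of UTF-8 bytes needed for each Unicode scalar in the string.
let byteCounts = dogString.unicodeScalars.map { String($0).utf8.count }
let totalBytes = dogString.utf8.count

print(byteCounts)  // [1, 1, 1, 3, 4]
print(totalBytes)  // 10
```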

UTF-16 Representation

You can access a UTF-16 representation of a String by iterating over its utf16 property. This property is of type String.UTF16View, which is a collection of unsigned 16-bit (UInt16) values, one for each 16-bit code unit in the string’s UTF-16 representation:

for codeUnit in dogString.utf16 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 55357 56374 "

Again, the first three codeUnit values (68, 111, 103) represent the characters D, o, and g, whose UTF-16 code units have the same values as in the string’s UTF-8 representation (because these Unicode scalars represent ASCII characters).

The fourth codeUnit value (8252) is a decimal equivalent of the hexadecimal value 203C, which represents the Unicode scalar U+203C for the DOUBLE EXCLAMATION MARK character. This character can be represented as a single code unit in UTF-16.

The fifth and sixth codeUnit values (55357 and 56374) are a UTF-16 surrogate pair representation of the DOG FACE character. These values are a high-surrogate value of U+D83D (decimal value 55357) and a low-surrogate value of U+DC36 (decimal value 56374).
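The surrogate pair values quoted above follow from the standard UTF-16 rule for scalars above U+FFFF: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves added to 0xD800 and 0xDC00. The sketch below works through that arithmetic for DOG FACE and compares the result with what the standard library produces; the constant names are illustrative.

```swift
let dogFace: UInt32 = 0x1F436
let offset = dogFace - 0x10000                // 0xF436, a 20-bit value
let high = UInt16(0xD800 + (offset >> 10))    // top 10 bits    -> 0xD83D
let low = UInt16(0xDC00 + (offset & 0x3FF))   // bottom 10 bits -> 0xDC36

print(high, low)  // 55357 56374

// Cross-check against the string's own UTF-16 view.
let fromStandardLibrary = Array("\u{1F436}".utf16)
print(fromStandardLibrary == [high, low])  // true
```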

Unicode Scalar Representation

You can access a Unicode scalar representation of a String value by iterating over its unicodeScalars property. This property is of type UnicodeScalarView, which is a collection of values of type UnicodeScalar.

Each UnicodeScalar has a value property that returns the scalar’s 21-bit value, represented within a UInt32 value:

for scalar in dogString.unicodeScalars {
    print("\(scalar.value) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 128054 "

The value properties for the first three UnicodeScalar values (68, 111, 103) once again represent the characters D, o, and g.

The value property of the fourth UnicodeScalar (8252) is again a decimal equivalent of the hexadecimal value 203C, which represents the Unicode scalar U+203C for the DOUBLE EXCLAMATION MARK character.

The value property of the fifth and final UnicodeScalar, 128054, is a decimal equivalent of the hexadecimal value 1F436, which represents the Unicode scalar U+1F436 for the DOG FACE character.

As an alternative to querying their value properties, each UnicodeScalar value can also be used to construct a new String value, such as with string interpolation:

for scalar in dogString.unicodeScalars {
    print("\(scalar)")
}
// D
// o
// g
// ‼
// 🐶
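Since the text moves between decimal values and U+ notation, it can help to print scalar values in hexadecimal directly. The sketch below uses the String(_:radix:uppercase:) initializer; codePoints is an illustrative name, and the output is not zero-padded to four digits as the U+ convention usually is.

```swift
let dogString = "Dog\u{203C}\u{1F436}"   // "Dog‼🐶"

// Format each scalar's numeric value as hexadecimal with a "U+" prefix.
let codePoints = dogString.unicodeScalars.map { scalar in
    "U+" + String(scalar.value, radix: 16, uppercase: true)
}

print(codePoints)  // ["U+44", "U+6F", "U+67", "U+203C", "U+1F436"]
```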
