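(Translation) WHNF in Haskell

While working through Real World Haskell I got confused by the concepts of NF and WHNF, so I went looking on Stack Overflow. This post comes from a Stack Overflow question that asks, roughly, how to understand WHNF, HNF and NF in Haskell. What follows is the highest-voted answer to that question.

My blog: link
Original post: link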


I'll try to give an explanation in simple terms. As others have pointed out, head normal form does not apply to Haskell, so I will not consider it here.

Normal form

An expression in normal form is fully evaluated, and no sub-expression could be evaluated any further (i.e. it contains no un-evaluated thunks).
These expressions are all in normal form:

42
(2, "hello")
\x -> (x + 1)

These expressions are not in normal form:

1 + 2                 -- we could evaluate this to 3
(\x -> x + 1) 2       -- we could apply the function
"he" ++ "llo"         -- we could apply the (++)
(1 + 1, 2 + 2)        -- we could evaluate 1 + 1 and 2 + 2

Weak head normal form

An expression in weak head normal form has been evaluated to the outermost data constructor or lambda abstraction (the head). Sub-expressions may or may not have been evaluated. Therefore, every normal form expression is also in weak head normal form, though the opposite does not hold in general.

To determine whether an expression is in weak head normal form, we only have to look at the outermost part of the expression. If it's a data constructor or a lambda, it's in weak head normal form. If it's a function application, it's not.
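
One way to check this rule experimentally is to force a value to WHNF and see whether anything beneath the outermost part gets touched. The sketch below is only an illustration: `reachesWHNF` is a hypothetical helper name, built on `evaluate` and `try` from `Control.Exception`, and `undefined` stands in for sub-expressions that should stay unevaluated.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: force a value to WHNF and report whether that
-- succeeded without raising an exception.
reachesWHNF :: a -> IO Bool
reachesWHNF x = do
  r <- try (evaluate x)
  pure $ case r of
    Left e  -> const False (e :: SomeException)
    Right _ -> True

main :: IO ()
main = do
  -- Outermost part is the tuple constructor; the fields are never touched.
  reachesWHNF (undefined :: Int, undefined :: Int) >>= print
  -- Outermost part is (:); the tail is never touched.
  reachesWHNF ('h' : undefined) >>= print
  -- Outermost part is a lambda; its body is not evaluated.
  reachesWHNF (\x -> x + (undefined :: Int)) >>= print
  -- Outermost part is an application of (+), which must inspect its argument.
  reachesWHNF ((undefined :: Int) + 1) >>= print
```

The first three checks print `True` because a constructor or lambda is already at the head; the last prints `False` because reducing the application forces `undefined`.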

These expressions are in weak head normal form:

(1 + 1, 2 + 2)       -- the outermost part is the data constructor (,)
\x -> 2 + 2          -- the outermost part is a lambda abstraction
'h' : ("e" ++ "llo") -- the outermost part is the data constructor (:)

As mentioned, all the normal form expressions listed above are also in weak head normal form.
These expressions are not in weak head normal form:

1 + 2                -- the outermost part here is an application of (+)
(\x -> x + 1) 2      -- the outermost part is an application of (\x -> x + 1)
"he" ++ "llo"        -- the outermost part is an application of (++)

Stack overflows

Evaluating an expression to weak head normal form may require that other expressions be evaluated to WHNF first. For example, to evaluate 1 + (2 + 3) to WHNF, we first have to evaluate 2 + 3. If evaluating a single expression leads to too many of these nested evaluations, the result is a stack overflow.

This happens when you build up a large expression that does not produce any data constructors or lambdas until a large part of it has been evaluated. These are often caused by this kind of usage of foldl:

foldl (+) 0 [1, 2, 3, 4, 5, 6] 
= foldl (+) (0 + 1) [2, 3, 4, 5, 6] 
= foldl (+) ((0 + 1) + 2) [3, 4, 5, 6] 
= foldl (+) (((0 + 1) + 2) + 3) [4, 5, 6] 
= foldl (+) ((((0 + 1) + 2) + 3) + 4) [5, 6] 
= foldl (+) (((((0 + 1) + 2) + 3) + 4) + 5) [6] 
= foldl (+) ((((((0 + 1) + 2) + 3) + 4) + 5) + 6) [] 
= (((((0 + 1) + 2) + 3) + 4) + 5) + 6 
= ((((1 + 2) + 3) + 4) + 5) + 6 
= (((3 + 3) + 4) + 5) + 6 
= ((6 + 4) + 5) + 6 
= (10 + 5) + 6 
= 15 + 6 
= 21

Notice how it has to go quite deep before it can get the expression into weak head normal form.

You may wonder why Haskell doesn't reduce the inner expressions ahead of time. That is because of Haskell's laziness. Since it cannot be assumed in general that every subexpression will be needed, expressions are evaluated from the outside in.

(GHC has a strictness analyzer that will detect some situations where a subexpression is always needed, and it can then evaluate it ahead of time. This is only an optimization, however, and you should not rely on it to save you from overflows.)

This kind of expression, on the other hand, is completely safe:

data List a = Cons a (List a) | Nil
foldr Cons Nil [1, 2, 3, 4, 5, 6]
 = Cons 1 (foldr Cons Nil [2, 3, 4, 5, 6])  -- Cons is a constructor, stop. 
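
A quick way to see why this is safe is to run the same fold over an infinite list: because `Cons` appears at the head after a single step, the rest of the fold is only evaluated as far as a consumer demands. This is a minimal sketch; `takeList` and `firstThree` are hypothetical names introduced here for illustration.

```haskell
data List a = Cons a (List a) | Nil

-- Hypothetical helper: the first n elements of a List as a built-in list.
takeList :: Int -> List a -> [a]
takeList n _ | n <= 0  = []
takeList _ Nil         = []
takeList n (Cons x xs) = x : takeList (n - 1) xs

-- The fold runs over an infinite list, yet only three Cons cells are built.
firstThree :: [Integer]
firstThree = takeList 3 (foldr Cons Nil [1 ..])

main :: IO ()
main = print firstThree  -- prints [1,2,3]
```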

To avoid building these large expressions when we know all the subexpressions will have to be evaluated, we want to force the inner parts to be evaluated ahead of time.

seq

seq is a special function that is used to force expressions to be evaluated. Its semantics are that seq x y means that whenever y is evaluated to weak head normal form, x is also evaluated to weak head normal form. Among other places, it is used in the definition of foldl', the strict variant of foldl:

foldl' f a [] = a
foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs

Each iteration of foldl' forces the accumulator to WHNF. It therefore avoids building up a large expression, and so avoids overflowing the stack.

foldl' (+) 0 [1, 2, 3, 4, 5, 6] 
= foldl' (+) 1 [2, 3, 4, 5, 6] 
= foldl' (+) 3 [3, 4, 5, 6] 
= foldl' (+) 6 [4, 5, 6] 
= foldl' (+) 10 [5, 6] 
= foldl' (+) 15 [6] 
= foldl' (+) 21 [] 
= 21 -- 21 is a data constructor, stop.
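
The trace above can be reproduced with the definition quoted earlier. In this sketch the function is renamed `myFoldl'` (a name chosen here only to avoid clashing with the `foldl'` exported by `Data.List`), and `sumTo` is a hypothetical helper.

```haskell
import Data.List (foldl')

-- The definition quoted above, under a different name.
myFoldl' :: (b -> a -> b) -> b -> [a] -> b
myFoldl' _ a []     = a
myFoldl' f a (x:xs) = let a' = f a x in a' `seq` myFoldl' f a' xs

-- Hypothetical helper: sum 1..n with a running total that is forced
-- to WHNF at every step.
sumTo :: Integer -> Integer
sumTo n = myFoldl' (+) 0 [1 .. n]

main :: IO ()
main = do
  print (sumTo 6)                                   -- prints 21
  print (sumTo 1000000)                             -- prints 500000500000
  print (foldl' (+) 0 [1 .. 1000000 :: Integer])    -- library version, same result
```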

But as the example on HaskellWiki mentions, this does not save you in all cases, as the accumulator is only evaluated to WHNF. In the example, the accumulator is a tuple, so it will only force evaluation of the tuple constructor, and not acc or len.

f (acc, len) x = (acc + x, len + 1)
foldl' f (0, 0) [1, 2, 3]
 = foldl' f (0 + 1, 0 + 1) [2, 3]
 = foldl' f ((0 + 1) + 2, (0 + 1) + 1) [3]
 = foldl' f (((0 + 1) + 2) + 3, ((0 + 1) + 1) + 1) []
 = (((0 + 1) + 2) + 3, ((0 + 1) + 1) + 1) -- tuple constructor, stop.
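
One way to observe that the components really are left unevaluated is to put an `undefined` element in the list and ask only for the length component. This is a sketch for illustration; `lazyStep` (the same function as `f` above) and `lengthViaLazyFold` are names introduced here.

```haskell
import Data.List (foldl')

-- Same as f above: returns a tuple whose components are unevaluated thunks.
lazyStep :: (Int, Int) -> Int -> (Int, Int)
lazyStep (acc, len) x = (acc + x, len + 1)

-- Only the second component is ever demanded, so the elements of the
-- list (and the running sum) are never evaluated.
lengthViaLazyFold :: [Int] -> Int
lengthViaLazyFold = snd . foldl' lazyStep (0, 0)

main :: IO ()
main = print (lengthViaLazyFold [1, undefined, 3])  -- prints 3
```

If foldl' had forced `acc` at each step, the `undefined` element would have raised an exception; instead the sum stays an unevaluated chain of additions.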

To avoid this, we must make it so that evaluating the tuple constructor forces evaluation of acc and len. We do this by using seq.

f' (acc, len) x = let acc' = acc + x
                      len' = len + 1
                  in  acc' `seq` len' `seq` (acc', len')
foldl' f' (0, 0) [1, 2, 3]
 = foldl' f' (1, 1) [2, 3]
 = foldl' f' (3, 2) [3]
 = foldl' f' (6, 3) []
 = (6, 3)                    -- tuple constructor, stop.
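
An alternative that is not part of the original answer is to give the accumulator its own type with strict fields, so that building the constructor forces both components without any explicit seq. The sketch below assumes a hypothetical `Acc` type and helper names; the `!` strictness annotations are standard Haskell.

```haskell
import Data.List (foldl')

-- Hypothetical accumulator type: the ! marks make both fields strict,
-- so constructing an Acc evaluates the sum and the length to WHNF.
data Acc = Acc !Int !Int

step :: Acc -> Int -> Acc
step (Acc acc len) x = Acc (acc + x) (len + 1)

sumAndLength :: [Int] -> (Int, Int)
sumAndLength xs =
  case foldl' step (Acc 0 0) xs of
    Acc acc len -> (acc, len)

main :: IO ()
main = print (sumAndLength [1, 2, 3])  -- prints (6,3)
```

Whether this reads better than the seq version is a matter of taste; the strict-field version keeps the stepping function free of evaluation-order details.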
