java8 Stream Collectors.toMap 引发的 NullPointerException和Duplicate key

NullPointerException

v 为null,报空指针错误



k为null, 运行正常。


以往的认知:HashMap中k,v都是可以存null值的。在上面的测试用例中可以看到,v为null其实会报错。

查看API可以看到

 public static <T, K, U, M extends Map<K, U>>
    Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
                                Function<? super T, ? extends U> valueMapper,
                                BinaryOperator<U> mergeFunction,
                                Supplier<M> mapSupplier) {
        BiConsumer<M, T> accumulator
                = (map, element) -> map.merge(keyMapper.apply(element),
                                              valueMapper.apply(element), mergeFunction);
        return new CollectorImpl<>(mapSupplier, accumulator, mapMerger(mergeFunction), CH_ID);
    }
@Override
    public V merge(K key, V value,
                   BiFunction<? super V, ? super V, ? extends V> remappingFunction) {
        if (value == null)
            throw new NullPointerException();
        if (remappingFunction == null)
            throw new NullPointerException();
        int hash = hash(key);
        Node<K,V>[] tab; Node<K,V> first; int n, i;
        int binCount = 0;
        TreeNode<K,V> t = null;
        Node<K,V> old = null;
        if (size > threshold || (tab = table) == null ||
            (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((first = tab[i = (n - 1) & hash]) != null) {
            if (first instanceof TreeNode)
                old = (t = (TreeNode<K,V>)first).getTreeNode(hash, key);
            else {
                Node<K,V> e = first; K k;
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k)))) {
                        old = e;
                        break;
                    }
                    ++binCount;
                } while ((e = e.next) != null);
            }
        }
        if (old != null) {
            V v;
            if (old.value != null)
                v = remappingFunction.apply(old.value, value);
            else
                v = value;
            if (v != null) {
                old.value = v;
                afterNodeAccess(old);
            }
            else
                removeNode(hash, key, null, false, true);
            return v;
        }
        if (value != null) {
            if (t != null)
                t.putTreeVal(this, tab, hash, key, value);
            else {
                tab[i] = newNode(hash, key, value, first);
                if (binCount >= TREEIFY_THRESHOLD - 1)
                    treeifyBin(tab, hash);
            }
            ++modCount;
            ++size;
            afterNodeInsertion(true);
        }
        return value;
    }

从代码中可以看到,HashMap.merge 会对v进行判空。

目前我的解决办法是在流中加上判空过滤

System.out.println(list.stream().filter(t -> t.getName() != null).collect(Collectors.toMap(TestDO::getId, TestDO::getName, (k1, k2) -> k1)));

Duplicate key

查看toMap的API及相关注释

/*
 * <p>If the mapped keys contains duplicates (according to
     * {@link Object#equals(Object)}), an {@code IllegalStateException} is
     * thrown when the collection operation is performed.  If the mapped keys
     * may have duplicates, use {@link #toMap(Function, Function, BinaryOperator)}
     * instead.
*/
public static <T, K, U>
    Collector<T, ?, Map<K,U>> toMap(Function<? super T, ? extends K> keyMapper,
                                    Function<? super T, ? extends U> valueMapper) {
        return toMap(keyMapper, valueMapper, throwingMerger(), HashMap::new);
    }

如果有重复的key,使用toMap(Function, Function)会抛异常,可以使用toMap(Function, Function, BinaryOperator)来解决。

优化后的代码是

public class MailTest {
    public static void main(String[] args) {
        List<TestDO> list = Lists.newArrayList();
        TestDO t1 = new TestDO(1, "a");
        TestDO t2 = new TestDO(1, "b");
        list.add(t1);
        list.add(t2);
        System.out.println(list.stream().collect(Collectors.toMap(TestDO::getId, TestDO::getName, (k1, k2) -> k1)));
        System.out.println(list.stream().collect(Collectors.toMap(TestDO::getId, TestDO::getName, (k1, k2) -> k2)));
        System.out.println(list.stream().collect(Collectors.toMap(TestDO::getId, TestDO::getName, (k1, k2) -> k1 + k2)));
    }
}


从执行结果来看,对于重复的key,对他的value可以有三种取值情况:

  1. (k1, k2) -> k1,取第一个值

  2. (k1, k2) -> k2),取第二个值

  3. (k1, k2) -> k1 + k2),对两个值做某些操作后取结果。(例如基本类型可以进行加减乘除)。

推荐阅读更多精彩内容

  • Java集合:HashMap源码剖析 一、HashMap概述 二、HashMap的数据结构 三、HashMap源码...
    记住时光阅读 663评论 2 1
  • 1.HashMap是一个数组+链表/红黑树的结构,数组的下标在HashMap中称为Bucket值,每个数组项对应的...
    谁在烽烟彼岸阅读 920评论 2 2
  • pyspark.sql模块 模块上下文 Spark SQL和DataFrames的重要类: pyspark.sql...
    mpro阅读 9,127评论 0 13
  • 5.1、对于HashMap需要掌握以下几点 Map的创建:HashMap() 往Map中添加键值对:即put(Ob...
    rochuan阅读 420评论 0 0
  • 婚姻的本质就像一种生长缓慢的植物,需要不断灌溉,加施肥料,修枝理叶,打杀害虫,才有持久的绿荫。
    Miyakicheung阅读 2,604评论 9 15