NIO File Read/Write vs. Traditional IO Reader

Preface

I have long read about the various performance advantages of NIO over traditional IO, so I set out to verify them with a test. The scenarios are:

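  1. Write scenario: write the string lu ben wei niu bi, stand up all! play game must be Laughing,play nm! as one line, repeated about 100,000 times, into a single file, then compare the elapsed time of the two IO models.
  2. Read scenario: read a text file of roughly 500,000 lines (containing Chinese characters) line by line, then compare the elapsed time of the two IO models.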

Write test

The traditional IO tests all use buffered IO. The code is as follows:

    private static final String CONTENT = "lu ben wei niu bi, stand up all! play game must be Laughing,play nm!\r\n";
    private final static int WRITE_COUNT = 100000;
    private final static String FILE_PATH = "E:\\IOTest\\lbw.txt";
    public static void testWrite_Buffer(String content) {
        FileOutputStream outputStream = null;
        OutputStreamWriter writer = null;
        BufferedWriter out = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            outputStream = new FileOutputStream(path);
            writer = new OutputStreamWriter(outputStream, "GBK");
            out = new BufferedWriter(writer);
            for (int i = 0; i < WRITE_COUNT; i++) {
                out.write(content);
            }
            System.err.println("buffer->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(out, writer, outputStream);
        }
    }

A run takes roughly 57-60 ms.
Next, the NIO write:

    private static final String CONTENT = "lu ben wei niu bi, stand up all! play game must be Laughing,play nm!\r\n";
    private final static int WRITE_COUNT = 100000;
    private final static String FILE_PATH = "E:\\IOTest\\lbw.txt";
    public static void testNio_write(String content) {
        FileChannel channel = null;
        FileOutputStream outputStream = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            outputStream = new FileOutputStream(path);
            channel = outputStream.getChannel();
            ByteBuffer buffer = ByteBuffer.allocateDirect(content.getBytes().length * WRITE_COUNT);
            for (int i = 0; i < WRITE_COUNT; i++) {
                buffer.put(content.getBytes());
            }
            buffer.flip();
            channel.write(buffer);
            System.err.println("channel->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(channel, outputStream);
        }
    }

A run takes roughly 45-49 ms.
Finally, the test with mmap (memory-mapped files):

    private static final String CONTENT = "lu ben wei niu bi, stand up all! play game must be Laughing,play nm!\r\n";
    private final static int WRITE_COUNT = 100000;
    private final static String FILE_PATH = "E:\\IOTest\\lbw.txt";
    public static void testRandomAccessFile_write(String content) {
        FileChannel channel = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            RandomAccessFile accessFile = new RandomAccessFile(path, "rw");
            channel = accessFile.getChannel();
            long offset = 0;
            MappedByteBuffer map;
            byte[] bytes = content.getBytes();
            map = channel.map(FileChannel.MapMode.READ_WRITE, offset, bytes.length * WRITE_COUNT);
            for (int i = 0; i < WRITE_COUNT; i++) {
                map.put(bytes);
            }
            System.err.println("mmap->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(channel);
        }
    }

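A run takes roughly 15-19 ms. Wow!
mmap is roughly three times as fast as traditional IO (15-19 ms versus 57-60 ms). The NIO gain is modest, about 20% faster.
Full code: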
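The timings above come from single runs measured with System.currentTimeMillis(), so JIT warm-up and the OS file cache can shift them noticeably. Below is a minimal sketch of a steadier measurement. TimingSketch and medianMillis are hypothetical names, not part of the original test class; the helper runs a task a few times untimed and then reports the median of the timed runs.

```java
import java.util.Arrays;

/** Hypothetical timing helper (an assumption, not part of the original test class). */
final class TimingSketch {

    /**
     * Runs the task `warmups` times untimed, then `runs` times timed,
     * and returns the median elapsed time in milliseconds.
     */
    static double medianMillis(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run(); // lets the JIT compile the hot path before measuring
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // monotonic, unlike currentTimeMillis
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload; in the write test this would be one of the test methods.
        Runnable work = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        System.out.println("median ms: " + medianMillis(work, 3, 10));
    }
}
```

With the test class from this post, a call such as medianMillis(() -> testNio_write(CONTENT), 3, 10) would time one variant in isolation instead of relying on a single run.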

package rpf.study.excel.nio;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

/**
 * @program: java-excel
 * @description:
 * @author: ranpengfeng
 * @create: 2019-11-15 11:21
 */
public class Test {
    private static final String CONTENT = "lu ben wei niu bi, stand up all! play game must be Laughing,play nm!\r\n";
    private final static int WRITE_COUNT = 100000;

    /**
     * Test file path
     */
    private final static String FILE_PATH = "E:\\IOTest\\lbw.txt";

    public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(3);
        executorService.submit(() -> {
            //testNio_write(CONTENT);
        });
        executorService.submit(() -> {
            //testWrite_Buffer(CONTENT);
        });
        executorService.submit(() -> {
            testRandomAccessFile_write(CONTENT);
        });
        executorService.shutdown();
    }

    public static void testWrite_Buffer(String content) {
        FileOutputStream outputStream = null;
        OutputStreamWriter writer = null;
        BufferedWriter out = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            outputStream = new FileOutputStream(path);
            writer = new OutputStreamWriter(outputStream, "GBK");
            out = new BufferedWriter(writer);
            for (int i = 0; i < WRITE_COUNT; i++) {
                out.write(content);
            }
            System.err.println("buffer->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(out, writer, outputStream);
        }
    }


    public static void testRandomAccessFile_write(String content) {
        FileChannel channel = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            RandomAccessFile accessFile = new RandomAccessFile(path, "rw");
            channel = accessFile.getChannel();
            long offset = 0;
            MappedByteBuffer map;
            byte[] bytes = content.getBytes();
            map = channel.map(FileChannel.MapMode.READ_WRITE, offset, bytes.length * WRITE_COUNT);
            for (int i = 0; i < WRITE_COUNT; i++) {
                map.put(bytes);
            }
            System.err.println("mmap->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(channel);
        }
    }
    
    public static void testNio_write(String content) {
        FileChannel channel = null;
        FileOutputStream outputStream = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            outputStream = new FileOutputStream(path);
            channel = outputStream.getChannel();
            ByteBuffer buffer = ByteBuffer.allocateDirect(content.getBytes().length * WRITE_COUNT);
            for (int i = 0; i < WRITE_COUNT; i++) {
                buffer.put(content.getBytes());
            }
            buffer.flip();
            channel.write(buffer);
            System.err.println("channel->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(channel, outputStream);
        }
    }

    /**
     * Closes every given Closeable (varargs)
     *
     * @param readers
     */
    private static void close(Closeable... readers) {
        try {
            for (Closeable reader : readers) {
                if (Objects.nonNull(reader)) {
                    reader.close();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.err.println("Failed to close stream");
        }
    }


}

Read test

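The read test works the same way as the buffered IO version (BufferedReader): read line by line and measure the total elapsed time. BufferedReader has its own decoder (bytes to characters), so for NIO we have to create a decoder ourselves, plus a function that reads line by line.
Approach: first, a line break (\r or \n) marks the end of a complete line. Parsing works on 8 KB blocks: to keep memory usage bounded regardless of file size, a fixed-size ByteBuffer is filled from the file in a loop. Since an 8 KB block will rarely end exactly at the end of a line, the incomplete line left over from the previous block has to be saved in a StringBuffer. As a picture: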

[Figure: an incomplete line at the end of a block]

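At this point the remaining half line in the ByteBuffer has to be decoded into a string and put into the StringBuffer. When the next block is parsed, its leading characters are appended to the StringBuffer, which yields the complete line.
The second issue is Chinese characters. A Chinese character takes two bytes in GBK, so a ByteBuffer may end with only the first byte of a character while the next read starts with its second byte, and decoding then fails for both blocks. As a picture:
[Figure: a Chinese character split across two buffers]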
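The carry-over of an incomplete line can be sketched in isolation, working on already decoded characters. This is a simplified illustration, not the loop used in the test code later in this post: LineAssembler is a hypothetical class, it uses a StringBuilder rather than a StringBuffer, and it treats \r, \n and \r\n each as a single line break, the way BufferedReader.readLine does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits decoded chunks into lines, carrying partial lines over. */
final class LineAssembler {
    // Tail of the previous chunk that did not end with a line break.
    private final StringBuilder pending = new StringBuilder();
    // True when the last character seen was '\r', so a following '\n' belongs to the same break.
    private boolean lastWasCr = false;

    /** Feeds one decoded chunk and returns every line that this chunk completes. */
    List<String> feed(CharSequence chunk) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            if (c == '\n' && lastWasCr) {
                lastWasCr = false; // second half of "\r\n"; the line was already emitted
                continue;
            }
            lastWasCr = (c == '\r');
            if (c == '\r' || c == '\n') {
                lines.add(pending.toString());
                pending.setLength(0);
            } else {
                pending.append(c);
            }
        }
        return lines;
    }

    /** Returns the final unterminated line, or null if nothing is pending. */
    String finish() {
        if (pending.length() == 0) {
            return null;
        }
        String last = pending.toString();
        pending.setLength(0);
        return last;
    }

    public static void main(String[] args) {
        LineAssembler assembler = new LineAssembler();
        System.out.println(assembler.feed("first li"));         // nothing complete yet
        System.out.println(assembler.feed("ne\r\nsecond\r"));  // completes two lines
        System.out.println(assembler.feed("\nthird"));          // '\n' finishes the earlier "\r\n"
        System.out.println(assembler.finish());                 // trailing line without a break
    }
}
```

Keeping the pending text and the "last character was \r" flag inside one object means a line break that straddles two blocks is counted only once.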

The fix: when decoding fails, stash the undecoded bytes of the character in leftBuffer, and on the next read decode the two parts joined together.
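For comparison, here is a minimal sketch of another way to handle a character split across reads. It is a swapped-in technique, not the leftBuffer approach used in this post: the decoder is called with endOfInput = false, so it leaves the bytes of an incomplete character unconsumed, and ByteBuffer.compact() moves them to the front for the next read. SplitCharDemo, decodeChunks and drain are hypothetical names, and the byte arrays stand in for successive channel.read(buffer) calls.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.util.Arrays;

/** Hypothetical demo: decoding a stream whose multi-byte characters are split across chunks. */
final class SplitCharDemo {

    /** Decodes the chunks in order, carrying undecoded trailing bytes into the next chunk. */
    static String decodeChunks(byte[][] chunks, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder(); // one decoder per stream, never shared
        ByteBuffer in = ByteBuffer.allocate(16);        // tiny on purpose, to force splits
        CharBuffer out = CharBuffer.allocate(16);
        StringBuilder text = new StringBuilder();
        try {
            for (byte[] chunk : chunks) {
                int offset = 0;
                while (offset < chunk.length) {
                    int n = Math.min(in.remaining(), chunk.length - offset);
                    in.put(chunk, offset, n);            // stands in for channel.read(in)
                    offset += n;
                    in.flip();
                    drain(decoder, in, out, text, false);
                    in.compact();                        // keeps the bytes of an incomplete character
                }
            }
            in.flip();
            drain(decoder, in, out, text, true);         // endOfInput = true only at the very end
            decoder.flush(out);
            out.flip();
            text.append(out);
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);           // genuinely invalid bytes
        }
        return text.toString();
    }

    /** Decodes as much of `in` as possible, moving the decoded characters into `text`. */
    private static void drain(CharsetDecoder decoder, ByteBuffer in, CharBuffer out,
                              StringBuilder text, boolean endOfInput)
            throws CharacterCodingException {
        while (true) {
            CoderResult result = decoder.decode(in, out, endOfInput);
            out.flip();
            text.append(out);
            out.clear();
            if (result.isUnderflow()) {
                return;                                  // needs more input, or input is finished
            }
            if (result.isError()) {
                result.throwException();
            }
            // Otherwise the output buffer overflowed: loop again now that it is empty.
        }
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        byte[] bytes = "\u6c49\u5b57 split".getBytes(gbk); // "汉字 split"
        // Cut after the first byte, in the middle of the first two-byte character.
        byte[][] chunks = {
            Arrays.copyOfRange(bytes, 0, 1),
            Arrays.copyOfRange(bytes, 1, bytes.length)
        };
        System.out.println(decodeChunks(chunks, gbk));
    }
}
```

With this pattern a malformed result signals genuinely invalid input rather than a character cut in half, so no second buffer is needed for the leftover bytes.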

First, the traditional IO code:

 private final static String FILE_PATH = "E:\\IOTest\\sx.txt";
 public static void testBuffer(BiConsumer<Integer, String> consumer) {
    InputStreamReader reader = null;
    BufferedReader in = null;
    try {
        long start = System.currentTimeMillis();
        String path = FILE_PATH;
        reader = new InputStreamReader(new FileInputStream(path), "GBK");
        in = new BufferedReader(reader);
        String line;
        int lineCount = 0;
        while ((line = in.readLine()) != null) {
            lineCount++;
            consumer.accept(lineCount, line);
        }
        System.out.println("buffer->lineCount: " + lineCount);
        System.err.println("buffer->time:" + (System.currentTimeMillis() - start));
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        close(reader, in);
    }
}

Reads do not interfere with each other, so the three tests can run at the same time.
NIO code:

    /**
     * Decoder configuration
     */
     private static Charset cs = Charset.forName("GBK");
     private static CharsetDecoder charsetDecoder = cs.newDecoder();
    
     /**
      * Test file path
      */
     private final static String FILE_PATH = "E:\\IOTest\\sx.txt";
     public static void testNio(BiConsumer<Integer, String> consumer) {
            FileChannel channel = null;
            FileInputStream fileInputStream = null;
            try {
                long start = System.currentTimeMillis();
                String path = FILE_PATH;
                fileInputStream = new FileInputStream(path);
                channel = fileInputStream.getChannel();
                int capacity = 1024 * 8;
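                // Read the bytes straight from the channel
                ByteBuffer buffer = ByteBuffer.allocateDirect(capacity);
                // Holds bytes that failed to decode, usually because the two bytes of a Chinese character landed in different buffer batches
                ByteBuffer leftBuffer = ByteBuffer.allocateDirect(capacity * 2);
                int lineCount = 0;
                CharBuffer cb = CharBuffer.allocate(buffer.capacity());
                // Caches the trailing content of a buffer that does not form a complete line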
                StringBuffer attach = new StringBuffer();
                while (channel.read(buffer) != -1) {
                    buffer.flip();
                    // Decode the bytes
                    decodeBytes(buffer, leftBuffer, cb);
                    char[] chars = cb.array();
                    int pChar = 0;
                    String line;
                    int min = Math.min(cb.position(), chars.length);
                    for (int i = 0; i < min; i++) {
                        char c = chars[i];
                        if (c == '\n' || c == '\r') {
                            if (pChar < i || attach.length() > 0) {
                                if (attach.length() == 0) {
                                    line = new String(chars, pChar, i - pChar);
                                } else {
                                    attach.append(chars, pChar, i - pChar);
                                    line = attach.toString();
                                    attach.delete(0, attach.length());
                                }
                                lineCount++;
                                consumer.accept(lineCount, line);
                                pChar = i + 1;
                                continue;
    
                            }
                            if (c == '\r') {
                                lineCount++;
                                consumer.accept(lineCount, "");
                                pChar = i + 1;
                            }
    
                        }
                    }
                    // Put the content that is not yet a complete line into attach
                    if (pChar < min) {
                        attach.append(chars, pChar, min - pChar);
                    }
                    buffer.clear();
                }
                System.out.println("channel->lineCount:" + lineCount);
                System.err.println("channel->time:" + (System.currentTimeMillis() - start));
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                close(channel, fileInputStream);
            }
        }
    /**
     * Decodes bytes into characters
     * @param buffer     the bytes read in this batch
     * @param leftBuffer bytes left undecoded by the previous batch
     * @param cb         receives the decoded characters
     */
    private static void decodeBytes(ByteBuffer buffer, ByteBuffer leftBuffer, CharBuffer cb) {
        int capacity = buffer.capacity();
        cb.clear();
        // Assume no Chinese character was split and decode directly; this is the common case
        CoderResult decode = charsetDecoder.decode(buffer, cb, true);
        // Handle the malformed-input result caused by a split Chinese character
        while (decode.isMalformed()) {
            // Check whether leftBuffer holds unprocessed bytes from the previous batch
            if (leftBuffer.position() == 0) {
                int position = buffer.position();
                // Copy the bytes from the failing position onward into leftBuffer
                for (; position < capacity; position++) {
                    leftBuffer.put(buffer.get(position));
                }
                // Move limit to the position where decoding failed
                buffer.limit(buffer.position());
                // Decode again
                decode = charsetDecoder.decode(buffer, cb, true);
            } else {
                // Append this batch to leftBuffer so the split character is complete, then decode
                buffer.position(0);
                cb.position(0);
                leftBuffer.put(buffer);
                // Switch leftBuffer from write mode to read mode
                leftBuffer.flip();
                decode = charsetDecoder.decode(leftBuffer, cb, true);
                int position = leftBuffer.position();
                int limit = leftBuffer.limit();
                leftBuffer.clear();
                // If the tail hits a split character again, move those bytes to the start of leftBuffer and stop; they are handled after the next read(buffer)
                if (decode.isMalformed()) {
                    for (; position < limit; position++) {
                        leftBuffer.put(leftBuffer.get(position));
                    }
                    break;
                }
            }
        }
    }

Finally, the mmap code:

    /**
     * Decoder configuration
     */
    private static Charset cs = Charset.forName("GBK");
    private static CharsetDecoder charsetDecoder = cs.newDecoder();
    
    /**
     * Test file path
     */
    private final static String FILE_PATH = "E:\\IOTest\\sx.txt";
     
    public static void testRandomAccessFile(BiConsumer<Integer, String> consumer) {
        FileChannel channel = null;
        try {
            long start = System.currentTimeMillis();
            String path = FILE_PATH;
            RandomAccessFile accessFile = new RandomAccessFile(path, "r");
            channel = accessFile.getChannel();
            // Offset into the file
            long offset = 0;
            // Size of each mapped region
            int max = 1024 * 1000;
            // Total size in bytes
            long len = channel.size();
            int lineCount = 0;
            MappedByteBuffer map;
            CharBuffer cb;
            ByteBuffer leftBuffer = ByteBuffer.allocateDirect(max * 2);
            // Caches the trailing content of a buffer that does not form a complete line
            StringBuffer attach = new StringBuffer();
            while (offset <= len) {
                if (offset + max > len) {
                    map = channel.map(FileChannel.MapMode.READ_ONLY, offset, len - offset);
                    offset = len + 1;
                } else {
                    map = channel.map(FileChannel.MapMode.READ_ONLY, offset, max);
                    offset += max;
                }
                cb = CharBuffer.allocate(map.capacity());
                String line;
                decodeBytes(map,leftBuffer,cb);
                char[] chars = cb.array();
                int pChar = 0;
                int min = Math.min(cb.position(), chars.length);
                for (int i = 0; i < min; i++) {
                    char c = chars[i];
                    if (c == '\n' || c == '\r') {
                        if (pChar < i || attach.length() > 0) {
                            if (attach.length() == 0) {
                                line = new String(chars, pChar, i - pChar);
                            } else {
                                attach.append(chars, pChar, i - pChar);
                                line = attach.toString();
                                attach.delete(0, attach.length());
                            }
                            lineCount++;
                            consumer.accept(lineCount, line);
                            pChar = i + 1;
                            continue;
                        }
                        if (c == '\r') {
                            lineCount++;
                            consumer.accept(lineCount, "");
                            pChar = i + 1;
                        }
                    }
                }
                // Put the content that is not yet a complete line into attach
                if (pChar < min) {
                    attach.append(chars, pChar, min - pChar);
                }
            }
    
            System.out.println("mmap->lineCount:" + lineCount);
            System.err.println("mmap->time:" + (System.currentTimeMillis() - start));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            close(channel);
        }
    }
    // decodeBytes(buffer, leftBuffer, cb) is the same helper already listed with the NIO code.

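Finally, the test entry point. As written, the three tests run concurrently, and testNio and testRandomAccessFile share the single static charsetDecoder, although CharsetDecoder instances are not safe for use by multiple concurrent threads: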

    public static void main(String[] args) {
        BiConsumer<Integer, String> consumer = (i, line) -> {
            if (i >= 1000 && i <= 2000) {
                System.out.println("[" + i + "]:" + line);
            }
        };
        ExecutorService executorService = Executors.newFixedThreadPool(3);
        executorService.submit(() -> {
            testNio(consumer);
        });
        executorService.submit(() -> {
            testBuffer(consumer);
        });
        executorService.submit(() -> {
            testRandomAccessFile(consumer);
        });
        executorService.shutdown();
    }

The results were quite a surprise:

Run 1:
buffer->lineCount: 572585
buffer->time:513
channel->lineCount:572585
channel->time:902
mmap->time:1069
mmap->lineCount:572585
Run 2:
buffer->lineCount: 572585
buffer->time:589
mmap->lineCount:572585
mmap->time:926
channel->lineCount:572585
channel->time:1025
Run 3:
buffer->lineCount: 572585
buffer->time:505
mmap->lineCount:572585
channel->lineCount:572585
mmap->time:773
channel->time:776
Run 4:
buffer->lineCount: 572585
buffer->time:651
channel->time:1073
channel->lineCount:572585
mmap->lineCount:572585
mmap->time:1177

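Result: a complete defeat! I revised the read test code many times and still could not get it on par with traditional IO. BufferedReader rocks!
Underneath BufferedReader, the method that fetches the byte array is still a native method, and I did not see any special optimization there.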

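Summary: for writes, memory mapping is the fastest; for reads, traditional IO beats everything else. Overall, BufferedReader/BufferedWriter perform in a balanced way and are convenient to use. For frequent file writes, consider mmap or NIO; for plain file reading and parsing, traditional IO is not only fast but also very convenient!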
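For the read case recommended above, a minimal sketch of the same buffered line-by-line read written with Files.newBufferedReader and try-with-resources, which closes the reader without a separate close helper. ReadLinesSketch and countLines are hypothetical names, and the temporary file in main is only there to make the example self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical example: counting lines with a buffered reader. */
final class ReadLinesSketch {

    /** Reads the file line by line with the given charset and returns the number of lines. */
    static int countLines(Path file, Charset charset) {
        // try-with-resources closes the reader even if readLine throws
        try (BufferedReader in = Files.newBufferedReader(file, charset)) {
            int count = 0;
            while (in.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lines", ".txt");
        try {
            Files.write(tmp, "one\r\ntwo\r\nthree\r\n".getBytes(StandardCharsets.UTF_8));
            System.out.println("lines: " + countLines(tmp, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

For the file used in this post, the charset argument would be Charset.forName("GBK").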
