Java Object Memory Footprint: Analysis and Practice

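I recently had to estimate how much memory a Java list of 100,000 entries occupies once it is loaded into memory, and how that affects the server's memory usage. Until now I had only read about the Java memory layout in books, so I took this opportunity to analyze the memory footprint of Java objects hands-on and deepen my understanding of the layout.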

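A quick recap of the Java object memory layout: an object consists of an object header (Header), instance data (Instance Data), and alignment padding (Padding). The footprint of an object can differ between environments. The experiments in this article ran on a 64-bit HotSpot VM on 64-bit Linux with compressed pointers enabled by default (-XX:+UseCompressedOops). As Figure 1 shows, the header of an ordinary object instance is therefore 12 bytes (8 bytes markOop + 4 bytes klassOop), and the header of an array instance is 16 bytes (8 bytes markOop + 4 bytes klassOop + 4 bytes length). HotSpot aligns objects to 8 bytes by default (-XX:ObjectAlignmentInBytes=8), so every object size is a multiple of 8. The JDK version is: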

xxx:~$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
Figure 1. Java object memory layout
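
The layout rules above reduce to a few lines of arithmetic. The following is a minimal sketch, not HotSpot's actual allocator: the class LayoutSketch and its method names are hypothetical, and the constants assume the environment described above (12-byte object header, 8-byte alignment).

```java
/** Hypothetical helper that mirrors the layout rules described in this article. */
final class LayoutSketch {
    // Assumed values for 64-bit HotSpot with compressed oops, as stated above.
    static final int OBJECT_HEADER = 12;
    static final int ALIGNMENT = 8;

    /** Rounds a raw byte count up to the next multiple of ALIGNMENT. */
    static long alignUp(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Size of an ordinary object whose instance fields total fieldBytes. */
    static long instanceSize(long fieldBytes) {
        return alignUp(OBJECT_HEADER + fieldBytes);
    }

    public static void main(String[] args) {
        System.out.println(instanceSize(0)); // header only, padded: 16
        System.out.println(instanceSize(4)); // one int field: 16
        System.out.println(instanceSize(8)); // one long field, padded: 24
    }
}
```

The same rounding step recurs in every calculation later in the article, which is why it is isolated in alignUp here.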

The sizes in this article are measured with org.apache.lucene.util.RamUsageEstimator#sizeOf(java.lang.Object), which counts the object itself plus the objects it references. The jar version is:

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>4.2.0</version>
</dependency>

Primitive types

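Articles that introduce the storage required by primitive types usually show the table below. Does that mean a long held in an object occupies just 8 bytes? Given the object layout in Figure 1, it must be more than 8 bytes.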

Primitive Type    Memory Required (bytes)
byte, boolean     1
short, char       2
int, float        4
long, double      8

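The following example measures primitive values. Because sizeOf takes a java.lang.Object, each primitive argument is autoboxed, so what is measured is the corresponding wrapper object.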

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        boolean bool = true;
        byte b = (byte)0xFF;
        short s = (short)1;
        char c = 'c';
        int i = 1;
        float f = 1.0f;
        long l = 1L;
        double d = 1.0;

        System.out.printf("sizeOf(byte) = %s bytes\n", RamUsageEstimator.sizeOf(b));
        System.out.printf("sizeOf(boolean) = %s bytes\n", RamUsageEstimator.sizeOf(bool));

        System.out.printf("sizeOf(short) = %s bytes\n", RamUsageEstimator.sizeOf(s));
        System.out.printf("sizeOf(char) = %s bytes\n", RamUsageEstimator.sizeOf(c));

        System.out.printf("sizeOf(int) = %s bytes\n", RamUsageEstimator.sizeOf(i));
        System.out.printf("sizeOf(float) = %s bytes\n", RamUsageEstimator.sizeOf(f));

        System.out.printf("sizeOf(long) = %s bytes\n", RamUsageEstimator.sizeOf(l));
        System.out.printf("sizeOf(double) = %s bytes\n", RamUsageEstimator.sizeOf(d));
    }
}

Output:

sizeOf(byte) = 16 bytes
sizeOf(boolean) = 16 bytes
sizeOf(short) = 16 bytes
sizeOf(char) = 16 bytes
sizeOf(int) = 16 bytes
sizeOf(float) = 16 bytes
sizeOf(long) = 24 bytes
sizeOf(double) = 24 bytes

Breaking down the footprint of each primitive-typed object:

sizeOf(byte)=12(Header) + 1(Instance Data) + 3(Padding)=16 bytes
sizeOf(boolean)=12(Header) + 1(Instance Data) + 3(Padding)=16 bytes
sizeOf(short)=12(Header) + 2(Instance Data) + 2(Padding)=16 bytes
sizeOf(char)=12(Header) + 2(Instance Data) + 2(Padding)=16 bytes
sizeOf(int)=12(Header) + 4(Instance Data)=16 bytes
sizeOf(float)=12(Header) + 4(Instance Data)=16 bytes
sizeOf(long)=12(Header) + 8(Instance Data) + 4(Padding)=24 bytes
sizeOf(double)=12(Header) + 8(Instance Data) + 4(Padding)=24 bytes
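
Every figure above includes an object header even though the variables are primitives. The sketch below illustrates why, assuming the calls resolve to a method with a java.lang.Object parameter such as the sizeOf overload named earlier. BoxingSketch and runtimeType are hypothetical names used only for illustration.

```java
/** Hypothetical demo: a primitive passed to an Object parameter arrives boxed. */
final class BoxingSketch {
    /** Accepts any object, like a method declared with a java.lang.Object parameter. */
    static String runtimeType(Object value) {
        return value.getClass().getName();
    }

    public static void main(String[] args) {
        long l = 1L;
        byte b = (byte) 0xFF;
        // The compiler inserts Long.valueOf(l) and Byte.valueOf(b) before each call.
        System.out.println(runtimeType(l)); // java.lang.Long
        System.out.println(runtimeType(b)); // java.lang.Byte
    }
}
```

A primitive local variable or field has no header of its own; the 12 bytes appear only once the value is wrapped in an object.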

The next example measures the wrapper classes of the primitive types directly.

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        Boolean bool = true;
        Byte b = (byte)0xFF;
        Short s = (short)1;
        Character c = 'c';
        Integer i = 1;
        Float f = 1.0f;
        Long l = 1L;
        Double d = 1.0;

        System.out.printf("sizeOf(Boolean) = %s bytes\n", RamUsageEstimator.sizeOf(bool));
        System.out.printf("sizeOf(Byte) = %s bytes\n", RamUsageEstimator.sizeOf(b));

        System.out.printf("sizeOf(Short) = %s bytes\n", RamUsageEstimator.sizeOf(s));
        System.out.printf("sizeOf(Character) = %s bytes\n", RamUsageEstimator.sizeOf(c));

        System.out.printf("sizeOf(Integer) = %s bytes\n", RamUsageEstimator.sizeOf(i));
        System.out.printf("sizeOf(Float) = %s bytes\n", RamUsageEstimator.sizeOf(f));

        System.out.printf("sizeOf(Long) = %s bytes\n", RamUsageEstimator.sizeOf(l));
        System.out.printf("sizeOf(Double) = %s bytes\n", RamUsageEstimator.sizeOf(d));
    }
}

The output matches the layout analysis for the primitive types above.

sizeOf(Boolean) = 16 bytes
sizeOf(Byte) = 16 bytes
sizeOf(Short) = 16 bytes
sizeOf(Character) = 16 bytes
sizeOf(Integer) = 16 bytes
sizeOf(Float) = 16 bytes
sizeOf(Long) = 24 bytes
sizeOf(Double) = 24 bytes

Special objects

The following example measures null and a plain Object.

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        System.out.printf("sizeOf(null) = %s bytes\n", RamUsageEstimator.sizeOf((Object)null));
        System.out.printf("sizeOf(new Object()) = %s bytes\n", RamUsageEstimator.sizeOf(new Object()));
    }
}

The output below shows that a null reference points to no object, so no space is counted for it, and that
sizeOf(new Object()) = 12(Header) + 4(Padding) = 16 bytes.

sizeOf(null) = 0 bytes
sizeOf(new Object()) = 16 bytes

Arrays

The following example measures Java array objects.

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        int[] array0 = new int[0];
        int[] array1 = new int[1];
        int[] array2 = new int[2];
        int[] array3 = new int[3];
        int[] array8 = new int[8];
        int[] array9 = new int[9];
        System.out.printf("sizeOf(array0) = %s bytes\n", RamUsageEstimator.sizeOf(array0));
        System.out.printf("length(array0) = %s\n", array0.length);
        System.out.printf("sizeOf(array1) = %s bytes\n", RamUsageEstimator.sizeOf(array1));
        System.out.printf("sizeOf(array2) = %s bytes\n", RamUsageEstimator.sizeOf(array2));
        System.out.printf("sizeOf(array3) = %s bytes\n", RamUsageEstimator.sizeOf(array3));
        System.out.printf("sizeOf(array8) = %s bytes\n", RamUsageEstimator.sizeOf(array8));
        System.out.printf("sizeOf(array9) = %s bytes\n", RamUsageEstimator.sizeOf(array9));
    }
}

Output:

sizeOf(array0) = 16 bytes
length(array0) = 0
sizeOf(array1) = 24 bytes
sizeOf(array2) = 24 bytes
sizeOf(array3) = 32 bytes
sizeOf(array8) = 48 bytes
sizeOf(array9) = 56 bytes

As Figure 1 shows, an array instance has a 16-byte header, unlike an ordinary object instance. The array sizes break down as follows:

sizeOf(array0)=16(Header)=16 bytes
length(array0)=0
sizeOf(array1)=16(Header) + 4(int) + 4(Padding)=24 bytes
sizeOf(array2)=16(Header) + 4(int)*2=24 bytes
sizeOf(array3)=16(Header) + 4(int)*3 + 4(Padding)=32 bytes
sizeOf(array8)=16(Header) + 4(int)*8=48 bytes
sizeOf(array9)=16(Header) + 4(int)*9 + 4(Padding)=56 bytes
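
The breakdown above follows one formula: the array header plus length times element size, rounded up to a multiple of 8. A minimal sketch of that formula, assuming the 16-byte array header of this article's environment (ArraySizeSketch and arraySize are hypothetical names):

```java
/** Hypothetical calculator for the array sizes analyzed above. */
final class ArraySizeSketch {
    // Assumed: 64-bit HotSpot with compressed oops, as in this article.
    static final int ARRAY_HEADER = 16;

    /** Rounds up to the next multiple of 8. */
    static long alignUp(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Estimated size of a primitive array with the given length and element width. */
    static long arraySize(int length, int elementBytes) {
        return alignUp(ARRAY_HEADER + (long) length * elementBytes);
    }

    public static void main(String[] args) {
        for (int length : new int[] {0, 1, 2, 3, 8, 9}) {
            System.out.printf("int[%d] = %d bytes%n", length, arraySize(length, 4));
        }
    }
}
```

For the six int arrays measured above this yields 16, 24, 24, 32, 48 and 56 bytes, matching the measurements.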

String

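In JDK 8, the version used here, the relevant part of the String source is shown below. It declares four fields. The two static fields belong to the class rather than to any instance, so they do not count toward the size of a String object; only the instance fields do: a char[] that stores the characters and an int that caches the hash code. The section on custom objects below verifies that static fields are excluded from an object's footprint. (Since JDK 9, String stores its characters in a byte[] together with a coder field, so the figures in this section do not carry over to newer JDKs.)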

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];
}

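A String object itself therefore needs 12(Header) + 4(char[] reference) + 4(int) + 4(Padding) = 24 bytes.
On top of that, the char[] occupies 16(Array Header) + length * 2 bytes, rounded up to a multiple of 8, where length is the number of characters. As Figure 2 shows, the total footprint of a String object is: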

40 + length * 2 bytes + Padding

Figure 2. Java String object memory layout
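
The formula can be written as a small function that adds the String object and its backing char[] separately. This is a sketch under the JDK 8 layout described above, with a 12-byte object header, a 16-byte array header and 4-byte references; StringSizeSketch and stringSize are hypothetical names.

```java
/** Hypothetical calculator for the JDK 8 String footprint derived above. */
final class StringSizeSketch {
    /** Rounds up to the next multiple of 8. */
    static long alignUp(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Estimated footprint of a String holding the given number of chars. */
    static long stringSize(int length) {
        long stringObject = alignUp(12 + 4 + 4);     // header + char[] reference + int hash
        long charArray = alignUp(16 + 2L * length);  // array header + 2 bytes per char
        return stringObject + charArray;
    }

    public static void main(String[] args) {
        for (int length : new int[] {0, 1, 2, 4, 5}) {
            System.out.printf("length %d -> %d bytes%n", length, stringSize(length));
        }
    }
}
```

Because the char[] is padded on its own, strings of 1 to 4 characters all land on the same total, and the next step up comes at 5 characters.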

The following example measures String objects.

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        String s0 = "";
        String s1 = "a";
        String s2 = "aa";
        String s4 = "aaaa";
        String s5 = "aaaaa";
        System.out.printf("sizeOf(s0) = %s bytes\n", RamUsageEstimator.sizeOf(s0));
        System.out.printf("sizeOf(s1) = %s bytes\n", RamUsageEstimator.sizeOf(s1));
        System.out.printf("sizeOf(s2) = %s bytes\n", RamUsageEstimator.sizeOf(s2));
        System.out.printf("sizeOf(s4) = %s bytes\n", RamUsageEstimator.sizeOf(s4));
        System.out.printf("sizeOf(s5) = %s bytes\n", RamUsageEstimator.sizeOf(s5));
    }
}

Output:

sizeOf(s0) = 40 bytes
sizeOf(s1) = 48 bytes
sizeOf(s2) = 48 bytes
sizeOf(s4) = 48 bytes
sizeOf(s5) = 56 bytes

Breaking down the String results above:

sizeOf(s0)=40 + 0 * 2 = 40 bytes
sizeOf(s1)=40 + 1 * 2 + 6(Padding) = 48 bytes
sizeOf(s2)=40 + 2 * 2 + 4(Padding) = 48 bytes
sizeOf(s4)=40 + 4 * 2 = 48 bytes
sizeOf(s5)=40 + 5 * 2 + 6(Padding) = 56 bytes

Custom objects

The following example measures a custom object.

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class Employee {
    private long id;
    private int age;
    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }
    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
    }
}

Output:

sizeOf(Employee) = 24 bytes

See Figure 3. From the object memory layout, the Employee object breaks down as:

sizeOf(Employee) = 12(Header) + 8(long) + 4(int) = 24 bytes

Figure 3. Employee memory layout

Now add a static field to Employee:

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class Employee {
    private long id;
    private int age;
    // A static field belongs to the class, not to an instance, so it is not part of the object's footprint
    private static int staticField = 88;
    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }
    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
    }
}

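The output below is unchanged, which confirms that a static field belongs to the class rather than to an instance; only instance fields count toward the footprint of an object.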

sizeOf(Employee) = 24 bytes
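
The rule that only instance fields count can also be checked with reflection. The sketch below is an approximation rather than a measurement: it sums the widths of all non-static fields, adds the assumed 12-byte header, and rounds up to 8. It ignores any gaps the JVM may leave between superclass and subclass fields. ShallowSizeSketch, Plain and WithStatic are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical shallow-size estimate that skips static fields. */
final class ShallowSizeSketch {
    // Assumed values: compressed oops enabled, as in this article.
    static final int OBJECT_HEADER = 12;
    static final int REFERENCE = 4;

    /** Width in bytes of a field of the given declared type. */
    static int fieldBytes(Class<?> type) {
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == short.class || type == char.class) return 2;
        if (type == byte.class || type == boolean.class) return 1;
        return REFERENCE; // any object or array reference
    }

    /** Header plus all instance fields in the class hierarchy, rounded up to 8. */
    static long shallowSize(Class<?> clazz) {
        long bytes = OBJECT_HEADER;
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    bytes += fieldBytes(f.getType());
                }
            }
        }
        return (bytes + 7) / 8 * 8;
    }

    static class Plain { long id; int age; }
    static class WithStatic { long id; int age; static int staticField = 88; }

    public static void main(String[] args) {
        System.out.println(shallowSize(Plain.class));      // 24
        System.out.println(shallowSize(WithStatic.class)); // 24, the static field is skipped
    }
}
```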

Next, let Employee reference other Java objects, here a Long and an Integer:

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
public class Employee {
    private Long id;
    private Integer age;
    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }
    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
    }
}

Output:

sizeOf(Employee) = 64 bytes

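See Figure 4. The Employee object itself takes 12(Header) + 4(Long reference) + 4(Integer reference) + 4(Padding) = 24 bytes, and the two referenced objects are counted as well:

sizeOf(Employee) = 24(Employee Object) + 24(Long Object) + 16(Integer Object) = 64 bytes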

Figure 4. Employee memory layout

ArrayList

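In JDK 8, the relevant part of the ArrayList source is shown below. It declares six fields. The four static fields belong to the class rather than to any instance, so they do not count toward the size of an ArrayList object. Of the fields shown, only the two instance fields count: an Object[] that holds the elements and an int size.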

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * Default initial capacity.
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Shared empty array instance used for empty instances.
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    transient Object[] elementData; // non-private to simplify nested class access

    /**
     * The size of the ArrayList (the number of elements it contains).
     */
    private int size;
}

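An ArrayList object itself therefore needs 12(Header) + 4(Object[] reference) + 4(int size) + 4(int modCount, inherited from AbstractList) = 24 bytes, with no padding.
On top of that, the Object[] occupies 16(Array Header) + length * 4(Object reference) bytes, rounded up to a multiple of 8, where length is the length of the Object[] (the capacity of the ArrayList) and size is the number of elements stored, with length >= size. The element objects themselves also take space. Combining this with Figure 5, the total footprint of an ArrayList is:

(40 + length * 4), rounded up to a multiple of 8, + size * n bytes, where n is the aligned size of one element object. Only the size slots that actually hold an element contribute an element object; the remaining slots are null.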

Figure 5. Memory analysis of an ArrayList instance
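
The ArrayList formula can be sketched as a function of capacity, size and element size. This assumes a 24-byte ArrayList instance, a 16-byte array header, 4-byte references, and elements that are distinct objects of the same aligned size; ArrayListSizeSketch and arrayListSize are hypothetical names.

```java
/** Hypothetical calculator for the ArrayList footprint formula above. */
final class ArrayListSizeSketch {
    /** Rounds up to the next multiple of 8. */
    static long alignUp(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /**
     * @param capacity     length of the backing Object[]
     * @param size         number of elements actually stored
     * @param elementBytes aligned size of one element object
     */
    static long arrayListSize(int capacity, int size, int elementBytes) {
        long listObject = 24;                             // the ArrayList instance itself
        long backingArray = alignUp(16 + 4L * capacity);  // array header + one reference per slot
        return listObject + backingArray + (long) size * elementBytes;
    }

    public static void main(String[] args) {
        System.out.println(arrayListSize(0, 0, 16));  // empty list: 40
        System.out.println(arrayListSize(1, 1, 16));  // one Integer, capacity 1: 64
        System.out.println(arrayListSize(10, 1, 16)); // one Integer, capacity 10: 96
    }
}
```

Unused capacity costs 4 bytes per slot, which is where the 32-byte gap between the last two cases comes from.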

The following example measures ArrayList objects.

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
import java.util.ArrayList;
import java.util.List;

public class RamUsageEstimatorTest {
    public static void main(String[] args) {
        System.out.printf("sizeOf(ArrayList with 0 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(new ArrayList<>(0)));
        System.out.printf("sizeOf(ArrayList with default capacity) = %s bytes\n", RamUsageEstimator.sizeOf(new ArrayList<>()));
        List<Integer> list1 = new ArrayList<>(1);
        list1.add(1);
        System.out.printf("sizeOf(list1 with 1 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(list1));
        list1 = new ArrayList<>();
        list1.add(1);
        System.out.printf("sizeOf(list1 with default capacity) = %s bytes\n", RamUsageEstimator.sizeOf(list1));
    }
}

Output:

sizeOf(ArrayList with 0 capacity) = 40 bytes
sizeOf(ArrayList with default capacity) = 40 bytes
sizeOf(list1 with 1 capacity) = 64 bytes
sizeOf(list1 with default capacity) = 96 bytes

sizeOf(ArrayList with 0 capacity) = 40 bytes: the constructor is called with initialCapacity=0, so sizeOf(ArrayList with 0 capacity) = 40 + 0 * 4(Integer reference) + 0 * 16(Integer) = 40 bytes
sizeOf(ArrayList with default capacity) = 40 bytes: new ArrayList() starts with an empty elementData, so sizeOf(ArrayList with default capacity) = 40 + 0 * 4(Integer reference) + 0 * 16(Integer) = 40 bytes
sizeOf(list1 with 1 capacity) = 64 bytes: the constructor is called with initialCapacity=1, so sizeOf(list1 with 1 capacity) = 40 + 1 * 4(Integer reference) + 4(Padding) + 1 * 16(Integer) = 64 bytes
sizeOf(list1 with default capacity) = 96 bytes: new ArrayList() starts with an empty elementData; the first add() call allocates elementData with the default minimum capacity of 10, and size=1. So sizeOf(list1 with default capacity) = 40 + 10 * 4(Integer reference) + 1 * 16(Integer) = 96 bytes

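Memory footprint of a Java list with 100,000 entries

The following example shows how to estimate the memory occupied by a Java list with 100,000 entries.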

package study.estimator;

import org.apache.lucene.util.RamUsageEstimator;
import java.util.ArrayList;
import java.util.List;
public class Employee {
    private long id;
    private int age;
    public Employee(long id, int age) {
        this.id = id;
        this.age = age;
    }
    public static void main(String[] args) {
        System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));

        List<Employee> employeeList = new ArrayList<>(100000);
        for (int i = 0; i < 100000; i++) {
            employeeList.add(new Employee(123456789L, 28));
        }
        System.out.printf("sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(employeeList));
        employeeList = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            employeeList.add(new Employee(123456789L, 28));
        }
        System.out.printf("sizeOf(List<Employee> contains 100000 Employee object) = %s bytes\n", RamUsageEstimator.sizeOf(employeeList));
    }
}

Output:

sizeOf(Employee) = 24 bytes
sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = 2800040 bytes
sizeOf(List<Employee> contains 100000 Employee object) = 2826880 bytes

Applying the ArrayList analysis from the previous section:

sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = 2800040 bytes = 40 + 100000 * 4(Employee Reference) + 100000 * 24(Employee Object)

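If no capacity is passed to new ArrayList, or the list grows beyond its capacity, elementData is expanded: the elements are copied from the old array into a new one, and each expansion grows the capacity to 1.5 times the old value. Starting from the default capacity of 10, holding 100000 elements therefore ends with a capacity of 106710, computed as follows: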

package study;

public class StudyTest {
    public static void main(String[] args) {
        // Default capacity allocated on the first add()
        int capacity = 10;
        // Each expansion adds half of the old capacity, rounded down
        while (capacity < 100000) {
            capacity += capacity >> 1;
        }
        System.out.println(capacity);
    }
}

Finally, the last line of the output above breaks down as:

sizeOf(List<Employee> contains 100000 Employee object) = 2826880 bytes = 40 + 106710 * 4(Employee Reference) + 100000 * 24(Employee Object) = 2.696MB
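
The two results can be reproduced end to end by combining the growth rule with the ArrayList formula. The following sketch assumes the same layout as the rest of the article and a list that holds at least one element; ListEstimateSketch and its methods are hypothetical names, and the output is an estimate, not a measurement.

```java
/** Hypothetical end-to-end estimate for a List of fixed-size elements. */
final class ListEstimateSketch {
    /** Capacity reached by a default-constructed ArrayList after adding the given number of elements. */
    static int grownCapacity(int elements) {
        int capacity = 10; // default capacity allocated on the first add()
        while (capacity < elements) {
            capacity += capacity >> 1; // grow by half of the old capacity
        }
        return capacity;
    }

    /** 24-byte list object + aligned backing array + element objects. */
    static long estimate(int capacity, int elements, int elementBytes) {
        long backingArray = (16 + 4L * capacity + 7) / 8 * 8;
        return 24 + backingArray + (long) elements * elementBytes;
    }

    public static void main(String[] args) {
        int n = 100000;
        long presized = estimate(n, n, 24);
        long grown = estimate(grownCapacity(n), n, 24);
        System.out.println(presized);         // 2800040
        System.out.println(grown);            // 2826880
        System.out.println(grown - presized); // 26840 bytes of unused slots
    }
}
```

The difference between the two lists is exactly the 6710 unused reference slots, 4 bytes each, left over from the last expansion.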

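Further practice

  • Apply the same analysis to HashMap, enum classes, or your own custom objects.
  • Rerun the code above with compressed pointers disabled (-XX:-UseCompressedOops) to see how the change in header size affects the memory footprint of Java objects.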
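
Before running the second exercise, it can help to write down a prediction to compare against. The sketch below parameterizes the layout model used in this article. The compressed values come from the text above; the uncompressed values (16-byte object header, 24-byte array header, 8-byte references) are assumptions based on commonly reported figures for 64-bit HotSpot, not measurements from this article, and should be checked against the real output. OopsModeSketch and its members are hypothetical names.

```java
/** Hypothetical layout model with configurable header and reference sizes. */
final class OopsModeSketch {
    final int objectHeader;
    final int arrayHeader;
    final int reference;

    OopsModeSketch(int objectHeader, int arrayHeader, int reference) {
        this.objectHeader = objectHeader;
        this.arrayHeader = arrayHeader;
        this.reference = reference;
    }

    // Values used throughout this article.
    static final OopsModeSketch COMPRESSED = new OopsModeSketch(12, 16, 4);
    // Assumed values for -XX:-UseCompressedOops; verify by running the article's code.
    static final OopsModeSketch UNCOMPRESSED = new OopsModeSketch(16, 24, 8);

    static long alignUp(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    long instanceSize(long fieldBytes) {
        return alignUp(objectHeader + fieldBytes);
    }

    /** Predicted footprint of an ArrayList of Employee(long id, int age) objects. */
    long employeeList(int capacity, int elements) {
        long list = instanceSize(reference + 4 + 4); // elementData reference + size + modCount
        long array = alignUp(arrayHeader + (long) reference * capacity);
        long employee = instanceSize(8 + 4);         // long id + int age
        return list + array + elements * employee;
    }

    public static void main(String[] args) {
        System.out.println(COMPRESSED.employeeList(100000, 100000));   // 2800040, as measured above
        System.out.println(UNCOMPRESSED.employeeList(100000, 100000)); // prediction only
    }
}
```

If the measured value with -XX:-UseCompressedOops differs from the prediction, the assumed header or reference sizes are the first thing to revisit.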
