Looking at String from the Bytecode Perspective

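Comparing Java Strings is a pitfall that comes up constantly in day-to-day development, and it is a favorite interview question as well.
This article starts from the bytecode that the Java compiler produces to get at the root of the problem and build a deeper understanding of String.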



    1. Consider the following code:
      public class TestContantPool{
          private static String a = "1234";
          final String x = "34";
          final String y;

          public TestContantPool(){
              y = "34";
          }

          public void test(){
              String b = "1234";
              String c = "12" + "34";
              String d = new String("12345");
              final String e = "34";
              String f = "34";
              String g = "12" + e;
              String h = "12" + f;
              String i = "12" + x;
              String j = "12" + y;

              System.out.println(a == b);
              System.out.println(a == c);
              System.out.println(a != d);
              System.out.println(a == g);
              System.out.println(a != h && a == h.intern());
              System.out.println(a == i && a == i.intern());
              System.out.println(a != j && a == j.intern());
          }

          public static void main(String[] ar){
              new TestContantPool().test();
          }

      }
      /**
      true
      true
      true
      true
      true
      true
      true
      */

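    1. After compiling, let us analyze part of the bytecode, focusing on the constant pool (Constant Pool), the static block static{}, the constructor, and the test() method: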
      javap -verbose TestContantPool.class 

      Constant pool:
      #5 = Utf8 a
      #7 = Utf8 x
      #9 = String #10 // 34
      #10 = Utf8 34
      #11 = Utf8 y

      #15 = String #16 // 1234
      #16 = Utf8 1234

      #31 = Class #32 // java/lang/String
      #32 = Utf8 java/lang/String
      #33 = String #34 // 12345
      #34 = Utf8 12345

      #39 = Utf8 java/lang/StringBuilder
      #40 = String #41 // 12
      #41 = Utf8 12

      #66 = Utf8 b
      #67 = Utf8 c
      #68 = Utf8 d
      #69 = Utf8 e
      #70 = Utf8 f
      #71 = Utf8 g
      #72 = Utf8 h
      #73 = Utf8 i
      #74 = Utf8 j

      final java.lang.String x;
      descriptor: Ljava/lang/String;
      flags: ACC_FINAL
      ConstantValue: String 34

      final java.lang.String y;
      descriptor: Ljava/lang/String;
      flags: ACC_FINAL

      static {};
      descriptor: ()V
      flags: ACC_STATIC
      Code:
      stack=1, locals=0, args_size=0
      0: ldc #15 // String 1234
      2: putstatic #17 // Field a:Ljava/lang/String;
      5: return
      LineNumberTable:
      line 6: 0
      LocalVariableTable:
      Start Length Slot Name Signature

      public bytecode.TestContantPool();
      descriptor: ()V
      flags: ACC_PUBLIC
      Code:
      stack=2, locals=1, args_size=1
      0: aload_0
      1: invokespecial #22 // Method java/lang/Object."<init>":()V
      4: aload_0
      5: ldc #9 // String 34
      7: putfield #24 // Field x:Ljava/lang/String;
      10: aload_0
      11: ldc #9 // String 34
      13: putfield #26 // Field y:Ljava/lang/String;
      16: return
      LocalVariableTable:
      Start Length Slot Name Signature
      0 17 0 this Lbytecode/TestContantPool;

      public void test();
      descriptor: ()V
      flags: ACC_PUBLIC
      Code:
      stack=3, locals=10, args_size=1
      0: ldc #15 // String 1234
      2: astore_1
      3: ldc #15 // String 1234
      5: astore_2
      6: new #31 // class java/lang/String
      9: dup
      10: ldc #33 // String 12345
      12: invokespecial #35 // Method java/lang/String."<init>":(Ljava/lang/String;)V
      15: astore_3
      16: ldc #9 // String 34
      18: astore 4
      20: ldc #9 // String 34
      22: astore 5
      24: ldc #15 // String 1234
      26: astore 6
      28: new #38 // class java/lang/StringBuilder
      31: dup
      32: ldc #40 // String 12
      34: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
      37: aload 5
      39: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
      42: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      45: astore 7
      47: ldc #15 // String 1234
      49: astore 8
      51: new #38 // class java/lang/StringBuilder
      54: dup
      55: ldc #40 // String 12
      57: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
      60: aload_0
      61: getfield #26 // Field y:Ljava/lang/String;
      64: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
      67: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      70: astore 9

      235: return

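    1. First, analyze the constant pool:
      After compilation the constant pool already holds the String literals "12", "34", "1234", and "12345".
      Every other String created in the code below is built from these literals in the constant pool.
      0: ldc           #15                 // String 1234    (push an int, float, or String constant from the constant pool onto the stack)
      2: putstatic #17 // Field a:Ljava/lang/String; (assign a value to a static field of the given class)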
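
To make the role of the pooled literals concrete, here is a minimal sketch showing that the same literal written in two different classes resolves to one String object at run time. The class names `LiteralIdentityDemo` and `PoolHolder` are hypothetical and exist only for this illustration.

```java
// Hypothetical demo classes; the names are illustrative only.
class LiteralIdentityDemo {
    /** True when the literal used here and the one in PoolHolder are the same object. */
    static boolean literalsShareOneInstance() {
        String local = "1234";              // loaded with ldc from this class's constant pool
        return local == PoolHolder.value;   // reference comparison, not content comparison
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneInstance()); // true
    }
}

class PoolHolder {
    // Deliberately not final, so the value is read from the field at run time
    // instead of being inlined into the caller as a compile-time constant.
    static String value = "1234";
}
```

Each class file carries its own constant pool entry for "1234", but the JVM resolves equal literals to a single interned String, which is why the reference comparison succeeds.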

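    1. Next, look at how the class variable a is initialized:
  1. From the JVM class-loading mechanism we know that static variable initializers and static{} blocks run inside the compiler-generated <clinit>() method (javap prints it as static {}).
  2. <clinit>() is produced by the compiler, which collects every static variable assignment and every statement in the static{} blocks of the class and merges them in source order.
  3. If a <clinit>() method exists, it runs before the instance constructor <init>().
  4. The two instructions below are the assignment to the static variable a; they show that a takes its value from the constant pool:
    0: ldc           #15                 // String 1234    (push an int, float, or String constant from the constant pool onto the stack)
    2: putstatic #17 // Field a:Ljava/lang/String; (assign a value to a static field of the given class)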
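
The initialization order described above can be observed with a small sketch. `InitOrderDemo`, `LOG`, and `record` are hypothetical names introduced for this illustration; they are not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; records the order in which initializers run.
class InitOrderDemo {
    // Declared first so that it is already initialized when the later static code runs.
    static final List<String> LOG = new ArrayList<>();

    // Static field initializer: becomes part of <clinit>().
    static String a = record("static field initializer");

    // Static block: merged into <clinit>() after the initializer above.
    static {
        record("static block");
    }

    // Constructor: compiled to <init>(), runs only after <clinit>() has finished.
    InitOrderDemo() {
        record("constructor");
    }

    static String record(String step) {
        LOG.add(step);
        return step;
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        System.out.println(LOG); // [static field initializer, static block, constructor]
    }
}
```

The static steps appear in source order, and the constructor entry always comes last because the class must be initialized before an instance can be created.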

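    1. Finally, analyze the test() method.
      b and c are both assigned "1234" straight from the constant pool. "12" + "34" is a compile-time constant expression, so the compiler folds it into the single literal "1234":
      0: ldc           #15                 // String 1234    (push an int, float, or String constant from the constant pool onto the stack)
      2: astore_1 (store the reference on top of the stack into the second local variable)
      3: ldc #15 // String 1234
      5: astore_2 (store the reference on top of the stack into the third local variable)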
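
The folding of constant expressions can be checked directly. The following is a minimal sketch with a hypothetical class name, `ConstantFoldingDemo`; it only restates the b and c cases in isolation.

```java
// Hypothetical demo class; isolates the compile-time folding of literal concatenation.
class ConstantFoldingDemo {
    /** True when the concatenated constants are the very same object as the literal. */
    static boolean foldedEqualsLiteral() {
        String literal = "1234";
        String folded = "12" + "34";        // constant expression, folded by javac into "1234"
        String chained = "1" + "2" + "34";  // also a constant expression
        return literal == folded && literal == chained;
    }

    public static void main(String[] args) {
        System.out.println(foldedEqualsLiteral()); // true
    }
}
```

Running javap on this class should show only ldc instructions for the three assignments, with no concatenation code at all.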

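d is assigned "12345"; here a new String object is created with new:

    6: new           #31                 // class java/lang/String (create an object and push its reference onto the stack)
    9: dup (duplicate the value on top of the stack and push the copy)
    10: ldc #33 // String 12345 (push an int, float, or String constant from the constant pool onto the stack)
    12: invokespecial #35 (invoke an instance initialization method; here, the String constructor)
    15: astore_3 (store the reference on top of the stack into the fourth local variable)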
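
As a small illustration of what new String(...) implies for comparisons, the sketch below contrasts reference equality, content equality, and intern(). The class name `NewStringDemo` is hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; compares a literal with a copy created by new String(...).
class NewStringDemo {
    static boolean[] compare() {
        String literal = "12345";
        String copy = new String("12345");  // always a distinct object on the heap
        return new boolean[] {
            literal == copy,           // false: two different objects
            literal.equals(copy),      // true: same characters
            literal == copy.intern()   // true: intern() returns the pooled instance
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [false, true, true]
    }
}
```

This is why content comparisons should go through equals(), while "==" only answers whether two references point at the same object.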

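e, f, and g all take their values from the constant pool. e is a final local variable initialized with a literal, which makes it a compile-time constant, so "12" + e is folded into "1234" at compile time:

    16: ldc           #9                  // String 34
    18: astore 4
    20: ldc #9 // String 34
    22: astore 5
    24: ldc #15 // String 1234
    26: astore 6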
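
Note that final alone is not enough: the initializer must itself be a constant expression. The sketch below, using the hypothetical class `FinalLocalDemo`, puts a constant final local next to a final local whose value comes from a method call.

```java
import java.util.Arrays;

// Hypothetical demo class; contrasts two kinds of final local variables.
class FinalLocalDemo {
    static boolean[] compare() {
        String literal = "1234";
        final String constantPart = "34";               // constant variable
        final String runtimePart = String.valueOf(34);  // final, but not a constant expression

        String folded = "12" + constantPart;  // folded at compile time into "1234"
        String built = "12" + runtimePart;    // concatenated at run time, new object

        return new boolean[] {
            literal == folded,      // true
            literal == built,       // false
            literal.equals(built)   // true
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [true, false, true]
    }
}
```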

h is built at run time. Because f is not final, "12" + f is not a constant expression, so a StringBuilder creates a new String (this shows that this compiler implements the String "+" operator with StringBuilder; javac 9 and later emit an invokedynamic call instead):

    28: new           #38                 // class java/lang/StringBuilder
    31: dup
    32: ldc #40 // String 12
    34: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
    37: aload 5
    39: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
    42: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    45: astore 7
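
The bytecode above corresponds roughly to hand-written StringBuilder code. The following sketch, with the hypothetical class name `ConcatDesugarDemo`, writes that equivalent out explicitly and compares the results; it is an approximation of what the compiler generated, not a literal decompilation.

```java
import java.util.Arrays;

// Hypothetical demo class; spells out the StringBuilder form of "12" + f by hand.
class ConcatDesugarDemo {
    static boolean[] compare() {
        String literal = "1234";
        String f = "34";  // not final, so "12" + f cannot be folded

        String viaPlus = "12" + f;
        // Hand-written equivalent of the new / append / toString sequence.
        String viaBuilder = new StringBuilder("12").append(f).toString();

        return new boolean[] {
            viaPlus.equals(viaBuilder),     // true: same content
            viaPlus == viaBuilder,          // false: each concatenation creates its own object
            literal == viaBuilder,          // false: not the pooled instance
            literal == viaBuilder.intern()  // true: intern() maps back to the pooled "1234"
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [true, false, false, true]
    }
}
```

The comparison results are the same whether the compiler emits StringBuilder calls or an invokedynamic-based concatenation, since both produce a new String at run time.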

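i is loaded from the constant pool. x is a final field initialized with a literal (note its ConstantValue attribute in the javap output), so "12" + x is folded at compile time as well:

    47: ldc           #15                 // String 1234
    49: astore 8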

j is also built at run time with a StringBuilder. y is final, but it is assigned in the constructor rather than at its declaration, so it is not a compile-time constant and its value has to be read with getfield:

    51: new           #38                 // class java/lang/StringBuilder
    54: dup
    55: ldc #40 // String 12
    57: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
    60: aload_0
    61: getfield #26 // Field y:Ljava/lang/String;
    64: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
    67: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    70: astore 9
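
The same distinction between an initialized final field and a blank final field applies to static fields. The sketch below is a variation on the x and y case, using a hypothetical class `FinalFieldDemo` with static fields instead of instance fields.

```java
import java.util.Arrays;

// Hypothetical demo class; static-field variant of the x / y comparison.
class FinalFieldDemo {
    static final String INLINE = "34";  // constant variable: initialized with a literal
    static final String LATE;           // blank final: assigned later, not a constant

    static {
        LATE = "34";
    }

    static boolean[] compare() {
        String literal = "1234";
        String fromInline = "12" + INLINE;  // folded at compile time
        String fromLate = "12" + LATE;      // field is read and concatenated at run time

        return new boolean[] {
            literal == fromInline,         // true
            literal == fromLate,           // false
            literal == fromLate.intern()   // true
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [true, false, true]
    }
}
```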

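From this analysis, any value loaded from the constant pool compares true under "==", because every ldc of the same literal yields the same String object. Everything else, whether produced by new String() or by new StringBuilder(), is a new object on the heap, so those references are not equal. intern() returns the pooled instance with the same contents, which is why a == h.intern() and a == j.intern() are true. With this logic in mind, the results of String comparisons become easy to reason about.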