Setting Up a Hadoop 3.1.1 Project with IDEA

Binguner
2019.03.27 10:06

This walkthrough uses Hadoop 3.1.1.

1. Start the Hadoop services

$ start-all.sh

2. Create a new Maven project in IDEA

2.1 Select Maven, set Project SDK to 1.8, then click Next

2.2 Fill in the GroupId and ArtifactId, then click Next

2.3 Click Finish

3. Change the Target bytecode version

Open Settings, go to Build, Execution, Deployment -> Compiler -> Java Compiler, and set Target bytecode version to 1.8 (or 8).

Also confirm that the JDK version is 1.8 in each of these settings:

  • Project SDK
  • Module SDK
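
IDEA takes a Maven project's compiler level from pom.xml when the project is re-imported, so a value set only in the Settings dialog may be reset later. One common way to make it stick, sketched here as an optional addition that the original steps do not include, is to declare the level in pom.xml through the standard maven-compiler-plugin properties:

```xml
<properties>
    <!-- Assumed addition: compile for Java 8, matching the 1.8 SDK chosen above -->
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
</properties>
```

The block goes directly under the <project> element, next to <dependencies>.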

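4. Import the required JARs

4.1 Select the Dependencies tab, click the + button below it, and choose "JARs or directories"

4.2 Go to share/hadoop/ under the Hadoop installation directory and import the packages found there

Click OK, then OK again to confirm.

4.3 Add the following dependencies to pom.xml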

    <dependencies>
        <!-- https://mvnrepository.com/artifact/junit/junit -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>

        <!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.2</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.1.1</version>
        </dependency>

        <!-- Do not add org.apache.hadoop:hadoop-core here: it is the Hadoop 1.x artifact
             (1.2.1) and duplicates classes that hadoop-common 3.1.1 already provides. -->

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.1</version>
        </dependency>
        
    </dependencies>

5. Write the Java code for the Hadoop project

5.1 Create a new Java class, Test.java

5.2 Write the code

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

public class Test {

    // Create a "test" directory in HDFS
    public static void main(String[] args) {

        // try-with-resources closes the FileSystem even if mkdirs throws
        try (FileSystem fileSystem = FileSystem.get(
                new URI("hdfs://localhost:9000/"), new Configuration(), "binguner")) {
            fileSystem.mkdirs(new Path("/test"));
        } catch (IOException | InterruptedException | URISyntaxException e) {
            e.printStackTrace();
        }
    }

}
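
The first argument passed to FileSystem.get above is an ordinary java.net.URI. As a small aid for adapting the address to another cluster, the sketch below splits it into its parts using only the JDK. The class name HdfsUriParts is made up for this illustration, and the host and port are simply the values used in this post; they normally have to match the fs.defaultFS setting in your own core-site.xml.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of Hadoop.
class HdfsUriParts {

    // Returns "scheme|host|port" for a file system address.
    static String describe(String address) {
        URI uri = URI.create(address);
        return uri.getScheme() + "|" + uri.getHost() + "|" + uri.getPort();
    }

    public static void main(String[] args) {
        // Address taken from the Test class in this post.
        System.out.println(describe("hdfs://localhost:9000/"));
    }
}
```

If the port is left out of the address, getPort() returns -1, which is a quick way to spot an incomplete URI before handing it to Hadoop.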

5.3 Run the Java program

6. Result

6.1 Before the run, the HDFS root directory has no test folder

6.2 After the run, a test folder appears in the HDFS root directory

7. Common FileSystem APIs

  • 7.1 mkdirs
public boolean mkdirs(Path f) throws IOException {
    return this.mkdirs(f, FsPermission.getDirDefault());
}

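The argument is the path of the new directory. Nested paths are allowed: any missing parent directories are created along the way.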

  • 7.2 create
    public FSDataOutputStream create(Path f) throws IOException {
        return this.create(f, true);
    }

    public FSDataOutputStream create(Path f, boolean overwrite) throws IOException {
        return this.create(f, overwrite, this.getConf().getInt("io.file.buffer.size", 4096), this.getDefaultReplication(f), this.getDefaultBlockSize(f));
    }

    public FSDataOutputStream create(Path f, Progressable progress) throws IOException {
        return this.create(f, true, this.getConf().getInt("io.file.buffer.size", 4096), this.getDefaultReplication(f), this.getDefaultBlockSize(f), progress);
    }

    public FSDataOutputStream create(Path f, short replication) throws IOException {
        return this.create(f, true, this.getConf().getInt("io.file.buffer.size", 4096), replication, this.getDefaultBlockSize(f));
    }

    public FSDataOutputStream create(Path f, short replication, Progressable progress) throws IOException {
        return this.create(f, true, this.getConf().getInt("io.file.buffer.size", 4096), replication, this.getDefaultBlockSize(f), progress);
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize) throws IOException {
        return this.create(f, overwrite, bufferSize, this.getDefaultReplication(f), this.getDefaultBlockSize(f));
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, Progressable progress) throws IOException {
        return this.create(f, overwrite, bufferSize, this.getDefaultReplication(f), this.getDefaultBlockSize(f), progress);
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize) throws IOException {
        return this.create(f, overwrite, bufferSize, replication, blockSize, (Progressable)null);
    }

    public FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException {
        return this.create(f, FsCreateModes.applyUMask(FsPermission.getFileDefault(), FsPermission.getUMask(this.getConf())), overwrite, bufferSize, replication, blockSize, progress);
    }

    public abstract FSDataOutputStream create(Path var1, FsPermission var2, boolean var3, int var4, short var5, long var6, Progressable var8) throws IOException;

    public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException {
        return this.create(f, permission, flags, bufferSize, replication, blockSize, progress, (ChecksumOpt)null);
    }

    public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress, ChecksumOpt checksumOpt) throws IOException {
        return this.create(f, permission, flags.contains(CreateFlag.OVERWRITE), bufferSize, replication, blockSize, progress);
    }

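create has many overloads. Its parameters control whether an existing file is overwritten, the replication factor, the write buffer size, the block size, and the file permission. It returns an FSDataOutputStream, through which data can be written to the file.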
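
The long list above follows one pattern: each shorter overload fills in a default and delegates to a longer one. The toy model below reproduces that pattern for just two of the defaults visible in the listing (overwrite = true and a 4096-byte buffer). It is only a sketch: CreateOverloads and its String return value are hypothetical stand-ins, not Hadoop classes, and the real methods return an FSDataOutputStream.

```java
// Toy model of the delegating overloads; names are hypothetical, not Hadoop classes.
class CreateOverloads {

    // Stand-in for the fallback of "io.file.buffer.size" seen in the listing.
    static final int DEFAULT_BUFFER_SIZE = 4096;

    // Shortest form: overwrite defaults to true.
    static String create(String path) {
        return create(path, true);
    }

    // Adds the overwrite flag; the buffer size falls back to the default.
    static String create(String path, boolean overwrite) {
        return create(path, overwrite, DEFAULT_BUFFER_SIZE);
    }

    // Fullest form in this toy: reports the arguments it finally received.
    static String create(String path, boolean overwrite, int bufferSize) {
        return path + " overwrite=" + overwrite + " bufferSize=" + bufferSize;
    }

    public static void main(String[] args) {
        System.out.println(create("/test/a.txt"));
        System.out.println(create("/test/a.txt", false));
    }
}
```

Reading the real overloads the same way shows that calling create(path) with no other arguments silently overwrites an existing file, which is worth knowing before using the short form.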

  • 7.3 copyFromLocalFile
    public void copyFromLocalFile(Path src, Path dst) throws IOException {
        this.copyFromLocalFile(false, src, dst);
    }

    public void copyFromLocalFile(boolean delSrc, Path src, Path dst) throws IOException {
        this.copyFromLocalFile(delSrc, true, src, dst);
    }

    public void copyFromLocalFile(boolean delSrc, boolean overwrite, Path[] srcs, Path dst) throws IOException {
        Configuration conf = this.getConf();
        FileUtil.copy(getLocal(conf), srcs, this, dst, delSrc, overwrite, conf);
    }

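Copies local files into the file system. The arguments give the local source path (or a Path array holding several source paths) and the destination path, and can specify whether to delete the local source files and whether to overwrite files that already exist on HDFS.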

  • 7.4 copyToLocalFile
    public void copyToLocalFile(Path src, Path dst) throws IOException {
        this.copyToLocalFile(false, src, dst);
    }

    public void copyToLocalFile(boolean delSrc, Path src, Path dst) throws IOException {
        this.copyToLocalFile(delSrc, src, dst, false);
    }

Copies the source file to the given local path. The delSrc parameter specifies whether the source file is deleted after the copy.

  • 7.5 moveToLocalFile
    public void moveToLocalFile(Path src, Path dst) throws IOException {
        this.copyToLocalFile(true, src, dst);
    }

Moves the file to the given local path. Internally it calls copyToLocalFile with delSrc set to true.

  • 7.6 exists
    public boolean exists(Path f) throws IOException {
        try {
            return this.getFileStatus(f) != null;
        } catch (FileNotFoundException var3) {
            return false;
        }
    }

Takes a path and checks whether it exists on HDFS: returns true if it does, false otherwise.

  • 7.7 delete
    public abstract boolean delete(Path var1, boolean var2) throws IOException;

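The first argument is the path to delete. The second is the recursive flag: when it is true, a directory is deleted together with everything in it; when it is false, deleting a non-empty directory fails with an IOException.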
