Hadoop

Execute file system commands via HTTPFS

HttpFS is a service that provides HTTP access to HDFS; that is, we can access HDFS from other systems, from browsers, and from programming languages.

HttpFS has a REST HTTP API supporting all HDFS filesystem operations (both read and write).

Using HttpFS, we can read and write data in HDFS using HTTP utilities (such as curl or wget) and HTTP libraries from languages other than Java.

Cloudera offers the HttpFS role as part of the HDFS service, and you can assign the HttpFS role to hosts during initial setup or at any time.

If it is not yet assigned, go to HDFS – Instances – Add Role Instances.

Select the host to which you want to assign HttpFS and finish (Client Deploy).

Once added, you can see it on the HDFS – Instances page. (Here, the role is assigned to the standby host.)

Now, to execute filesystem commands, log in to any node in the cluster.

I logged in to the master host and listed a file in the HDFS /hadoop/conf directory.
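For reference, the native command for this step looks something like the following (a sketch; the /hadoop/conf path is just the directory used in this example):

# hdfs dfs -ls /hadoop/conf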

To execute the same command via HttpFS:

# curl 'http://httpfshost:port/webhdfs/v1/hdfspath?op=OPERATION&user.name=user'

Here our HttpFS host is the standby host, the default port is 14000, the operation for ls is LISTSTATUS, and the username is root.
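Putting those values into the template, the full command looks something like this (here "standby" is a placeholder for the actual hostname of the standby host in your cluster):

# curl 'http://standby:14000/webhdfs/v1/hadoop/conf?op=LISTSTATUS&user.name=root'

The response is a JSON FileStatuses object describing the contents of /hadoop/conf, matching what hdfs dfs -ls shows.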

For more info on the HttpFS and WebHDFS API commands, check:

https://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-hdfs-httpfs/index.html

https://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/WebHDFS.html

 

Problem Scenarios:

· Assign an HttpFS role to the given host and create a file in HDFS.

· Using HttpFS, create a directory in HDFS, etc. (rough command sketches follow below).
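As a rough sketch of how those scenarios can be handled with curl against HttpFS (the standby hostname, port 14000, the /user/root paths and user.name=root are placeholders from this example, not requirements):

To create a directory in HDFS (MKDIRS operation):

# curl -X PUT 'http://standby:14000/webhdfs/v1/user/root/testdir?op=MKDIRS&user.name=root'

To create a file in HDFS (CREATE operation), upload its contents with data=true and a Content-Type of application/octet-stream; without these, HttpFS replies with a 307 Temporary Redirect instead of writing the data:

# curl -X PUT -T test.txt -H 'Content-Type: application/octet-stream' 'http://standby:14000/webhdfs/v1/user/root/test.txt?op=CREATE&user.name=root&data=true'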

Thus we covered how to execute file system commands via HTTPFS.

Use the comments section below to post your doubts, questions and feedback.

Please follow my blog to get notified of more certification-related posts, exam tips, etc.

6 thoughts on “Execute file system commands via HTTPFS”

  1. How do I create a file in HDFS using HttpFS? What is the command?
    Also, how much internet speed is required during the exam? Is it ok if I have around 1 Mbps speed?
    And what is the process for documents? Do I have to upload scanned proof, and will they take my photo?

  2. Hi, I am also trying to put a file from the local file system to the Hadoop file system in 2 steps. But in both steps I am getting a 307 Temporary Redirect only…
    [hduser@ip-10-0-0-8 ~]$ curl -i -X PUT "http://ip-10-0-0-17.ec2.internal:14000/webhdfs/v1/tmp/abcSnapshot/WEB/upload.csv?user.name=hduser&op=CREATE"
    HTTP/1.1 307 Temporary Redirect
    Server: Apache-Coyote/1.1
    Set-Cookie: hadoop.auth="u=hduser&p=hduser&t=simple-dt&e=1511657216609&s=KWkeaYoEdj2pfJKLszrAQ6n1jb8="; Path=/; HttpOnly
    Location: http://ip-10-0-0-17.ec2.internal:14000/webhdfs/v1/tmp/abcSnapshot/WEB/upload.csv?op=CREATE&user.name=hduser&data=true
    Content-Type: application/json
    Content-Length: 0
    Date: Sat, 25 Nov 2017 14:46:56 GMT

    [hduser@ip-10-0-0-8 ~]$ curl -i -X PUT -T upload.csv "http://ip-10-0-0-17.ec2.internal:14000/webhdfs/v1/tmp/abcSnapshot/WEB/upload.csv?user.name=hduser&op=CREATE"
    HTTP/1.1 100 Continue

    HTTP/1.1 307 Temporary Redirect
    Server: Apache-Coyote/1.1
    Set-Cookie: hadoop.auth="u=hduser&p=hduser&t=simple-dt&e=1511657235409&s=I/QOQz7a4ykoVh8ah6tFH56jmpg="; Path=/; HttpOnly
    Location: http://ip-10-0-0-17.ec2.internal:14000/webhdfs/v1/tmp/abcSnapshot/WEB/upload.csv?op=CREATE&user.name=hduser&data=true
    Content-Type: application/json
    Content-Length: 0
    Date: Sat, 25 Nov 2017 14:47:15 GMT

    Any help on why this behaviour occurs? Is anything wrong in the command or configuration?
