Friday, 15 March 2013

hadoop - Flume folder routing based on HTTP header


I want to use curl and Flume to post CSV files to different locations on my local machine / HDFS, based on the value of an HTTP header. For example, for the HTTP header (Network-Element: GGSN) I would like to store my files on my local machine in a folder called GGSN.

I have the following Flume setup:

  • An HDFS sink that routes event files to different locations based on the HTTP header

Then I post the CSV files using curl:

      find /path/files -type f -exec curl -X POST http://localhost:9043 -H "Content-Type: text/xml" -H "Network-Element: GGSN" --data-binary "@{}" -v \;

    curl produces this output:

      * About to connect() to localhost port 9043 (#0)
      *   Trying ::1... Connection refused
      *   Trying 127.0.0.1... connected
      * Connected to localhost (127.0.0.1) port 9043 (#0)
      > POST / HTTP/1.1
      > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
      > Host: localhost:9043
      > Accept: */*
      > Content-Type: text/xml
      > Network-Element: GGSN
      > Content-Length: 972660
      > Expect: 100-continue
      >
      < HTTP/1.1 100 Continue
      < HTTP/1.1 200 OK
      < Transfer-Encoding: chunked
      < Server: Jetty(6.1.26)
      <
      * Connection #0 to host localhost left intact
      * Closing connection #0

    The Flume logs show the following:

      2015-03-16 19:41:14,887 DEBUG org.apache.flume.sink.solr.morphline.BlobHandler: requestHeaders: {expect=100-continue, host=localhost:9043, content-length=972660, network-element=GGSN, user-agent=curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2, content-type=text/xml, accept=*/*}
      2015-03-16 19:41:14,891 DEBUG org.apache.flume.sink.solr.morphline.BlobHandler: blobEvent: [Event headers = {content-type=text/xml}, body.length = 972660]

    I use this Flume configuration:

      sa.sources = httpsource1
      sa.channels = memorychannel1
      sa.sinks = localsink1
      sa.sources.httpsource1.type = http
      sa.sources.httpsource1.handler = org.apache.flume.sink.solr.morphline.BlobHandler
      sa.sources.httpsource1.port = 9043
      sa.sources.httpsource1.channels = memorychannel1
      sa.channels.memorychannel1.type = memory
      sa.channels.memorychannel1.capacity = 10000
      sa.channels.memorychannel1.transactionCapacity = 1000
      sa.sinks.localsink1.type = file_roll
      sa.sinks.localsink1.channel = memorychannel1
      sa.sinks.localsink1.sink.directory = /path/%{network-element}
      sa.sinks.localsink1.sink.rollInterval = 36000

    The problem is that, for some reason, the files are not stored under the path /path/%{network-element}. It seems that this path does not exist, even though I manually created the GGSN folder and set all permissions on it.
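
    A possible lead, offered as a sketch rather than a confirmed fix: in the Flume debug output, the requestHeaders line contains network-element=GGSN, but the blobEvent line lists only content-type among the event headers, so the value never seems to reach the sink. As far as I can tell, BlobHandler copies URL request parameters (not arbitrary HTTP headers) into the event headers, so the value could be sent as a query parameter instead. The helper names build_url and post_files below are hypothetical, and the assumption about BlobHandler is exactly that, an assumption to verify.

```shell
#!/bin/sh
# Sketch only: assumes BlobHandler copies URL query parameters into event
# headers, so the routing value travels as ?network-element=... instead of
# as an HTTP header. build_url and post_files are hypothetical helper names.

# Build the POST URL for a given network element (e.g. GGSN).
build_url() {
  printf 'http://localhost:9043/?network-element=%s\n' "$1"
}

# POST every regular file under a directory to the Flume HTTP source.
# $1 = directory to scan, $2 = network element name
post_files() {
  find "$1" -type f -exec curl -X POST "$(build_url "$2")" \
    -H "Content-Type: text/xml" --data-binary "@{}" -v \;
}

# Example call (needs a running Flume agent listening on port 9043):
# post_files /path/files GGSN
```

    Even if the header then arrives, the sink side may still need attention: to my knowledge file_roll treats sink.directory as one fixed, literal directory, whereas header escapes such as %{network-element} are expanded by the HDFS sink in hdfs.path.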

