Ingesting All The Weather Data With Apache NiFi

Timothy Spann 🇺🇦 - Jul 11 '20 - Dev Community




Step By Step NiFi Flow

  1. GenerateFlowFile - trigger the flow on a schedule that matches when NOAA updates the weather data.
  2. InvokeHTTP - download the ZIP of all weather observations.
  3. CompressContent - decompress the ZIP.
  4. UnpackContent - extract the individual files from the ZIP.
  5. *RouteOnAttribute - keep only the files that are for airports (${filename:startsWith('K')}). Optional.
  6. *QueryRecord - XMLReader to JsonRecordSetWriter. Query: SELECT * FROM FLOWFILE WHERE NOT location LIKE '%Unknown%'. This removes locations that are not identified. Optional.
  7. Send it somewhere for storage. This could be PutKudu, PutORC, PutHDFS, PutHiveStreaming, PutHBaseRecord, PutDatabaseRecord, PublishKafkaRecord2*, or others.
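
For readers who want to check the logic outside NiFi, here is a rough Python sketch of steps 2 through 6. It is not part of the flow and is not how NiFi implements these processors. The function names (download_zip, xml_to_record, airport_records) are made up for illustration, and it assumes each file in the ZIP is one small XML document whose leaf elements become the record fields. Unlike the NiFi record writer, it leaves every value as a string and skips nested elements such as image.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# URL taken from the article; the helper names below are hypothetical.
ALL_OBS_URL = "https://w1.weather.gov/xml/current_obs/all_xml.zip"


def download_zip(url=ALL_OBS_URL):
    """Fetch the ZIP archive as bytes (roughly the InvokeHTTP step)."""
    with urlopen(url, timeout=60) as resp:
        return resp.read()


def xml_to_record(xml_bytes):
    """Flatten the leaf elements under the root into a dict of strings."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: (child.text or "").strip()
            for child in root if len(child) == 0}


def airport_records(zip_bytes):
    """Yield one record per 'K' station file whose location is identified."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Roughly mirrors ${filename:startsWith('K')}
            if not name.rsplit("/", 1)[-1].startswith("K"):
                continue
            record = xml_to_record(archive.read(name))
            # Roughly mirrors WHERE NOT location LIKE '%Unknown%'
            if "Unknown" in record.get("location", ""):
                continue
            yield record
```

Calling list(airport_records(download_zip())) would pull the live archive; passing any in-memory ZIP bytes to airport_records works the same way for local testing.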

URL For All US Data

invokehttp.request.url

https://w1.weather.gov/xml/current_obs/all_xml.zip

Example Record As Converted JSON

[ {
  "credit" : "NOAA's National Weather Service",
  "credit_URL" : "http://weather.gov/",
  "image" : {
    "url" : "http://weather.gov/images/xml_logo.gif",
    "title" : "NOAA's National Weather Service",
    "link" : "http://weather.gov"
  },
  "suggested_pickup" : "15 minutes after the hour",
  "suggested_pickup_period" : 60,
  "location" : "Stanley Municipal Airport, ND",
  "station_id" : "K08D",
  "latitude" : 48.3008,
  "longitude" : -102.4064,
  "observation_time" : "Last Updated on Jul 10 2020, 9:55 am CDT",
  "observation_time_rfc822" : "Fri, 10 Jul 2020 09:55:00 -0500",
  "weather" : "Fair",
  "temperature_string" : "66.0 F (19.0 C)",
  "temp_f" : 66.0,
  "temp_c" : 19.0,
  "relative_humidity" : 83,
  "wind_string" : "South at 6.9 MPH (6 KT)",
  "wind_dir" : "South",
  "wind_degrees" : 180,
  "wind_mph" : 6.9,
  "wind_kt" : 6,
  "pressure_in" : 30.03,
  "dewpoint_string" : "60.8 F (16.0 C)",
  "dewpoint_f" : 60.8,
  "dewpoint_c" : 16.0,
  "visibility_mi" : 10.0,
  "icon_url_base" : "http://forecast.weather.gov/images/wtf/small/",
  "two_day_history_url" : "http://www.weather.gov/data/obhistory/K08D.html",
  "icon_url_name" : "skc.png",
  "ob_url" : "http://www.weather.gov/data/METAR/K08D.1.txt",
  "disclaimer_url" : "http://weather.gov/disclaimer.html",
  "copyright_url" : "http://weather.gov/disclaimer.html",
  "privacy_policy_url" : "http://weather.gov/notice.html"
} ]
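
Once a record like this lands downstream, a few fields are worth normalizing before storage. The sketch below is only an illustration: it copies a handful of values from the example record, parses observation_time_rfc822 into a timezone-aware datetime, and cross-checks temp_c against temp_f. It assumes suggested_pickup_period is expressed in minutes, which the sample does not state.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# A few fields copied from the example record above.
record = {
    "station_id": "K08D",
    "observation_time_rfc822": "Fri, 10 Jul 2020 09:55:00 -0500",
    "suggested_pickup_period": 60,
    "temp_f": 66.0,
    "temp_c": 19.0,
}

# RFC 822 timestamps parse into timezone-aware datetimes.
observed_at = parsedate_to_datetime(record["observation_time_rfc822"])

# Assumption: the pickup period is in minutes.
next_expected = observed_at + timedelta(minutes=record["suggested_pickup_period"])

# Sanity check: Fahrenheit and Celsius should agree within rounding.
temp_c_from_f = round((record["temp_f"] - 32) * 5 / 9, 1)
temps_agree = abs(temp_c_from_f - record["temp_c"]) < 0.5

print(record["station_id"], observed_at.isoformat(), next_expected.isoformat(), temps_agree)
```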

Source Code

https://github.com/tspannhw/ClouderaFlowManagementWorkshop/tree/main/flows
