{"id":21603,"date":"2020-09-23T15:57:14","date_gmt":"2020-09-23T15:57:14","guid":{"rendered":"https:\/\/merikebi.warrenmyers.com\/?p=21603"},"modified":"2020-09-23T15:57:14","modified_gmt":"2020-09-23T15:57:14","slug":"answer-by-warren-for-extract-filename-out-of-raw-data-using-regex","status":"publish","type":"post","link":"https:\/\/merikebi.warrenmyers.com\/?p=21603","title":{"rendered":"Answer by warren for extract filename out of raw data using regex"},"content":{"rendered":"<h2>Here are two options:<\/h2>\n<h3>First<\/h3>\n<p>If you want what&#8217;s between the <code>GET <\/code> and <code> HTTP<\/code>, this will do it:<\/p>\n<pre><code>| rex field=_raw &quot;GET\\s+(?&lt;fname&gt;\\S+)\\s+HTTP&quot;\n<\/code><\/pre>\n<p>Start at the string literal <code>GET<\/code>, skip one or more whitespace characters, then capture every character that&#8217;s <em>not<\/em> whitespace (up to a whitespace sequence followed by the string literal <code>HTTP<\/code>) into the new field <code>fname<\/code>.<\/p>\n<p>Functionally, you can leave off the <code>\\s+HTTP<\/code> from the regex, because <code>\\S+<\/code> already stops at the first whitespace character, but for <em>completeness&#8217; sake<\/em> you may want to leave it in.<\/p>\n<h3>Second<\/h3>\n<p>If you only want the trailing filename, use this:<\/p>\n<pre><code>| rex field=_raw &quot;(?&lt;fname&gt;[\\.\\-\\w]+)\\s+HTTP&quot;\n<\/code><\/pre>\n<p>This captures the run of <code>.<\/code>, <code>-<\/code>, and word characters (<code>\\w<\/code>) that sits immediately before a sequence of whitespace characters (<code>\\s+<\/code>) followed by the string literal <code>HTTP<\/code>, and puts it into the new field <code>fname<\/code>.<\/p>\n<p>Or, alternatively (it takes more steps to find the match, but it <em>might<\/em> be better in your case):<\/p>\n<pre><code>| rex field=_raw &quot;(?&lt;fname&gt;[^\\\/]+)\\s+HTTP&quot;\n<\/code><\/pre>\n<p>This one will capture anything that is not a forward slash (<code>\/<\/code>), up to the run of whitespace followed by <code>HTTP<\/code>, 
all into the new field <code>fname<\/code>.<\/p>\n<p>from User warren &#8211; Stack Overflow https:\/\/stackoverflow.com\/questions\/64017500\/extract-filename-out-of-raw-data-using-regex\/64031109#64031109<br \/>\nvia <a href=\"https:\/\/ifttt.com\/?ref=da&#038;site=wordpress\">IFTTT<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here are two options: First If you want what&#8217;s between the GET and HTTP, this will do it: | rex field=_raw &quot;GET\\s+(?&lt;fname&gt;\\S+)\\s+HTTP&quot; Start at the string literal GET, go one (or more) whitespaces, then put everything that&#8217;s not a whitespace character (up until a whitespace sequence that ends in the string literal HTTP) into the &hellip;<br \/><a href=\"https:\/\/merikebi.warrenmyers.com\/?p=21603\" class=\"more-link pen_button pen_element_default pen_icon_arrow_double\">Continue reading <span class=\"screen-reader-text\">Answer by warren for extract filename out of raw data using regex<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[4],"tags":[991],"keyring_services":[],"class_list":["post-21603","post","type-post","status-publish","format-standard","hentry","category-blih","tag-stackexchange"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=\/wp\/v2\/posts\/21603","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest
_route=%2Fwp%2Fv2%2Fcomments&post=21603"}],"version-history":[{"count":1,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=\/wp\/v2\/posts\/21603\/revisions"}],"predecessor-version":[{"id":21604,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=\/wp\/v2\/posts\/21603\/revisions\/21604"}],"wp:attachment":[{"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=21603"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=21603"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=21603"},{"taxonomy":"keyring_services","embeddable":true,"href":"https:\/\/merikebi.warrenmyers.com\/index.php?rest_route=%2Fwp%2Fv2%2Fkeyring_services&post=21603"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}