Technitribe

interesting problems (and a few solutions, too)

    • 9 Sep 2019

      Using jq to filter an array of objects from JSON

      Written by Tim Bielawa

      For some reason it took me an unreasonable amount of time to figure out how to filter an array (or list) of objects from a JSON stream. Every single example I found was a little too weird for me, or resulted in printing each object, but not in a final array format. Here’s what I came up with:

      Say, for example, you are parsing the AWS IP ranges JSON stream. You will receive an object like this:

      {
        "syncToken": "1567728788",
        "createDate": "2019-09-06-00-13-08",
        "prefixes": [
          {
            "ip_prefix": "18.208.0.0/13",
            "region": "us-east-1",
            "service": "AMAZON"
          },
          ... more objects here ...

      I was attempting to filter this down to ONLY objects where the service attribute was AMAZON. Using this jq filter I would get the objects printed one after the other, which is not what I wanted:

      $ jq -c '.prefixes[] | select(.service=="AMAZON")' < ip-ranges.json | head
      {"ip_prefix":"18.208.0.0/13","region":"us-east-1","service":"AMAZON"}
      {"ip_prefix":"52.95.245.0/24","region":"us-east-1","service":"AMAZON"}
      {"ip_prefix":"99.77.142.0/24","region":"ap-east-1","service":"AMAZON"}

      The correct syntax was ultimately very similar. 

      $ jq '.prefixes | map(. | select(.service=="AMAZON"))' < ip-ranges.json | head
      [
        {
          "ip_prefix": "18.208.0.0/13",
          "region": "us-east-1",
          "service": "AMAZON"
        },

      Now each matching object is returned as a member of an array. The difference is that .prefixes[] splits the array into a stream of separate objects, so select emits each match as its own output. Passing the .prefixes array to map instead runs select on every element and collects whatever comes out into a new array; map(f) is shorthand for [.[] | f]. The leading ". |" inside map is a no-op, so map(select(.service=="AMAZON")) gives the same result.
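
      For readers more at home in Python than in jq, the same distinction can be sketched as a generator versus a list. This is only an analogy, not how jq works internally: select_stream and select_array are hypothetical names, and the sample data is an abbreviated stand-in for ip-ranges.json in which the EC2 entry is invented for illustration.

```python
import json

# Abbreviated stand-in for ip-ranges.json; the EC2 entry is invented for this example.
document = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "18.208.0.0/13", "region": "us-east-1", "service": "AMAZON"},
    {"ip_prefix": "192.0.2.0/24", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "52.95.245.0/24", "region": "us-east-1", "service": "AMAZON"}
  ]
}
""")


def select_stream(prefixes):
    """Rough analogue of `.prefixes[] | select(...)`: yields matches one at a time."""
    for entry in prefixes:
        if entry["service"] == "AMAZON":
            yield entry


def select_array(prefixes):
    """Rough analogue of `.prefixes | map(select(...))`: returns matches as one list."""
    return [entry for entry in prefixes if entry["service"] == "AMAZON"]


# One compact object per line, similar to `jq -c` on the streaming filter.
for entry in select_stream(document["prefixes"]):
    print(json.dumps(entry, separators=(",", ":")))

# A single JSON array, similar to the map(...) version.
print(json.dumps(select_array(document["prefixes"]), indent=2))
```

      Both functions pick the same objects; only the shape of the result differs, which is the whole point of switching to map.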

    • 20 Jan 2017

      [Updated] GitHub + Gmail — Filtering for Review Requests and Mentions

      Written by Tim Bielawa

      Update – 2017-01-27

      Just 3 days after I published this post, GitHub published a blog post of its own:

      Pull request reviews are a great way to share the weight of building software, and with review requests you can get the exact feedback you need.

      To make it easier to find the pull requests that need your attention, you can now filter by review status from your repository pull request index.

      Source: Filter pull request reviews and review requests

      I have tried this out and it’s great! Like most everything else on GitHub it’s very intuitive and simple to use. I won’t steal their thunder and describe it all here. So go check out the blog post for yourself and read up on the details (screenshots included!).

      Continue reading if you’re still interested in incorporating this kind of filtering and labeling into your Gmail account.

      The Problem

      I’ve been looking for a way to filter my GitHub Pull Request lists under the condition that a review is requested of me. The online docs didn’t show any filter options for this, so I checked out the @GitHubHelp twitter account. The answer was there on the front page — they don’t support filtering PRs by review-requested-by:me yet:

      @zaghnaboot Adding a filter for reviewers is definitely on our radar, though I don’t have a specific timeline to share. –SJ

      — GitHub Support (@GitHubHelp) January 19, 2017

      So what is one to do? I’m using Gmail so I began considering what filter options were available to me there. My objectives were to clearly label and highlight:

      • PRs where review has been requested
      • Comments where I am @mention'd

      Interested in knowing more? Read on after the break for all the setup details.


    • 24 Aug 2016

      bitmath-1.3.1 released

      Written by Tim Bielawa

      bitmath is a Python module I wrote which simplifies many facets of interacting with file sizes in various units as Python objects. A few weeks ago version 1.3.1 was released with a few small updates.

      Updates

      • New function: bitmath.parse_string_unsafe(), a less strict version of bitmath.parse_string()

      This new function accepts inputs using non-standard prefix units, such as single-letter or mis-capitalized units. For example, parse_string will not accept a short unit like '100k', whereas parse_string_unsafe will gladly accept it.

      • Documentation Refresh: The project documentation has been thoroughly reviewed and refreshed.

      Several broken, moved, or redirecting links have been fixed. Wording and examples are more consistent. The documentation also lands correctly when installed via package.
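
      The difference between strict and lenient unit parsing described for parse_string_unsafe() can be illustrated without bitmath at all. The sketch below is plain Python and is not bitmath's implementation: parse_strict and parse_lenient are hypothetical names, and the rules used here (a bare single letter means powers of 1000, a trailing "i" means powers of 1024, capitalization is ignored, results are a number of bytes) are assumptions chosen for the illustration.

```python
import re

# Hypothetical helpers for illustration only; these are not bitmath functions.
_POWERS = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5, "e": 6}
_STRICT_UNITS = {
    "B": 1,
    "kB": 1000, "MB": 1000**2, "GB": 1000**3,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3,
}


def parse_strict(text):
    """Accept only a number followed by an exactly spelled, exactly capitalized unit."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2) not in _STRICT_UNITS:
        raise ValueError(f"cannot parse {text!r}")
    return float(match.group(1)) * _STRICT_UNITS[match.group(2)]


def parse_lenient(text):
    """Accept short or mis-capitalized units and return a number of bytes."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([A-Za-z]*)\s*", text)
    if not match:
        raise ValueError(f"cannot parse {text!r}")
    number = float(match.group(1))
    unit = match.group(2).lower().rstrip("b")   # "KB", "kb" and "k" all become "k"
    if unit == "":
        return number                           # no prefix: assume plain bytes
    base = 1000
    if unit.endswith("i"):                      # "ki", "mi", ...: assume binary prefixes
        base, unit = 1024, unit[:-1]
    if unit not in _POWERS:
        raise ValueError(f"unknown unit in {text!r}")
    return number * base ** _POWERS[unit]


print(parse_lenient("100k"))      # 100000.0
print(parse_lenient("100 KIB"))   # 102400.0
try:
    parse_strict("100k")
except ValueError as error:
    print("strict parser rejected it:", error)
```

      The trade-off is the usual one for lenient parsers: more inputs are accepted, at the cost of guessing what an ambiguous unit was meant to be.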

      Getting bitmath-1.3.1

      bitmath-1.3.1 is available through several installation channels:

      • Fedora 23 and newer repositories
      • EPEL 6 and 7 repositories
      • PyPI

      Ubuntu builds have not been prepared yet due to issues I’ve been having with Launchpad and new package versions.

    • 3 Feb 2016

      bitmath-1.3.0 released

      Written by Tim Bielawa

      It’s been quite a while since I’ve posted any bitmath updates (bitmath is a Python module I wrote which simplifies many facets of interacting with file sizes in various units as Python objects). In fact, it seems that the last time I wrote about bitmath here was back in 2014 when 1.0.8 was released! So here is an update covering everything post 1.0.8 up to 1.3.0.

      New Features

      • A command line tool, bitmath, you can use to do simple conversions right in your shell [docs]!
      • New utility function bitmath.parse_string for parsing a human-readable string into a bitmath object
      • New utility: argparse integration: bitmath.BitmathType. Allows you to specify arguments as bitmath types
      • New utility: progressbar integration: bitmath.integrations.BitmathFileTransferSpeed. A more functional file transfer speed widget
      • New bitmath module function: bitmath.query_device_capacity(). Create bitmath.Byte instances representing the capacity of a block device
        • This is my favorite enhancement
        • In an upcoming blog post I’ll talk about just how cool I thought it was learning how to code this feature
        • Conceptual and practical implementation topics included
      • The bitmath.parse_string() function now can parse ‘octet’ based units
        • Enhancement requested in #53, "parse french unit names", by walidsa3d.
      • New utility function: bitmath.best_prefix()
        • Return an equivalent instance which uses the best human-readable prefix-unit to represent it
        • This is way cooler than it may sound at the surface, I promise you
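
      To give a feel for what "best human-readable prefix-unit" means, here is a minimal plain-Python sketch. It is not bitmath's code: pick_prefix is a hypothetical name, binary (1024-based) prefixes are an assumption, and it returns a (value, unit) pair instead of a bitmath instance. It works on the absolute value so that negative sizes get the same unit as their positive counterparts, which is the behavior the #55 fix is described as restoring.

```python
# Hypothetical helper for illustration only; bitmath's own best_prefix() is not shown here.
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def pick_prefix(num_bytes):
    """Return (value, unit) using the largest unit that keeps abs(value) >= 1."""
    magnitude = abs(num_bytes)
    index = 0
    # Step up one unit at a time while the value still fits the next unit.
    while magnitude >= 1024 and index < len(BINARY_UNITS) - 1:
        magnitude /= 1024
        index += 1
    value = magnitude if num_bytes >= 0 else -magnitude
    return value, BINARY_UNITS[index]


print(pick_prefix(1536))       # (1.5, 'KiB')
print(pick_prefix(-1048576))   # (-1.0, 'MiB')
print(pick_prefix(512))        # (512, 'B')
```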

      Bug Fixes

      • #49 – Fix handling unicode input in the bitmath.parse_string function. Thanks drewbrew!
      • #50 – Update the setup.py script to be Python 3.x compatible. Thanks ssut!
      • #55 “best_prefix for negative values”. Now bitmath.best_prefix() returns correct prefix units for negative values. Thanks mbdm!

      Misc

      To help with the Fedora Python3 Porting project, bitmath now comes in two variants in Fedora/EPEL repositories (BZ1282560). The Fedora and EPEL updates are now in the repos. TIP: python2-bitmath will obsolete the python-bitmath package. Do a dnf/yum 'update' operation just to make sure you catch it.

      The PyPI release has already been pushed to stable.

      Back in bitmath-1.0.8 we had 150 unit tests. The latest release has almost 200! Go testing! :confetti:

    • 30 Oct 2015

      Streaming the serial numbers from an X509 certificate revocation list

      Written by Alex Wood

      The project I work on uses X509 certificates with custom extensions to manage content access on the Red Hat CDN. The basic idea is that Candlepin issues X509 certificates with an extension saying what content the certificate is good for. Client systems then use that certificate for TLS client authentication when connecting to the CDN. If the content they are requesting (deduced from the request URL) matches the content available to them in the certificate, then access is granted.

      This system works well in practice except for one problem: every time content for a particular product changes, the content data in the X509 extension becomes obsolete. We have to revoke the obsolete certificates and issue new ones. The result is an extremely large certificate revocation list (CRL).

      For our cryptography needs, Candlepin uses the venerable Legion of the Bouncy Castle Java library. This library anticipates normal CRL usage, so when building a CRL object from an existing file, the entire structure is read into memory at once. This approach doesn’t scale well with the number of revoked certificates we are dealing with, so we needed to devise a way to stream the CRL. Moreover, the only thing we really care about for our purposes is each revoked certificate’s serial number.

      Streaming the CRL means we need to dissect the ASN.1 that describes the CRL one piece at a time. RFC 5280 to the rescue! Looking at the description of the ASN.1 for a CRL reveals that before the sequence containing the revocation entries, there will be a thisUpdate field and, optionally, a nextUpdate field, each of type UTCTime or GeneralizedTime. We need to descend in the ASN.1 until we get to the thisUpdate field, look for and discard the optional nextUpdate field, and then walk through the revokedCertificates sequence reading the serial numbers.

      That procedure is not exactly a walk in the park, so in the hope that someone else may find it useful, here is the solution I came up with. Keep in mind that the code does not check the signature on the CRL so this code should not be used for any CRL that you do not trust implicitly.
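
      For readers who want to see the shape of that descent without Bouncy Castle, the following is a minimal Python sketch. It is not the Java solution referred to above. It assumes DER input with single-byte tags, follows the TBSCertList layout from RFC 5280, and, like the original, checks no signature. The names read_header, revoked_serials and build_demo_crl are hypothetical, and build_demo_crl only produces an unsigned, CRL-shaped byte string for trying the parser out.

```python
import io

INTEGER, BIT_STRING, SEQUENCE = 0x02, 0x03, 0x30
TIME_TAGS = (0x17, 0x18)          # UTCTime, GeneralizedTime


def read_header(stream):
    """Read one DER tag and length; return (tag, length, header_size)."""
    head = stream.read(2)
    if len(head) < 2:
        raise ValueError("truncated DER input")
    tag, length, size = head[0], head[1], 2
    if length & 0x80:             # long form: low 7 bits count the length bytes
        count = length & 0x7F
        length = int.from_bytes(stream.read(count), "big")
        size += count
    return tag, length, size


def _entry_serials(stream, remaining):
    """Walk revokedCertificates, holding only one small entry in memory at a time."""
    while remaining > 0:
        _, entry_len, header_size = read_header(stream)
        entry = io.BytesIO(stream.read(entry_len))
        remaining -= header_size + entry_len
        _, serial_len, _ = read_header(entry)          # userCertificate INTEGER
        yield int.from_bytes(entry.read(serial_len), "big", signed=True)


def revoked_serials(stream):
    """Yield revoked serial numbers from a DER CRL. The signature is NOT verified."""
    read_header(stream)                                # CertificateList SEQUENCE
    _, tbs_left, _ = read_header(stream)               # TBSCertList SEQUENCE
    seen_this_update = False
    while tbs_left > 0:
        tag, length, header_size = read_header(stream)
        tbs_left -= header_size + length
        if tag == SEQUENCE and seen_this_update:       # revokedCertificates
            yield from _entry_serials(stream, length)
            return
        if tag in TIME_TAGS:                           # thisUpdate, then optional nextUpdate
            seen_this_update = True
        stream.read(length)                            # skip version, signature, issuer, times


def _tlv(tag, value):
    """Encode one DER tag-length-value (demo scaffolding only)."""
    n = len(value)
    if n < 0x80:
        return bytes([tag, n]) + value
    size = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([tag, 0x80 | len(size)]) + size + value


def build_demo_crl(serials):
    """Build an unsigned, synthetic CRL-shaped structure; not a valid CRL."""
    when = _tlv(0x17, b"190901000000Z")
    entries = b"".join(
        _tlv(SEQUENCE, _tlv(INTEGER, s.to_bytes(s.bit_length() // 8 + 1, "big")) + when)
        for s in serials
    )
    tbs = _tlv(SEQUENCE, _tlv(INTEGER, b"\x01") + _tlv(SEQUENCE, b"") + _tlv(SEQUENCE, b"")
               + when + when + _tlv(SEQUENCE, entries))
    return _tlv(SEQUENCE, tbs + _tlv(SEQUENCE, b"") + _tlv(BIT_STRING, b"\x00"))


print(list(revoked_serials(io.BytesIO(build_demo_crl([1, 255, 70000])))))   # [1, 255, 70000]
```

      Because revoked_serials is a generator, memory use stays roughly constant however long the revokedCertificates sequence is, which is the property the streaming approach is after.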

      The end results are pretty dramatic. The benchmarking toolkit I’m using shows execution time improving by an order of magnitude (from around 7 seconds to 0.7 seconds), and memory usage drops by about 30%. You can see the GC statistics in the graph below.

      [Figure: Visualization of X509CRLStream's benchmarks]

      The benchmarking results are:

      Benchmark             Mode Cnt    Score     Error  Units
      CRLBenchmark.inMemory avgt  20  7493.602 ± 941.592  ms/op
      CRLBenchmark.stream   avgt  20   669.084 ±  91.382  ms/op
      

      In writing this, A Layman’s Guide to a Subset of ASN.1, BER, and DER was of invaluable assistance to me as was the Wikipedia page on X.690. I recommend reading them both.

    • Virtual Disk Guide

      Interested in virtualization? Do QCOWs rule your filesystem? Are you a libvirt or KVM+QEMU wizard? I wrote a book about virtual disk management. Check out The Linux Sysadmin's Guide to Virtual Disks online for free at ScribesGuides.com.


      Consider supporting the author by purchasing a hard copy of the first edition for just $10.00 on Lulu.com.

    • bitmath

      bitmath is a Python library for dealing with file size units (GiB's, kB's, etc) in a sane way. bitmath supports arithmetic, rich comparison, conversion, automatic best human-readable representation, and many other utility functions. Read some examples on the docs site or check out the source on GitHub.

Creative Commons License
Technitribe by Tim Bielawa is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.