Every now and then, I sit down and try out some new features of Rails, or gems I’ve read about but never really had a use case for in my daily work. This weekend I played around with ActiveStorage and Hotwire, making a simple clone of an ETL project I like to rewrite.
The basic use case for the app is to fetch EDI files from different locations, organise them, and parse them into a warehousing solution. The production version of this app uses S3 quite heavily, and has lots of custom code for managing the documents stored in S3.
ActiveStorage is quite easy to set up, and has built-in support for S3 storage (and many other cloud solutions). After a file has been attached, it’s queued for what is called analyzing. Basically, ActiveStorage iterates over an array of registered analyzers and picks the first one whose accept? method returns true for the file. There are a couple of built-in analyzers: ActiveStorage::Analyzer::ImageAnalyzer and ActiveStorage::Analyzer::VideoAnalyzer.
They are registered in that order, hook on to the MIME type of the uploaded file, and give back some metadata when run.
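The selection step can be sketched in plain Ruby. This is only an illustration of "first analyzer that accepts wins": the Blob struct, the three classes and analyzer_class_for are simplified stand-ins written for this example, not the real ActiveStorage classes.

```ruby
# Simplified stand-in for a blob; only the content type matters here.
Blob = Struct.new(:content_type)

# Hypothetical stand-ins for registered analyzers; each only needs accept?.
class ImageAnalyzer
  def self.accept?(blob)
    blob.content_type.start_with?("image/")
  end
end

class VideoAnalyzer
  def self.accept?(blob)
    blob.content_type.start_with?("video/")
  end
end

# Fallback used when no registered analyzer accepts the blob.
class NullAnalyzer
  def self.accept?(_blob)
    true
  end
end

ANALYZERS = [ImageAnalyzer, VideoAnalyzer].freeze

# Walk the list in registration order and take the first analyzer
# whose accept? returns true for the blob.
def analyzer_class_for(blob, analyzers = ANALYZERS)
  analyzers.detect { |klass| klass.accept?(blob) } || NullAnalyzer
end
```

With this ordering, a text/plain blob falls through both analyzers and ends up with the fallback, which is why a custom analyzer is needed for such files.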
For my particular use case, I want to analyze mostly text/plain files, at least for this basic testing. I’ll start by making a file called lib/analyzer/edi_analyzer.rb with the very basic requirements of an ActiveStorage analyzer:
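A minimal sketch of such a skeleton, assuming that inheriting from ActiveStorage::Analyzer is all that is needed at this point. The guarded stand-in at the top is not part of the analyzer; it is only there so the snippet also runs outside a Rails app.

```ruby
# lib/analyzer/edi_analyzer.rb

# Stand-in so the sketch also runs outside a Rails app (an assumption for
# illustration); inside the app, ActiveStorage::Analyzer is already defined.
unless defined?(ActiveStorage::Analyzer)
  module ActiveStorage
    class Analyzer
      attr_reader :blob

      def initialize(blob)
        @blob = blob
      end

      # The base class rejects every blob until a subclass says otherwise.
      def self.accept?(_blob)
        false
      end
    end
  end
end

module Analyzer
  # Empty for now: it inherits everything from the base class.
  class EdiAnalyzer < ActiveStorage::Analyzer
  end
end
```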
This should be enough to register my analyzer, which should be done in an initializer. In config/initializers/active_storage_analyzers.rb I add the following code:
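A sketch of what that initializer could contain. It assumes lib/ is not autoloaded in this app (hence the explicit require) and that appending to the configured analyzer list is the desired registration order.

```ruby
# config/initializers/active_storage_analyzers.rb

# Assumption: lib/ is not autoloaded in this app, so the file is required explicitly.
require_relative "../../lib/analyzer/edi_analyzer"

# Add the custom analyzer after the built-in ones.
Rails.application.config.active_storage.analyzers.append Analyzer::EdiAnalyzer
```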
If I open a rails console, I can call ActiveStorage.analyzers to see that my analyzer is correctly registered:
irb(main):026:0> ActiveStorage.analyzers
=> [ActiveStorage::Analyzer::ImageAnalyzer, ActiveStorage::Analyzer::VideoAnalyzer, Analyzer::EdiAnalyzer]
As you can see, the analyzer is registered as the last one to be called. If you’d like it to be the first, register it with .prepend in your initializer instead.
While the analyzer is now registered, it will never be called, as it does not yet implement the class method accept?. Let’s do that.
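A minimal sketch of that method, assuming a plain comparison against the blob's content type is enough for now. As before, the guarded stand-in only exists so the snippet runs outside a Rails app.

```ruby
# Stand-in so the sketch also runs outside a Rails app (an assumption for
# illustration); inside the app, ActiveStorage::Analyzer is already defined.
unless defined?(ActiveStorage::Analyzer)
  module ActiveStorage
    class Analyzer
      attr_reader :blob

      def initialize(blob)
        @blob = blob
      end
    end
  end
end

module Analyzer
  class EdiAnalyzer < ActiveStorage::Analyzer
    # Accept only plain-text blobs for now.
    def self.accept?(blob)
      blob.content_type == "text/plain"
    end
  end
end
```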
This returns true when the content_type of the uploaded file is text/plain. In a later evolution of this analyzer you would probably do a more complex check, but this will do for now.
On to the actual result of the analyzer. When our analyzer is picked as the one to provide metadata, ActiveStorage will call metadata on an instance of our class. I have already implemented a simple EdiFileParser class that can parse the headers of these files, so let’s utilize that in our analyzer:
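A sketch of how the analyzer could look at this stage. Several parts are assumptions made for illustration: the EdiFileParser below is a toy stand-in (the real parser is not shown) that expects an EDIFACT-style UNB header, the sender and recipient keys are hypothetical, and the guarded ActiveStorage::Analyzer stand-in only mimics download_blob_to_tempfile and logger so the snippet runs outside a Rails app.

```ruby
require "tempfile"
require "logger"

# Stand-in so the sketch runs outside a Rails app (assumption for illustration).
# The real ActiveStorage::Analyzer provides download_blob_to_tempfile and logger.
unless defined?(ActiveStorage::Analyzer)
  module ActiveStorage
    class Analyzer
      attr_reader :blob

      def initialize(blob)
        @blob = blob
      end

      private

      # Writes the blob contents to a Tempfile and yields it to the block.
      def download_blob_to_tempfile
        Tempfile.create("blob") do |file|
          file.write(blob.download)
          file.rewind
          yield file
        end
      end

      def logger
        @logger ||= Logger.new($stderr)
      end
    end
  end
end

# Hypothetical toy parser: assumes a header like "UNB+UNOC:3+SENDER+RECEIVER+...".
unless defined?(EdiFileParser)
  class EdiFileParser
    def initialize(file)
      @fields = file.read.to_s.lines.first.to_s.strip.split("+")
    end

    def valid?
      @fields.first == "UNB" && @fields.size >= 4
    end

    def sender
      @fields[2]
    end

    def recipient
      @fields[3]
    end
  end
end

module Analyzer
  class EdiAnalyzer < ActiveStorage::Analyzer
    def self.accept?(blob)
      blob.content_type == "text/plain"
    end

    # Returns a hash of header fields, or an empty hash if parsing failed.
    def metadata
      parse_file do |parser|
        { sender: parser.sender, recipient: parser.recipient }
      end.to_h
    end

    private

    # Yields a parser for valid files; logs and returns nil otherwise.
    def parse_file
      download_blob_to_tempfile do |file|
        parser = EdiFileParser.new(file)
        if parser.valid?
          yield parser
        else
          logger.info "Skipping EDI analysis: the file could not be parsed"
          nil
        end
      end
    end
  end
end
```

Falling back to an empty hash keeps metadata usable as a hash even when the file turns out not to be a valid EDI document.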
Quite a bit of code added there. While it should be pretty straightforward, let’s break it down some. When metadata is called on our instance, we call the private method parse_file with a block. If the call is successful, and we get an object for our block, we use the object to return a hash of useful metadata.
In the parse_file private method, I call download_blob_to_tempfile. This is a private method provided by ActiveStorage::Analyzer that downloads the blob to a Tempfile and yields it to a block.
I then instantiate my EdiFileParser with this Tempfile. If the parser reports that the file is valid?, I yield the parser instance back to my original block; if not, I log the result and fail silently. The logger method is also provided by ActiveStorage::Analyzer, and is an alias for ActiveStorage.logger.
Given a known format, my ActiveStorage attachment will now have a metadata hash like this:
Happy hacking :)